Variation in length of proteins by repeats and disorder regions
2013 (English) Doctoral thesis, comprehensive summary (Other academic)
Protein-coding genes evolve together with their genome and acquire changes, some of which affect the length of their protein products. This explains why equivalent proteins from different species can exhibit length differences. Variation in length of proteins during evolution arguably presents a large number of possibilities for improvement and innovation of protein structure and function. In order to contribute to an increased understanding of this process, we have studied variation caused by tandem domain duplications and insertions or deletions of intrinsically disordered residues.
The study of two proteins, Nebulin and Filamin, together with a broader study of long repeat proteins (>10 domain repeats), began by confirming that tandem domains evolve by internal duplications. Next, we showed that vertebrate Nebulins evolved by duplications of a seven-domain unit, although the most recent duplications used different gene parts as duplication units. Filamin, by contrast, exhibits a checkered duplication pattern, indicating that duplications were followed by similarity erosion that was hindered at particular domains due to the presence of equivalent binding motifs. For long repeat proteins, we found that human segmental duplications are over-represented in long repeat genes. Additionally, domains that have formed long repeats achieved this primarily by duplications of two or more domains at a time.
The study of homologous protein pairs from the well-characterized eukaryotes nematode, fruit fly and several fungi demonstrated a link between variation in length and variation in the number of intrinsically disordered residues. Next, insertions and deletions (indels) estimated from HMM-HMM pairwise alignments showed that disordered residues are clearly more frequent among indel residues than among non-indel residues. Additionally, a study of raw length differences showed that more than half of the length variation in fungal proteins consists of disordered residues. Finally, a model of indels and their immediate surroundings suggested that disordered indels occur in already disordered regions rather than in ordered regions.
Place, publisher, year, edition, pages
Stockholm, Sweden: Department of Biochemistry and Biophysics, Stockholm University, 2013. 32 p.
protein length, repeats, domain repeats, protein evolution, duplication, tandem duplication, intrinsic disorder, intrinsically disordered, variation in length, insertion, deletion, recombination, expansion, contraction
Bioinformatics and Systems Biology
Research subject Biochemistry
Identifiers
URN: urn:nbn:se:su:diva-88553, ISBN: 978-91-7447-670-5, OAI: oai:DiVA.org:su-88553, DiVA: diva2:612043
2013-04-26, Högbomsalen, Geovetenskapens hus, Svante Arrhenius väg 12, Stockholm, 10:00 (English)
Kajava, Andrey, Professor
Elofsson, Arne, Professor
At the time of the doctoral defense, the following papers were unpublished, with the following status: Paper 2: In press. Paper 4: Manuscript.
2013-04-04 2013-03-19 2013-03-29 Bibliographically approved
List of papers