Domain Rearrangements in Protein Evolution
2005 (English). In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 353, no. 4, pp. 911-923. Article in journal (Refereed), Published.
Most eukaryotic proteins are multi-domain proteins created through gene fusions, deletions and internal repetitions. Investigating such evolutionary events requires a method for finding the domain architecture from which each protein originates. We therefore defined a novel measure, domain distance, calculated as the number of domains that differ between two domain architectures. Using this measure, we studied the evolutionary events that distinguish a protein from its closest ancestor and found that indels are more common than internal repetitions and that the exchange of a domain is rare. Indels and repetitions are common at both the N- and C-termini, whereas they are rare between domains. The evolution of the majority of multi-domain proteins can be explained by stepwise insertions of single domains, with the exception of repeats, which are sometimes duplicated as several domains in tandem. We show that domain distances agree with sequence similarity and with semantic similarity based on Gene Ontology annotations. In addition, we demonstrate the use of the domain distance measure to build evolutionary trees. Finally, the evolution of multi-domain proteins is exemplified by a closer study of two protein families, non-receptor tyrosine kinases and RhoGEFs.
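The abstract defines domain distance only as the number of domains that differ between two domain architectures; it does not spell out the algorithm. The following is a minimal sketch of one possible reading, assuming each architecture is an ordered list of domain names and that only insertions and deletions are counted (so an exchange costs two). The function name `domain_distance` and the example architectures are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def domain_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the domain insertions/deletions needed to turn architecture a into b.

    Assumption: this indel-only, order-preserving count is one plausible
    reading of "number of domains that differ"; the published measure
    may be defined differently.
    """
    m, n = len(a), len(b)
    # lcs[i][j] = length of the longest common subsequence of a[:i] and b[:j]
    lcs = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                lcs[i][j] = lcs[i - 1][j - 1] + 1
            else:
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])
    # Every domain outside the common subsequence is an insertion or a deletion.
    return m + n - 2 * lcs[m][n]


# Hypothetical architectures, written N-terminus to C-terminus.
ancestor = ["SH3", "SH2", "Pkinase_Tyr"]
n_terminal_insertion = ["PH", "SH3", "SH2", "Pkinase_Tyr"]
tandem_repeat = ["SH3", "SH3", "SH3", "SH2", "Pkinase_Tyr"]

print(domain_distance(ancestor, n_terminal_insertion))
print(domain_distance(ancestor, tandem_repeat))
```

Under this reading, a single-domain insertion at a terminus gives a distance of 1, while a tandem duplication of two repeat units gives 2, which matches the abstract's distinction between stepwise single-domain insertions and repeats duplicated several at a time.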
Keywords: protein evolution; multi-domain proteins; proteome; GOGraph; Pfam
Identifiers
URN: urn:nbn:se:su:diva-25576
DOI: 10.1016/j.jmb.2005.08.067
OAI: oai:DiVA.org:su-25576
DiVA: diva2:200000
Part of urn:nbn:se:su:diva-8295
2008-11-06 2008-10-27 2014-11-10 Bibliographically approved