Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Domain architecture conservation in orthologs
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. (Sonnhammer (SBC))
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics.
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. (Sonnhammer (SBC))
2011 (English)In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 12, 326- p.Article in journal (Refereed) Published
Abstract [en]

Background. As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs.

Results. The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent.

Conclusions. On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance.

Place, publisher, year, edition, pages
2011. Vol. 12, 326- p.
Keyword [en]
orthologous proteins, domain architecture, homologs
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry with Emphasis on Theoretical Chemistry
Identifiers
URN: urn:nbn:se:su:diva-60133DOI: 10.1186/1471-2105-12-326ISI: 000294948100001OAI: oai:DiVA.org:su-60133DiVA: diva2:433385
Available from: 2011-08-09 Created: 2011-08-09 Last updated: 2017-12-08Bibliographically approved
In thesis
1. The relationship between orthology, protein domain architecture and protein function
Open this publication in new window or tab >>The relationship between orthology, protein domain architecture and protein function
2011 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Lacking experimental data, protein function is often predicted from evolutionary and protein structure theory. Under the 'domain grammar' hypothesis the function of a protein follows from the domains it encodes. Under the 'orthology conjecture', orthologs, related through species formation, are expected to be more functionally similar than paralogs, which are homologs in the same or different species descended from a gene duplication event. However, these assumptions have not thus far been systematically evaluated.

To test the 'domain grammar' hypothesis, we built models for predicting function from the domain combinations present in a protein, and demonstrated that multi-domain combinations imply functions that the individual domains do not. We also developed a novel gene-tree based method for reconstructing the evolutionary histories of domain architectures, to search for cases of architectures that have arisen multiple times in parallel, and found this to be more common than previously reported.

To test the 'orthology conjecture', we first benchmarked methods for homology inference under the obfuscating influence of low-complexity regions, in order to improve the InParanoid orthology inference algorithm. InParanoid was then used to test the relative conservation of functionally relevant properties between orthologs and paralogs at various evolutionary distances, including intron positions, domain architectures, and Gene Ontology functional annotations.

We found an increased conservation of domain architectures in orthologs relative to paralogs, in support of the 'orthology conjecture' and the 'domain grammar' hypotheses acting in tandem. However, equivalent analysis of Gene Ontology functional conservation yielded spurious results, which may be an artifact of species-specific annotation biases in functional annotation databases. I discuss possible ways of circumventing this bias so the 'orthology conjecture' can be tested more conclusively.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2011. 112 p.
Keyword
homology, orthology, paralogy, gene duplications, protein function prediction, low-complexity regions, protein domains, domain architecture evolution, introns, intron position conservation, orthology conjecture, domain grammar hypothesis
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry with Emphasis on Theoretical Chemistry
Identifiers
urn:nbn:se:su:diva-62152 (URN)978-91-7447-350-6 (ISBN)
Public defence
2011-10-24, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 14:00 (English)
Opponent
Supervisors
Note
At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 6: Epub ahead of print.Available from: 2011-10-02 Created: 2011-09-09 Last updated: 2011-10-06Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full text

Search in DiVA

By author/editor
Forslund, KristofferSonnhammer, Erik L. L.
By organisation
Department of Biochemistry and Biophysics
In the same journal
BMC Bioinformatics
Bioinformatics and Systems Biology

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 130 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf