Improved orthology inference with Hieranoid 2
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Number of Authors: 2
2017 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 33, no 8, p. 1154-1159. Article in journal (Refereed). Published
Abstract [en]

Motivation: The initial step in many orthology inference methods is the computationally demanding establishment of all pairwise protein similarities across all analysed proteomes. The quadratic scaling with proteomes has become a major bottleneck. A remedy is offered by the Hieranoid algorithm which reduces the complexity to linear by hierarchically aggregating ortholog groups from InParanoid along a species tree.

Results: We have further developed the Hieranoid algorithm in many ways. Major improvements have been made to the construction of multiple sequence alignments and consensus sequences. Hieranoid version 2 was evaluated with standard benchmarks that reveal a dramatic increase in the coverage/accuracy tradeoff over version 1, such that it now compares favourably with the best methods. The new parallelized cluster mode allows Hieranoid to be run on large data sets in a much shorter timespan than InParanoid, yet at similar accuracy.

Place, publisher, year, edition, pages
2017. Vol. 33, no 8, p. 1154-1159
National Category
Biological Sciences
Research subject
Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-144722
DOI: 10.1093/bioinformatics/btw774
ISI: 000400985900007
PubMedID: 28096085
OAI: oai:DiVA.org:su-144722
DiVA, id: diva2:1127955
Available from: 2017-07-20 Created: 2017-07-20 Last updated: 2018-04-27. Bibliographically approved
In thesis
1. Functional Inference from Orthology and Domain Architecture
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Proteins are the basic building blocks of all living organisms. They play a central role in determining the structure of living beings and are required for essential chemical reactions. One of the main challenges in bioinformatics is to characterize the function of all proteins. The problem of understanding protein function can be approached by understanding their evolutionary history. Orthology analysis plays an important role in studying the evolutionary relation of proteins. Proteins are termed orthologs if they derive from a single gene in the species' last common ancestor, i.e. if they were separated by a speciation event. Orthologs are useful because they retain their function more often than other homologs. 

Inference of a complete set of orthologs for many species is computationally intensive. Currently, the fastest algorithms rely on graph-based approaches, which compare all-vs-all sequences and then cluster top hits into groups of orthologs. The initial step of performing all-vs-all comparisons is usually the primary computational challenge as it scales quadratically with the number of species. 

A new, more scalable and less computationally demanding method was developed to solve this problem without sacrificing accuracy. The Hieranoid 2 algorithm reduces computational complexity to almost linear by overcoming the necessity to perform all-vs-all similarity searches. The algorithm progresses along a known species tree, from leaves to root. Starting at the leaves, ortholog groups are predicted conventionally and then summarized at internal nodes to form pseudo-species. These pseudo-species are then re-used to search against other (pseudo-)species higher in the tree. This way the algorithm aggregates new ortholog groups hierarchically. The hierarchy is a natural structure to store and view large multi-species ortholog groups, and provides a complete picture of inferred evolutionary events. 
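The leaves-to-root progression described above can be illustrated with a minimal sketch. This is not the Hieranoid 2 implementation: the `Node` class, the `aggregate` function, and the string labels used for pseudo-species are hypothetical, and the sketch only counts which similarity searches would be run, without modelling their cost or the ortholog prediction itself.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A node in a species tree; leaves are species, internal nodes become pseudo-species."""
    name: str
    children: List["Node"] = field(default_factory=list)


def aggregate(node: Node, searches: List[Tuple[str, str]]) -> str:
    """Visit the tree from leaves to root and record each similarity search.

    Returns the label of the (pseudo-)species that represents `node`.
    """
    if not node.children:
        return node.name  # a leaf is an ordinary species
    # Resolve all children first (post-order), so searches run bottom-up.
    labels = [aggregate(child, searches) for child in node.children]
    # Compare the children of this node with each other; for a binary
    # tree this is exactly one search per internal node.
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            searches.append((labels[i], labels[j]))
    # Hypothetical stand-in for summarising ortholog groups into a pseudo-species.
    return "(" + "+".join(labels) + ")"


def all_vs_all(n_species: int) -> int:
    """Number of unordered species pairs in an all-vs-all comparison."""
    return n_species * (n_species - 1) // 2


# Example: a balanced binary tree over four species A, B, C, D.
tree = Node("root", [
    Node("n1", [Node("A"), Node("B")]),
    Node("n2", [Node("C"), Node("D")]),
])
searches: List[Tuple[str, str]] = []
root_label = aggregate(tree, searches)
print(searches)
print(len(searches), "hierarchical searches vs", all_vs_all(4), "all-vs-all pairs")
```

For a binary tree with n leaves there are n - 1 internal nodes, so the number of searches grows linearly with the number of species, whereas the number of all-vs-all species pairs grows as n(n - 1)/2.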

To facilitate explorative analysis of hierarchical groups of orthologs, a new online tool was created. The HieranoiDB website provides precomputed hierarchical groups of orthologs for a set of 66 species. It allows the user to search for orthology assignments using protein description, protein sequence, or species. Evolutionary events and meta information are added to the hierarchical groups of orthologs, which are shown graphically as interactive trees. This representation allows exploring, searching, and easier visual inspection of multi-species ortholog groups.

The majority of orthology prediction methods focus on treating the whole protein sequence as a single evolutionary unit. However, proteins are often composed of individual units, called protein domains, that can have different evolutionary histories. To extend the full sequence based methodology to a domain-aware method, a new approach called Domainoid is proposed. Here, domains are extracted from full-length sequences and subjected to orthology inference. This allows Domainoid to find orthology that would be missed by a full sequence approach.
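The domain extraction step can be pictured with a small sketch. It assumes domain annotations are available as 1-based, inclusive start and end coordinates. The `Domain` type, the `extract_domains` function, the identifier format, and the example sequence are all hypothetical and do not reflect Domainoid's actual input or output formats.

```python
from typing import Dict, List, NamedTuple


class Domain(NamedTuple):
    """A hypothetical domain annotation on a protein sequence."""
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def extract_domains(protein_id: str, sequence: str, domains: List[Domain]) -> Dict[str, str]:
    """Cut each annotated domain out of the full-length sequence.

    Each domain becomes its own entry, so it can be passed to an
    orthology inference step as an independent evolutionary unit.
    """
    extracted = {}
    for d in domains:
        key = f"{protein_id}|{d.name}|{d.start}-{d.end}"
        # Convert 1-based inclusive coordinates to a Python slice.
        extracted[key] = sequence[d.start - 1:d.end]
    return extracted


# Made-up protein with two made-up domain annotations.
result = extract_domains(
    "P1",
    "MKTAYIAKQR",
    [Domain("domA", 1, 4), Domain("domB", 6, 10)],
)
print(result)
```

Treating each extracted segment as a separate sequence is what lets two proteins be linked through one shared domain even when the rest of their sequences differ.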

Networks are a convenient graphical representation for showing a large number of functional associations between genes or proteins. They allow various analyses of graph properties, and can help visualize complex relationships. A framework for inferring comprehensive functional association networks was developed, called FunCoup. A major difference compared to other networks is FunCoup's extensive use of orthology relationships between species, which significantly boosts its coverage. Using naïve Bayesian classifiers to integrate 10 different evidence types and orthology transfer, FunCoup captures functional associations of many types, and provides comprehensive networks for 17 species across five gold-standards.
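The idea behind naïve Bayesian integration of several evidence types can be sketched generically: under the naïve independence assumption, the likelihood ratio of each evidence type multiplies the prior odds of a functional association. The function name, the evidence labels, and all numbers below are hypothetical. This is not FunCoup's scoring scheme, only an illustration of the general technique.

```python
import math
from typing import Dict


def naive_bayes_posterior(prior: float, likelihood_ratios: Dict[str, float]) -> float:
    """Combine independent evidence types into a posterior probability.

    `prior` is the prior probability of an association (0 < prior < 1).
    Each likelihood ratio is P(evidence | linked) / P(evidence | not linked).
    """
    # Work in log-odds space so the per-evidence contributions add up.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios.values():
        log_odds += math.log(ratio)
    # Convert the log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical evidence for one gene pair; the values are made up.
evidence = {"coexpression": 2.0, "protein_interaction": 3.0}
posterior = naive_bayes_posterior(0.5, evidence)
print(round(posterior, 4))
```

Evidence transferred from orthologs in another species would enter such a scheme as additional likelihood ratios, which is one way orthology transfer can raise coverage.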

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2018. p. 38
Keywords
Orthology, Functional coupling networks, Association networks, Hierarchical groups of orthologs
National Category
Bioinformatics (Computational Biology)
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-155096 (URN)
978-91-7797-252-5 (ISBN)
978-91-7797-253-2 (ISBN)
Public defence
2018-06-12, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 14:00 (English)
Opponent
Supervisors
Note

At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript.

Available from: 2018-05-18 Created: 2018-04-24 Last updated: 2018-05-15. Bibliographically approved

Open Access in DiVA

No full text in DiVA

By author/editor
Kaduk, Mateusz; Sonnhammer, Erik