FunCoup 4: new species, data, and visualization
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Number of Authors: 4
2018 (English) In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 46, no D1, p. D601-D607. Article in journal (Refereed), Published
Abstract [en]

This release of the FunCoup database (http://funcoup.sbc.su.se) is the fourth generation of one of the most comprehensive databases for genome-wide functional association networks. These functional associations are inferred by integrating various data types using a naive Bayesian algorithm and orthology-based information transfer across different species. This approach provides high coverage of the included genomes as well as high quality of inferred interactions. In this update of FunCoup we introduce four new eukaryotic species, Schizosaccharomyces pombe, Plasmodium falciparum, Bos taurus, and Oryza sativa, and open the database to the prokaryotic domain by including networks for Escherichia coli and Bacillus subtilis. The latter also allows us to introduce a new class of functional association between genes: co-occurrence in the same operon. We also supplemented the existing classes of functional association (metabolic, signaling, complex, and physical protein interaction) with up-to-date information. In this release we switched to InParanoid v8 as the source of orthology and the basis for calculating phylogenetic profiles. While populating all other evidence types with new data, we introduce a new evidence type based on quantitative mass spectrometry data. Finally, the new JavaScript-based network viewer provides the user with an intuitive and responsive platform to further evaluate the results.

Place, publisher, year, edition, pages
2018. Vol. 46, no D1, p. D601-D607
National Category
Biological Sciences
Research subject
Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-152557; DOI: 10.1093/nar/gkx1138; ISI: 000419550700091; PubMedID: 29165593; OAI: oai:DiVA.org:su-152557; DiVA, id: diva2:1183781
Available from: 2018-02-19 Created: 2018-02-19 Last updated: 2018-04-27 Bibliographically approved
In thesis
1. Global functional association network inference and crosstalk analysis for pathway annotation
2017 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Cell functions are steered by complex interactions of gene products, such as forming a temporary or stable complex, altering gene expression, or catalyzing a reaction. Mapping these interactions is key to understanding biological processes and is therefore the focus of numerous experiments and studies. Small-scale experiments deliver high-quality data but lack coverage, whereas high-throughput techniques cover thousands of interactions but can be error-prone. Unfortunately, each of these approaches can focus on only one type of interaction at a time. This makes experimental mapping of the genome-wide network a cost- and time-intensive procedure. To overcome these problems, different computational approaches have been suggested that integrate multiple data sets and/or different evidence types. This widens the stringent definition of an interaction and introduces a more general term: functional association.

FunCoup is a database for genome-wide functional association networks of Homo sapiens and 16 model organisms. FunCoup distinguishes between five different functional associations: co-membership in a protein complex, physical interaction, participation in the same signaling cascade, participation in the same metabolic process, and, for prokaryotic species, co-occurrence in the same operon. For each class, FunCoup applies naive Bayesian integration of ten different evidence types to predict novel interactions. It further uses orthologs to transfer interaction evidence between species. This considerably increases coverage and allows inference of comprehensive networks even for less well-studied organisms.
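
The naive Bayesian integration described above can be illustrated with a minimal sketch. It assumes each evidence type contributes an independent likelihood ratio for a gene pair, and that these ratios are multiplied onto the prior odds of a functional association. The function name, the prior, and the evidence values are hypothetical and are not taken from FunCoup itself.

```python
import math


def integrate_evidence(prior, likelihood_ratios):
    """Combine independent evidence with naive Bayes.

    prior: prior probability that a gene pair is functionally associated.
    likelihood_ratios: one ratio P(evidence | associated) / P(evidence | not
    associated) per evidence type. Returns the posterior probability.
    """
    # Work in log space so that many evidence types do not overflow.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Hypothetical values for a single gene pair (illustration only).
prior = 0.01
evidence = {
    "coexpression": 4.0,          # supports the association
    "physical_interaction": 10.0,  # supports it strongly
    "phylogenetic_profile": 0.5,   # weakly contradicts it
}
posterior = integrate_evidence(prior, evidence.values())
print(f"posterior probability: {posterior:.4f}")
```

A ratio above 1 raises the posterior and a ratio below 1 lowers it; a ratio of exactly 1 leaves the prior unchanged, which is how a missing or uninformative evidence type would behave in this simplified scheme.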

BinoX is a novel method for pathway analysis and for determining the relation between gene sets using functional association networks. Traditionally, pathway annotation has been done using gene overlap only, but such methods capture only a small part of the whole picture. Placing the gene sets in the context of a network provides additional evidence for pathway analysis, revealing a global picture based on the whole genome.
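
The difference between gene overlap and network context can be shown with a small sketch. It only counts shared genes versus network links between two gene sets; the statistical test that BinoX applies to such counts is not described here and is deliberately left out. The toy network, gene names, and function names are hypothetical.

```python
def gene_overlap(set_a, set_b):
    """Number of genes shared by the two sets (the traditional measure)."""
    return len(set_a & set_b)


def crosstalk_links(edges, set_a, set_b):
    """Number of network links with one endpoint in each gene set."""
    count = 0
    for u, v in edges:
        if (u in set_a and v in set_b) or (u in set_b and v in set_a):
            count += 1
    return count


# Hypothetical undirected functional association network.
edges = [
    ("g1", "g4"),
    ("g2", "g5"),
    ("g3", "g6"),
    ("g1", "g2"),
    ("g6", "g7"),
]
query = {"g1", "g2", "g3"}
pathway = {"g4", "g5", "g6"}

overlap = gene_overlap(query, pathway)
crosstalk = crosstalk_links(edges, query, pathway)
print(f"shared genes: {overlap}, links between sets: {crosstalk}")
```

In this toy case the two sets share no genes, so an overlap-based method sees no relation, while the network shows three links between them. Whether such a count is significant would have to be judged against a suitable random model.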

PathwAX is a web server based on the BinoX algorithm. A user can input a gene set and obtain network-crosstalk-based pathway annotation online. PathwAX uses the FunCoup networks and 280 pre-defined pathways. Most runs take just a few seconds, and the results are summarized in an interactive chart that the user can manipulate to gain further insight into the gene set's pathway associations.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2017
Keywords
biological networks, genome-wide functional association networks, global gene association networks, gene networks, protein networks, functional association, functional coupling, network biology, pathway analysis, pathway annotation, pathway enrichment, network-based enrichment, enrichment
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-146703 (URN); 978-91-7649-950-4 (ISBN); 978-91-7649-951-1 (ISBN)
Public defence
2017-10-20, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 13:00 (English)
Note

At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 2: Manuscript.

Available from: 2017-09-27 Created: 2017-09-06 Last updated: 2018-04-27 Bibliographically approved
2. Functional association networks for disease gene prediction
2017 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Mapping of the human genome has been instrumental in understanding diseases caused by changes in single genes. However, disease mechanisms involving multiple genes have proven to be much more elusive. Their complexity emerges from interactions of intracellular molecules and makes them immune to the traditional reductionist approach. Only by modelling this complex interaction pattern using networks is it possible to understand the emergent properties that give rise to diseases.

The overarching term used to describe both physical and indirect interactions involved in the same functions is functional association. FunCoup is one of the most comprehensive networks of functional association. It uses a naïve Bayesian approach to integrate high-throughput experimental evidence of intracellular interactions in humans and multiple model organisms. In the first update, both the coverage and the quality of the interactions were increased and a feature for comparing interactions across species was added. The latest update involved a complete overhaul of all data sources, including a refinement of the training data and addition of new class and sources of interactions as well as six new species.

Disease-specific changes in genes can be identified using high-throughput genome-wide studies of patients and healthy individuals. To understand the underlying mechanisms that produce these changes, they can be mapped to collections of genes with known functions, such as pathways. BinoX was developed to map altered genes to pathways using the topology of FunCoup. This approach combined with a new random model for comparison enables BinoX to outperform traditional gene-overlap-based methods and other network-based techniques.

Results from high-throughput experiments are challenged by noise and biases, resulting in many false positives. Statistical attempts to correct for these challenges have led to a reduction in coverage. Both limitations can be remedied using prioritisation tools such as MaxLink, which ranks genes using guilt by association in the context of a functional association network. MaxLink’s algorithm was generalised to work with any disease phenotype and its statistical foundation was strengthened. MaxLink’s predictions were validated experimentally using FRET.

The availability of prioritisation tools without an appropriate way to compare them makes it difficult to select the correct tool for a problem domain. A benchmark to assess performance of prioritisation tools in terms of their ability to generalise to new data was developed. FunCoup was used for prioritisation while testing was done using cross-validation of terms derived from Gene Ontology. This resulted in a robust and unbiased benchmark for evaluation of current and future prioritisation tools. Surprisingly, previously superior tools based on global network structure were shown to be inferior to a local network-based tool when performance was analysed on the most relevant part of the output, i.e. the top ranked genes.

This thesis demonstrates how a network that models the intricate biology of the cell can contribute with valuable insights for researchers that study diseases with complex genetic origins. The developed tools will help the research community to understand the underlying causes of such diseases and discover new treatment targets. The robust way to benchmark such tools will help researchers to select the proper tool for their problem domain.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2017. p. 64
Keywords
network biology, biological networks, network prediction, functional association, functional coupling, network integration, functional association networks, genome-wide association networks, gene networks, protein networks, fret, functional enrichment analysis, network cross-talk, pathway annotation, gene prioritisation, network-based gene prioritization, benchmarking
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-147217 (URN); 978-91-7649-976-4 (ISBN); 978-91-7649-977-1 (ISBN)
Public defence
2017-11-10, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 14:00 (English)
Note

At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 5: Manuscript. Paper 6: Manuscript.

Available from: 2017-10-18 Created: 2017-09-29 Last updated: 2018-04-27 Bibliographically approved
3. Functional Inference from Orthology and Domain Architecture
2018 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Proteins are the basic building blocks of all living organisms. They play a central role in determining the structure of living beings and are required for essential chemical reactions. One of the main challenges in bioinformatics is to characterize the function of all proteins. The problem of understanding protein function can be approached by understanding their evolutionary history. Orthology analysis plays an important role in studying the evolutionary relation of proteins. Proteins are termed orthologs if they derive from a single gene in the species' last common ancestor, i.e. if they were separated by a speciation event. Orthologs are useful because they retain their function more often than other homologs. 

Inference of a complete set of orthologs for many species is computationally intensive. Currently, the fastest algorithms rely on graph-based approaches, which compare all-vs-all sequences and then cluster top hits into groups of orthologs. The initial step of performing all-vs-all comparisons is usually the primary computational challenge as it scales quadratically with the number of species. 

A new, more scalable and less computationally demanding method was developed to solve this problem without sacrificing accuracy. The Hieranoid 2 algorithm reduces computational complexity to almost linear by overcoming the necessity to perform all-vs-all similarity searches. The algorithm progresses along a known species tree, from leaves to root. Starting at the leaves, ortholog groups are predicted conventionally and then summarized at internal nodes to form pseudo-species. These pseudo-species are then re-used to search against other (pseudo-)species higher in the tree. This way the algorithm aggregates new ortholog groups hierarchically. The hierarchy is a natural structure to store and view large multi-species ortholog groups, and provides a complete picture of inferred evolutionary events. 
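
The leaves-to-root progression described above can be sketched as a post-order walk over a species tree. This is only an illustration of why the number of comparisons drops, not the Hieranoid 2 implementation: the tree, the species names, and the function names are hypothetical, each "comparison" stands for one search between two (pseudo-)species, and the actual ortholog prediction and pseudo-species summarization are omitted.

```python
from itertools import combinations


def leaves(tree):
    """Return all species names (leaves) of a nested-tuple species tree."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree:
        result.extend(leaves(child))
    return result


def tree_guided_comparisons(tree, log=None):
    """Walk the tree from leaves to root.

    At each internal node the children are compared with each other and then
    merged into a pseudo-species. Returns the node's name and the list of
    comparisons performed so far.
    """
    if log is None:
        log = []
    if isinstance(tree, str):
        return tree, log
    # Resolve the children first (post-order), so leaves are handled first.
    child_names = [tree_guided_comparisons(child, log)[0] for child in tree]
    for a, b in combinations(child_names, 2):
        log.append((a, b))
    pseudo_species = "(" + "+".join(child_names) + ")"
    return pseudo_species, log


# Hypothetical five-species tree written as nested tuples.
species_tree = ((("human", "mouse"), "zebrafish"), ("fly", "worm"))

root, comparisons = tree_guided_comparisons(species_tree)
all_pairs = list(combinations(leaves(species_tree), 2))

print(f"all-vs-all species pairs: {len(all_pairs)}")
print(f"tree-guided comparisons:  {len(comparisons)}")
for a, b in comparisons:
    print(f"  {a} vs {b}")
```

For a binary tree with n species the walk performs n - 1 comparisons, against n(n - 1)/2 species pairs for an all-vs-all approach. Each pseudo-species search is not necessarily as cheap as a single-species search, so this count shows the trend rather than the exact cost.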

To facilitate explorative analysis of hierarchical groups of orthologs, a new online tool was created. The HieranoiDB website provides precomputed hierarchical groups of orthologs for a set of 66 species. It allows the user to search for orthology assignments using protein description, protein sequence, or species. Evolutionary events and meta information are added to the hierarchical groups of orthologs, which are shown graphically as interactive trees. This representation allows exploring, searching, and easier visual inspection of multi-species ortholog groups.

The majority of orthology prediction methods treat the whole protein sequence as a single evolutionary unit. However, proteins are often composed of individual units, called protein domains, that can have different evolutionary histories. To extend the full-sequence-based methodology to a domain-aware method, a new approach called Domainoid is proposed. Here, domains are extracted from full-length sequences and subjected to orthology inference. This allows Domainoid to find orthology that would be missed by a full-sequence approach.

Networks are a convenient graphical representation for showing a large number of functional associations between genes or proteins. They allow various analyses of graph properties, and can help visualize complex relationships. A framework for inferring comprehensive functional association networks was developed, called FunCoup. A major difference compared to other networks is FunCoup's extensive use of orthology relationships between species, which significantly boosts its coverage. Using naïve Bayesian classifiers to integrate 10 different evidence types and orthology transfer, FunCoup captures functional associations of many types, and provides comprehensive networks for 17 species across five gold-standards.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2018. p. 38
Keywords
Orthology, Functional coupling networks, Association networks, Hierarchical groups of orthologs
National Category
Bioinformatics (Computational Biology)
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-155096 (URN); 978-91-7797-252-5 (ISBN); 978-91-7797-253-2 (ISBN)
Public defence
2018-06-12, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 14:00 (English)
Note

At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript.

Available from: 2018-05-18 Created: 2018-04-24 Last updated: 2018-05-15 Bibliographically approved

Authors
Ogris, Christoph; Guala, Dimitri; Kaduk, Mateusz; Sonnhammer, Erik L. L.