GeneSPIDER - gene regulatory network inference benchmarking with controlled network and data properties
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab). Linköping University, Sweden.
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab). ORCID iD: 0000-0001-8326-6178
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Number of Authors: 5
2017 (English). In: Molecular Biosystems, ISSN 1742-206X, E-ISSN 1742-2051, Vol. 13, no 7, p. 1304-1312. Article in journal (Refereed). Published
Abstract [en]

A key question in network inference that has not been properly answered is what accuracy can be expected for a given biological dataset and inference method. We present GeneSPIDER, a Matlab package for tuning, running, and evaluating inference algorithms that allows independent control of network and data properties to enable data-driven benchmarking. GeneSPIDER is uniquely suited to address this question by first extracting salient properties from the experimental data and then generating simulated networks and data that closely match these properties. It enables data-driven algorithm selection, estimation of inference accuracy from biological data, and more multifaceted benchmarking. Included are generic pipelines for the design of perturbation experiments, bootstrapping, analysis of linear dependence, sample selection, scaling of the signal-to-noise ratio (SNR), and performance evaluation. With GeneSPIDER we aim to move the goal of network inference benchmarks from simple performance measurement to a deeper understanding of how the accuracy of an algorithm is determined by different combinations of network and data properties.

Place, publisher, year, edition, pages
2017. Vol. 13, no 7, p. 1304-1312
National Category
Biological Sciences
Research subject
Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-145342
DOI: 10.1039/c7mb00058h
ISI: 000404471900005
PubMedID: 28485748
OAI: oai:DiVA.org:su-145342
DiVA, id: diva2:1128688
Available from: 2017-07-27 Created: 2017-07-27 Last updated: 2022-02-28. Bibliographically approved
In thesis
1. Towards Reliable Gene Regulatory Network Inference
2019 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Phenotypic traits are now known to stem from the interplay between genetic variables across many if not every level of biology. The field of gene regulatory network (GRN) inference is concerned with understanding the regulatory interactions between genes in a cell, in order to build a model that captures the behaviour of the system. Perturbation biology, whereby genes or RNAs are targeted and their activity altered, is of great value for the GRN field. By first systematically perturbing the system and then reading the system's reaction as a whole, we can feed this data into various methods to reverse engineer the key agents of change.

The initial study lays the groundwork for the rest and deals with finding common ground among the various methods so that their performance can be compared and ranked in an unbiased setting. The GeneSPIDER (GS) MATLAB package is an inference benchmarking platform to which methods can be added via a wrapper and tested against one another. Synthetic datasets and networks spanning a wide range of conditions can be created for this purpose. Evaluating methods across these conditions in the benchmark shows which properties influence the accuracy of which methods, and thus which methods are more suitable under a given characterized condition.

The second study introduces a novel framework, NestBoot, for increasing inference accuracy within the GS environment by independent, nested bootstraps, i.e., repeated inference trials. Under low to medium noise levels, this allows support to be gathered for the links that occur most often, while spurious links are discarded through comparison with an estimated null distribution of shuffled links. While noise continues to plague every method, nested bootstrapping in this way is shown to increase the accuracy of several different methods.
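
The idea of gathering bootstrap support for links and comparing it with a shuffled-link null can be illustrated with a minimal Python sketch. This is not the NestBoot implementation: the function names, the toy inference rule, the demo data, and the choice of the maximum shuffled-link frequency as the cutoff are all assumptions made only for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def toy_infer(samples, threshold=0.5):
    """Hypothetical stand-in for an inference method: report link (i, j)
    when the mean product of genes i and j exceeds a threshold in magnitude."""
    n_genes = len(samples[0])
    links = set()
    for i, j in combinations(range(n_genes), 2):
        score = sum(s[i] * s[j] for s in samples) / len(samples)
        if abs(score) > threshold:
            links.add((i, j))
    return links


def bootstrap_links(samples, infer, n_boot=100, seed=0):
    """Keep links whose bootstrap support exceeds the support seen for
    randomly placed (shuffled) links. The max-of-null cutoff is an assumption."""
    rng = random.Random(seed)
    all_pairs = list(combinations(range(len(samples[0])), 2))
    observed, null = Counter(), Counter()
    for _ in range(n_boot):
        # Resample experiments with replacement and rerun the inference.
        resample = [rng.choice(samples) for _ in samples]
        links = infer(resample)
        observed.update(links)
        # Null model: the same number of links placed at random positions.
        null.update(rng.sample(all_pairs, len(links)))
    cutoff = max(null.values(), default=0) / n_boot
    return {link: count / n_boot
            for link, count in observed.items()
            if count / n_boot > cutoff}


# Toy data: genes 0 and 1 always move together; gene 2 is constant,
# so it only appears linked in unbalanced resamples.
demo_samples = [[sign, sign, 0.9] for sign in (1, -1, 1, -1, 1, -1, 1, -1)]

if __name__ == "__main__":
    print(bootstrap_links(demo_samples, toy_infer))
```

In this toy setting the link between genes 0 and 1 is inferred in every resample, whereas links to gene 2 appear only occasionally and fall below the shuffled-link cutoff.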

The third study applies NestBoot to real data to infer a reliable GRN from a small interfering RNA (siRNA) perturbation dataset covering 40 genes known or suspected to have a role in human cancers. Methods were developed to benchmark the accuracy of an inferred GRN in the absence of a true known GRN, by assessing how well it fits the data compared to a null model of shuffled topologies. A network of high confidence was recovered containing many regulatory links known in the literature, as well as a number of novel links.

The fourth study seeks to infer reliable networks at large scale, utilizing the high-dimensional biological datasets of the LINCS L1000 project. This dataset has too much noise for accurate GRN inference as a whole, hence we developed a method to select a subset that is sufficiently informative to accurately infer GRNs. This is a first step in the direction of identifying probable submodules within a greater genome-scale GRN yet to be uncovered.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2019. p. 40
Keywords
GRN, network inference, biological systems
National Category
Bioinformatics and Computational Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-164642 (URN)
978-91-7797-600-4 (ISBN)
978-91-7797-601-1 (ISBN)
Public defence
2019-04-05, Air & Fire, SciLifeLab, Tomtebodavägen 23A, Solna, 14:00 (English)
Note

At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 3: Manuscript. Paper 4: Manuscript.

Available from: 2019-03-13 Created: 2019-01-17 Last updated: 2025-02-07. Bibliographically approved

Authority records

Morgan, Daniel C.; Sonnhammer, Erik L. L.
