GeneSNAKE: a Python package for benchmarking and simulation of gene regulatory networks and expression data.
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. (Sonnhammer) ORCID iD: 0000-0002-6362-0659
Stockholm University, Faculty of Science, Stockholm Resilience Centre. ORCID iD: 0000-0001-8492-5649
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. (Sonnhammer) ORCID iD: 0000-0002-2497-194X
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. (Sonnhammer) ORCID iD: 0000-0002-9015-5588
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Understanding how genes interact with and regulate each other is a key challenge in systems biology. One of the primary ways to study this is through gene regulatory networks (GRNs). GRN inference, however, faces many challenges, such as the complexity of gene regulation and high noise levels, so effective tools for evaluating inference methods are needed. Evaluation requires data that corresponds to a known GRN and covers various conditions and experimental setups, which is only attainable via simulation. Existing tools for simulating data for GRN inference are limited either in how networks are constructed or in how data is produced, and often offer little flexibility for adjusting the algorithm or parameters.

To overcome these issues, we present GeneSNAKE, a Python package that lets users generate biologically realistic GRNs and simulate expression data from a GRN for benchmarking purposes. GeneSNAKE allows the user to control a wide range of network and data properties. It improves on previous work in the field by adding a perturbation model that supports a greater range of perturbation schemes, along with the ability to control noise and modify the perturbation strength.

For benchmarking, GeneSNAKE offers a number of functions both for comparing a true GRN to an inferred GRN and for studying properties of data and GRN models. These functions can also be used to study properties of biological data in order to produce simulated data with more realistic properties. GeneSNAKE is an open-source, comprehensive simulation and benchmarking package with powerful capabilities that are not combined in any other single package, and thanks to its Python implementation it is simple for users to extend and modify.
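
As an illustration of what comparing a true GRN to an inferred GRN can involve, the following is a minimal sketch in plain NumPy. It does not use GeneSNAKE and does not reflect its API: the function name `compare_grns`, the adjacency-matrix representation, and the `tol` threshold are assumptions made for this example only. It scores link presence (ignoring sign and weight) with precision, recall and F1.

```python
import numpy as np


def compare_grns(true_grn, inferred_grn, tol=0.0):
    """Score an inferred GRN against a true GRN by link presence.

    Both inputs are square adjacency matrices (hypothetical representation);
    an entry whose absolute value exceeds `tol` counts as a link.
    """
    true_links = np.abs(np.asarray(true_grn, dtype=float)) > tol
    pred_links = np.abs(np.asarray(inferred_grn, dtype=float)) > tol

    tp = int(np.sum(true_links & pred_links))   # links found in both
    fp = int(np.sum(~true_links & pred_links))  # inferred but not true
    fn = int(np.sum(true_links & ~pred_links))  # true but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy 3-gene example with made-up weights.
true_grn = np.array([[-1.0, 0.0, 0.5],
                     [0.8, -1.0, 0.0],
                     [0.0, 0.0, -1.0]])
inferred_grn = np.array([[-0.9, 0.3, 0.4],
                         [0.0, -1.1, 0.0],
                         [0.0, 0.0, -0.7]])
scores = compare_grns(true_grn, inferred_grn)
print(scores)
```

Scoring only link presence is the simplest choice; a fuller benchmark might also score the sign of each link or sweep the threshold to obtain a precision-recall curve.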

Keywords [en]
Gene regulatory networks, simulation, benchmarking, method development
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-221154
OAI: oai:DiVA.org:su-221154
DiVA, id: diva2:1797458
Available from: 2023-09-14 Created: 2023-09-14 Last updated: 2023-09-14
In thesis
1. In silico modelling for refining gene regulatory network inference
2023 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Gene regulation is at the centre of all cellular functions, regulating the cell's healthy and pathological responses. The interconnected system of regulatory interactions is known as the gene regulatory network (GRN), where genes influence each other to maintain strict and robust control. Today a large number of methods exist for inferring GRNs, which necessitates benchmarking to determine which method is most suitable for a specific goal. Paper I presents such a benchmark focusing on the effect of using known perturbations to infer GRNs. 

A further challenge when studying GRNs is that experimental data contains high levels of noise and that artefacts may be introduced by the experiment itself. The LSCON method was developed in paper II to reduce the effect of one such artefact that can occur if the expression of a gene shows no or minimal change across most or all experiments. 

With few fully determined biological GRNs available, it is problematic to use these to evaluate an inference method's correctness. Instead, the GRN field relies on simulated data, using a known GRN and generating the corresponding data. When simulating GRNs, capturing the topological properties of the biological GRN is vital. The FFLatt algorithm was developed in paper III to create scale-free, feed-forward loop motif-enriched GRNs, capturing two of the most prominent topological features in biological GRNs.

Once a high-quality GRN is obtained, the next step is to simulate gene expression data corresponding to the GRN. In paper IV, building on the FFLatt method, an open-source Python simulation tool called GeneSNAKE was developed to generate expression data for benchmarking purposes. GeneSNAKE allows the user to control a wide range of network and data properties and improves on previous tools by featuring a variety of perturbation schemes along with the ability to control noise and modify the perturbation strength.
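
To make the idea of generating expression data from a known GRN with controllable noise and perturbation strength concrete, here is a minimal sketch in plain NumPy. It is not GeneSNAKE's model or API. It assumes a simple linear system dx/dt = A x + p, whose steady state is y = -A^{-1} p, with Gaussian noise added afterwards. The function name `simulate_steady_state` and the parameters `noise_sd` and `strength` are hypothetical.

```python
import numpy as np


def simulate_steady_state(grn, perturbation, noise_sd=0.1, strength=1.0, seed=0):
    """Simulate noisy steady-state expression for a linear GRN model.

    Assumed model (illustrative only): dx/dt = A x + p, so at steady state
    A y + p = 0. Each column of `perturbation` is one experiment.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(grn, dtype=float)
    p = strength * np.asarray(perturbation, dtype=float)

    # Solve A y = -p for every experiment at once (requires invertible A).
    clean = np.linalg.solve(a, -p)

    # Additive Gaussian measurement noise with standard deviation noise_sd.
    noise = rng.normal(0.0, noise_sd, size=clean.shape)
    return clean + noise, clean


# Toy 3-gene GRN with negative self-regulation (made-up weights).
grn = np.array([[-1.0, 0.0, 0.5],
                [0.8, -1.0, 0.0],
                [0.0, 0.0, -1.0]])
# One knockdown-style experiment per gene: perturb each gene in turn.
perturbation = -np.eye(3)

data, clean = simulate_steady_state(grn, perturbation, noise_sd=0.1, strength=1.0)
print(data.round(3))
```

Here `strength` scales every perturbation uniformly and `noise_sd` sets the noise level, which is one simple way to expose the two controls mentioned above. More elaborate schemes (multi-gene perturbations, replicates, non-linear dynamics) would need a richer model.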

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2023. p. 49
Keywords
Gene regulatory networks, simulation, benchmarking, method development
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-221155 (URN)
978-91-8014-504-6 (ISBN)
978-91-8014-505-3 (ISBN)
Public defence
2023-10-27, Air and Fire, SciLifeLab, Tomtebodavägen 23A, Solna, 14:00 (English)
Opponent
Supervisors
Available from: 2023-10-04 Created: 2023-09-14 Last updated: 2023-09-29 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Hillerton, Thomas; Sonnhammer, Erik L. L.

Search in DiVA

By author/editor
Hillerton, Thomas; Erik K., Zhivkoplias; Garbulowski, Mateusz; Sonnhammer, Erik L. L.
By organisation
Department of Biochemistry and Biophysics; Stockholm Resilience Centre
Bioinformatics and Systems Biology
