Endre søk
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Fast and accurate gene regulatory network inference by normalized least squares regression
Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).ORCID-id: 0000-0002-6362-0659
Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).ORCID-id: 0000-0001-8284-356x
Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).ORCID-id: 0000-0002-9015-5588
Rekke forfattare: 52022 (engelsk)Inngår i: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 38, nr 8, s. 2263-2268, artikkel-id btac103Artikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

Motivation: Inferring an accurate gene regulatory network (GRN) has long been a key goal in the field of systems biology. To do this, it is important to find a suitable balance between the maximum number of true positive and the minimum number of false-positive interactions. Another key feature is that the inference method can handle the large size of modern experimental data, meaning the method needs to be both fast and accurate. The Least Squares Cut-Off (LSCO) method can fulfill both these criteria, however as it is based on least squares it is vulnerable to known issues of amplifying extreme values, small or large. In GRN this manifests itself with genes that are erroneously hyper-connected to a large fraction of all genes due to extremely low value fold changes.

Results: We developed a GRN inference method called Least Squares Cut-Off with Normalization (LSCON) that tackles this problem. LSCON extends the LSCO algorithm by regularization to avoid hyper-connected genes and thereby reduce false positives. The regularization used is based on normalization, which removes effects of extreme values on the fit. We benchmarked LSCON and compared it to Genie3, LASSO, LSCO and Ridge regression, in terms of accuracy, speed and tendency to predict hyper-connected genes. The results show that LSCON achieves better or equal accuracy compared to LASSO, the best existing method, especially for data with extreme values. Thanks to the speed of least squares regression, LSCON does this an order of magnitude faster than LASSO.

sted, utgiver, år, opplag, sider
2022. Vol. 38, nr 8, s. 2263-2268, artikkel-id btac103
HSV kategori
Identifikatorer
URN: urn:nbn:se:su:diva-203209DOI: 10.1093/bioinformatics/btac103ISI: 000761598600001PubMedID: 35176145Scopus ID: 2-s2.0-85128723779OAI: oai:DiVA.org:su-203209DiVA, id: diva2:1647800
Tilgjengelig fra: 2022-03-28 Laget: 2022-03-28 Sist oppdatert: 2023-09-14bibliografisk kontrollert
Inngår i avhandling
1. In silico modelling for refining gene regulatory network inference
Åpne denne publikasjonen i ny fane eller vindu >>In silico modelling for refining gene regulatory network inference
2023 (engelsk)Doktoravhandling, med artikler (Annet vitenskapelig)
Abstract [en]

Gene regulation is at the centre of all cellular functions, regulating the cell's healthy and pathological responses. The interconnected system of regulatory interactions is known as the gene regulatory network (GRN), where genes influence each other to maintain strict and robust control. Today a large number of methods exist for inferring GRNs, which necessitates benchmarking to determine which method is most suitable for a specific goal. Paper I presents such a benchmark focusing on the effect of using known perturbations to infer GRNs. 

A further challenge when studying GRNs is that experimental data contains high levels of noise and that artefacts may be introduced by the experiment itself. The LSCON method was developed in paper II to reduce the effect of one such artefact that can occur if the expression of a gene shows no or minimal change across most or all experiments. 

 With few fully determined biological GRNs available, it is problematic to use these to evaluate an inference method's correctness. Instead, the GRN field relies on simulated data, using a known GRN and generating the corresponding data. When simulating GRNs, capturing the topological properties of the biological GRN is vital. The FFLatt algorithm was developed in paper III to create scale-free, feed-forward loop motif-enriched GRNs, capturing two of the most prominent topological features in biological GRNs. 

 Once a high-quality GRN is obtained, the next step is to simulate gene expression data corresponding to the GRN. In paper IV, building on the FFLatt method, an open-source Python simulation tool called GeneSNAKE was developed to generate expression data for benchmarking purposes. GeneSNAKE allows the user to control a wide range of network and data properties and improves on previous tools by featuring a variety of perturbation schemes along with the ability to control noise and modify the perturbation strength.

sted, utgiver, år, opplag, sider
Stockohlm: Department of Biochemistry and Biophysics, Stockholm University, 2023. s. 49
Emneord
Gene regulatory networks, simulation, benchmarking, method development
HSV kategori
Forskningsprogram
biokemi med inriktning mot bioinformatik
Identifikatorer
urn:nbn:se:su:diva-221155 (URN)978-91-8014-504-6 (ISBN)978-91-8014-505-3 (ISBN)
Disputas
2023-10-27, Air and Fire, SciLifeLab, Tomtebodavägen 23A, Solna, 14:00 (engelsk)
Opponent
Veileder
Tilgjengelig fra: 2023-10-04 Laget: 2023-09-14 Sist oppdatert: 2025-02-07bibliografisk kontrollert

Open Access i DiVA

Fulltekst mangler i DiVA

Andre lenker

Forlagets fulltekstPubMedScopus

Person

Hillerton, ThomasSeçilmiş, DenizSonnhammer, Erik L. L.

Søk i DiVA

Av forfatter/redaktør
Hillerton, ThomasSeçilmiş, DenizSonnhammer, Erik L. L.
Av organisasjonen
I samme tidsskrift
Bioinformatics

Søk utenfor DiVA

GoogleGoogle Scholar

doi
pubmed
urn-nbn

Altmetric

doi
pubmed
urn-nbn
Totalt: 76 treff
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf