  • 1. Abraham, Mark
    Apostolov, Rossen
    Barnoud, Jonathan
    Bauer, Paul
    Blau, Christian
    Bonvin, Alexandre M. J. J.
    Chavent, Matthieu
    Chodera, John
    Condic-Jurkic, Karmen
    Delemotte, Lucie
    Grubmueller, Helmut
    Howard, Rebecca J.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Jordan, E. Joseph
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Lindahl, Erik
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab). KTH Royal Institute of Technology, Sweden.
    Ollila, O. H. Samuli
    Selent, Jana
    Smith, Daniel G. A.
    Stansfeld, Phillip J.
    Tiemann, Johanna K. S.
    Trellet, Mikael
    Woods, Christopher
    Zhmurov, Artem
    Sharing Data from Molecular Simulations. 2019. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 59, no. 10, pp. 4093-4099. Journal article (Refereed)
    Abstract [en]

    Given the need for modern researchers to produce open, reproducible scientific output, the lack of standards and best practices for sharing data and workflows used to produce and analyze molecular dynamics (MD) simulations has become an important issue in the field. There are now multiple well-established packages to perform molecular dynamics simulations, often highly tuned for exploiting specific classes of hardware, each with strong communities surrounding them, but with very limited interoperability/transferability options. Thus, the choice of the software package often dictates the workflow for both simulation production and analysis. The level of detail in documenting the workflows and analysis code varies greatly in published work, hindering reproducibility of the reported results and the ability for other researchers to build on these studies. An increasing number of researchers are motivated to make their data available, but many challenges remain in order to effectively share and reuse simulation data. To discuss these and other issues related to best practices in the field in general, we organized a workshop in November 2018 (https://bioexcel.eu/events/workshop-on-sharing-data-from-molecular-simulations/). Here, we present a brief overview of this workshop and topics discussed. We hope this effort will spark further conversation in the MD community to pave the way toward more open, interoperable, and reproducible outputs coming from research studies using MD simulations.

  • 2.
    Alexeyenko, Andrey
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Schmitt, Thomas
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Tjärnberg, Andreas
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Guala, Dimitri
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Frings, Oliver
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Comparative interactomics with FunCoup 2.0. 2012. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 40, no. D1, pp. D821-D828. Journal article (Refereed)
    Abstract [en]

    FunCoup (http://FunCoup.sbc.su.se) is a database that maintains and visualizes global gene/protein networks of functional coupling that have been constructed by Bayesian integration of diverse high-throughput data. FunCoup achieves high coverage by orthology-based integration of data sources from different model organisms and from different platforms. We here present release 2.0 in which the data sources have been updated and the methodology has been refined. It contains a new data type Genetic Interaction, and three new species: chicken, dog and zebra fish. As FunCoup extensively transfers functional coupling information between species, the new input datasets have considerably improved both coverage and quality of the networks. The number of high-confidence network links has increased dramatically. For instance, the human network has more than eight times as many links above confidence 0.5 as the previous release. FunCoup provides facilities for analysing the conservation of subnetworks in multiple species. We here explain how to do comparative interactomics on the FunCoup website.

  • 3. Allison, Timothy M.
    Degiacomi, Matteo T.
    Marklund, Erik G.
    Jovine, Luca
    Elofsson, Arne
    Stockholms universitet, Science for Life Laboratory (SciLifeLab). Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Benesch, Justin L. P.
    Landreh, Michael
    Complementing machine learning-based structure predictions with native mass spectrometry. 2022. In: Protein Science, ISSN 0961-8368, E-ISSN 1469-896X, Vol. 31, no. 6, article ID e4333. Journal article (Refereed)
    Abstract [en]

    The advent of machine learning-based structure prediction algorithms such as AlphaFold2 (AF2) and RoseTTAFold has moved the generation of accurate structural models for the entire cellular protein machinery into the reach of the scientific community. However, structure predictions of protein complexes are based on user-provided input and may require experimental validation. Mass spectrometry (MS) is a versatile, time-effective tool that provides information on post-translational modifications, ligand interactions, conformational changes, and higher-order oligomerization. Using three protein systems, we show that native MS experiments can uncover structural features of ligand interactions, homology models, and point mutations that are undetectable by AF2 alone. We conclude that machine learning can be complemented with MS to yield more accurate structural models on a small and large scale.

  • 4. Berglund, Ann-Charlotte
    Sjölund, Erik
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Östlund, Gabriel
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    InParanoid 6: eukaryotic ortholog clusters with inparalogs. 2008. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 36, pp. D263-D266. Journal article (Refereed)
    Abstract [en]

    The InParanoid eukaryotic ortholog database (http://InParanoid.sbc.su.se/) has been updated to version 6 and is now based on 35 species. We collected all available 'complete' eukaryotic proteomes and Escherichia coli, and calculated ortholog groups for all 595 species pairs using the InParanoid program. This resulted in 2 642 187 pairwise ortholog groups in total. The orthology-based species relations are presented in an orthophylogram. InParanoid clusters contain one or more orthologs from each of the two species. Multiple orthologs in the same species, i.e. inparalogs, result from gene duplications after the species divergence. A new InParanoid website has been developed which is optimized for speed both for users and for updating the system. The XML output format has been improved for efficient processing of the InParanoid ortholog clusters.

  • 5. Bidkhori, Gholamreza
    Narimani, Zahra
    Hosseini Ashtiani, Saman
    Moeini, Ali
    Nowzari-Dalini, Abbas
    Masoudi-Nejad, Ali
    Reconstruction of an Integrated Genome-Scale Co-Expression Network Reveals Key Modules Involved in Lung Adenocarcinoma. 2013. In: PLOS ONE, E-ISSN 1932-6203, Vol. 8, no. 7, pp. e67552-e67552. Journal article (Other academic)
  • 6.
    Björklund, Åsa
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Creation of new proteins - domain rearrangements and tandem duplications. 2010. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Proteins are modular entities with domains as their building blocks. The domains are recurrent protein fragments with a distinct structure, function and evolutionary history. During evolution, proteins with new functions have been invented through rearrangements as well as differentiation of domains. The focus of this thesis is to gain better understanding of the processes that govern domain rearrangements. In particular, the rearrangements that create long protein domain repeats have been investigated in detail.

    We estimate that about 65% of the eukaryotic and 40% of the prokaryotic proteins are of the multidomain type. Further, we find that the eukaryotic multidomain proteins are mainly created through insertion of a single domain at the N- or C-terminus. However, domain repeats differ from other domain rearrangements in the aspect that they are created from internal tandem duplications. We show that such duplications often involve several domains simultaneously, and that different repeated domain families show distinct evolutionary patterns. Finally, we have investigated how large repeat regions are created using a specific example; the Actin binding nebulin domain. The analysis reveals several tandem duplications of both single nebulin domains and super repeats of seven nebulins in a number of vertebrates. We see that the duplication breakpoints vary between the species and that multiple duplications of the same region are common.

  • 7.
    Björklund, Åsa K.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Light, Sara
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Sagit, Rauan
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Elofsson, Arne
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Nebulin: A Study of Protein Repeat Evolution. 2010. In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 402, no. 1, pp. 38-51. Journal article (Refereed)
    Abstract [en]

    Protein domain repeats are common in proteins that are central to the organization of a cell, in particular in eukaryotes. They are known to evolve through internal tandem duplications. However, the understanding of the underlying mechanisms is incomplete. To shed light on repeat expansion mechanisms, we have studied the evolution of the muscle protein Nebulin, a protein that contains a large number of actin-binding nebulin domains. Nebulin proteins have evolved from an invertebrate precursor containing two nebulin domains. Repeat regions have expanded through duplications of single domains, as well as duplications of a super repeat (SR) consisting of seven nebulins. We show that the SR has evolved independently into large regions in at least three instances: twice in the invertebrate Branchiostoma floridae and once in vertebrates. In-depth analysis reveals several recent tandem duplications in the Nebulin gene. The events involve both single-domain and multidomain SR units or several SR units. There are single events, but frequently the same unit is duplicated multiple times. For instance, an ancestor of human and chimpanzee underwent two tandem duplications. The duplication junction coincides with an Alu transposon, thus suggesting duplication through Alu-mediated homologous recombination. Duplications in the SR region consistently involve multiples of seven domains. However, the exact unit that is duplicated varies both between species and within species. Thus, multiple tandem duplications of the same motif did not create the large Nebulin protein. Finally, analysis of segmental duplications in the human genome reveals that duplications are more common in genes containing domain repeats than in those coding for nonrepeated proteins. In fact, segmental duplications are found three to six times more often in long repeated genes than expected by chance. 

  • 8.
    Bryant, Patrick
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Learning Protein Evolution and Structure. 2022. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    By analysing the structure of a protein it is possible to draw conclusions about its function. Obtaining the structure of a protein experimentally is, however, a time-consuming and expensive process. By using evolution it is possible to infer the structure of a protein. AlphaFold2 (AF), the latest AI technology for protein structure prediction, uses evolutionary information to obtain protein structures in minutes instead of years at a fraction of the experimental cost. Here, we develop this technology further to predict the structure of interacting proteins. We create a confidence score, pDockQ, and show that this score rivals high-throughput experiments in distinguishing true and false protein-protein interactions (PPIs). Applying AF and the pDockQ score to a set of 65484 human PPIs we identify 1371 new high-confidence models. These models expand the structural knowledge of human protein complexes and can be used, for example, to develop new drugs or evaluate biological pathways. One limitation of AF is that the accuracy decreases with the number of proteins being predicted together and that the biggest protein complexes do not fit in the memory of the latest GPUs. To circumvent these issues, we predict subcomponents of protein complexes and assemble these together with Monte Carlo tree search (MCTS). MCTS enables assembling some of the largest protein complexes using only sequence information and stoichiometry. Out of 175 protein complexes with 10-30 chains, 91 can be completely assembled with a median TM-score of 0.51. A third of these (30 complexes) are highly accurate (TM-score ≥0.8). The use of highly accurate protein structure prediction is revolutionising many fields of biological research only one year after its realisation. Likely, this is only the beginning of a new era: the era of AI.

  • 9.
    Bryant, Patrick
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search. Manuscript (preprint) (Other academic)
    Abstract [en]

    AlphaFold can predict the structure of single- and multiple-chain proteins with very high accuracy. However, the accuracy decreases with the number of chains, and the available GPU memory limits the size of protein complexes which can be predicted. Here we show that one can predict the structure of large complexes starting from predictions of subcomponents. We assemble 91 out of 175 complexes with 10-30 chains from predicted subcomponents using Monte Carlo tree search, with a median TM-score of 0.51. There are 30 highly accurate complexes (TM-score ≥0.8, 33% of complete assemblies). We create a scoring function, mpDockQ, that can distinguish if assemblies are complete and predict their accuracy. We find that complexes containing symmetry are accurately assembled, while asymmetrical complexes remain challenging. The method is freely available and accessible as a Colab notebook: https://colab.research.google.com/github/patrickbryant1/MoLPC/blob/master/MoLPC.ipynb.

  • 10. Burke, David F.
    Bryant, Patrick
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Barrio-Hernandez, Inigo
    Memon, Danish
    Pozzati, Gabriele
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Shenoy, Aditi
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Zhu, Wensi
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Dunham, Alistair S.
    Albanese, Pascal
    Keller, Andrew
    Scheltema, Richard A.
    Bruce, James E.
    Leitner, Alexander
    Kundrotas, Petras
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab). The University of Kansas, Lawrence, USA.
    Beltrao, Pedro
    Elofsson, Arne
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Towards a structurally resolved human protein interaction network. 2023. In: Nature Structural & Molecular Biology, ISSN 1545-9993, E-ISSN 1545-9985, Vol. 30, no. 2, pp. 216-225. Journal article (Refereed)
    Abstract [en]

    Cellular functions are governed by molecular machines that assemble through protein-protein interactions. Their atomic details are critical to studying their molecular mechanisms. However, fewer than 5% of hundreds of thousands of human protein interactions have been structurally characterized. Here we test the potential and limitations of recent progress in deep-learning methods using AlphaFold2 to predict structures for 65,484 human protein interactions. We show that experiments can orthogonally confirm higher-confidence models. We identify 3,137 high-confidence models, of which 1,371 have no homology to a known structure. We identify interface residues harboring disease mutations, suggesting potential mechanisms for pathogenic variants. Groups of interface phosphorylation sites show patterns of co-regulation across conditions, suggestive of coordinated tuning of multiple protein interactions as signaling responses. Finally, we provide examples of how the predicted binary complexes can be used to build larger assemblies helping to expand our understanding of human cell biology.

  • 11.
    Castresana Aguirre, Miguel
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    From networks to pathway analysis. 2021. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Biological mechanisms stem from complex intracellular interactions spanning across different levels of regulation. Mapping these interactions is fundamental for the understanding of all types of biological conditions, including complex diseases. Each experimental approach carries its own bias and noise. Combining heterogeneous data sources reduces noise and gives a broader sense of the interactions between genes known as functional association, where both direct and indirect interactions are captured.

    FunCoup is one of the most comprehensive functional association databases, providing networks for 22 organisms in all domains of life. FunCoup uses a naïve Bayesian integration approach to combine 11 different data types and increases the coverage by transferring associations between species via orthologs. Additional insights into the mechanisms of a gene network are provided through tissue specificity filtering and directed regulatory links.

    Even though FunCoup provides a comprehensive map of the intracellular machinery, gaining insights into conditions such as diseases requires a functional level analysis rather than a gene level analysis. Thus, studying genes that are involved in a condition from a functional perspective requires the usage of pathway enrichment analysis. Several approaches exist, from basic gene overlap to more elaborate analyses that use functional association networks. ANUBIX is a novel network-based analysis (NBA) method that overcomes the high false positive rate issue that previous state-of-the-art NBA approaches have. Additionally, even with accurate methods, a commonly ignored problem is that gene sets derived from experiments are often noisy or contain multiple mechanisms, mixing different pathways which weakens their association to the condition under study. To increase the sensitivity of pathway analysis, we developed a pipeline to cluster gene sets into more homogeneous parts with the aim of unraveling all the mechanisms activated in the studied condition. To facilitate the usage of these tools, we built a web server called PathBIX, a user-friendly platform that allows interactive analysis of all species in FunCoup against multiple pathway databases.

  • 12.
    Castresana Aguirre, Miguel
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Guala, Dimitri
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Sonnhammer, Erik
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Clustered Pathway Analysis. Manuscript (preprint) (Other academic)
    Abstract [en]

    Motivation: Functional analysis of gene sets derived from experiments is typically done by pathway annotation. Although many algorithms exist for analyzing the association between a gene set and a pathway, an issue which is generally ignored is that gene sets often represent multiple pathways. In such cases an association to a pathway is weakened by the presence of genes associated with other pathways. A way to counteract this is to cluster the gene set into more homogeneous parts before performing pathway analysis on each cluster.

    Results: We explored whether network-based pre-clustering of a query gene set can improve pathway analysis. The methods MCL, Infomap, and MGclus were used to cluster the gene set projected onto the FunCoup network. We characterized how well these methods are able to detect individual pathways in multi-pathway gene sets, and applied each of the clustering methods in combination with four pathway analysis methods: Gene Enrichment Analysis, BinoX, NEAT, and ANUBIX. Using benchmarks constructed from the KEGG pathway database we found that clustering substantially increased the sensitivity of pathway analysis methods. For ANUBIX this came with almost no loss of specificity, while for BinoX and NEAT the specificity decreased roughly as much as the sensitivity increased. GEA had very low sensitivity both before and after clustering. The choice of clustering method only had a minor effect on the results. We conclude that clustering can improve overall pathway annotation performance, but only if the used enrichment method has a low false positive rate. 

    Availability and Implementation: https://bitbucket.org/sonnhammergroup/clustering-and-pathway-enrichment/

  • 13.
    Castresana-Aguirre, Miguel
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Persson, Emma
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    PathBIX - a web server for network-based pathway annotation with adaptive null models. 2021. In: Bioinformatics Advances, E-ISSN 2635-0041, Vol. 1, no. 1, article ID vbab010. Journal article (Refereed)
    Abstract [en]

    Motivation: Pathway annotation is a vital tool for interpreting and giving meaning to experimental data in life sciences. Numerous tools exist for this task, where the most recent generation of pathway enrichment analysis tools, network-based methods, utilize biological networks to gain a richer source of information as a basis of the analysis than merely the gene content. Network-based methods use the network crosstalk between the query gene set and the genes in known pathways, and compare this to a null model of random expectation.

    Results: We developed PathBIX, a novel web application for network-based pathway analysis, based on the recently published ANUBIX algorithm which has been shown to be more accurate than previous network-based methods. The PathBIX website performs pathway annotation for 21 species, and utilizes prefetched and preprocessed network data from FunCoup 5.0 networks and pathway data from three databases: KEGG, Reactome, and WikiPathways.

  • 14. Cheng, Jianlin
    Choe, Myong‐Ho
    Elofsson, Arne
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Han, Kun-Sop
    Hou, Jie
    Maghrabi, Ali H. A.
    McGuffin, Liam J.
    Menéndez-Hurtado, David
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Olechnovič, Kliment
    Schwede, Torsten
    Studer, Gabriel
    Uziela, Karolis
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Venclovas, Česlovas
    Wallner, Björn
    Estimation of model accuracy in CASP13. 2019. In: Proteins: Structure, Function, and Bioinformatics, ISSN 0887-3585, E-ISSN 1097-0134, Vol. 87, no. 12, pp. 1361-1377. Journal article (Refereed)
    Abstract [en]

    Methods to reliably estimate the accuracy of 3D models of proteins are both a fundamental part of most protein folding pipelines and important for reliable identification of the best models when multiple pipelines are used. Here, we describe the progress made from CASP12 to CASP13 in the field of estimation of model accuracy (EMA) as seen from the progress of the most successful methods in CASP13. We show small but clear progress, that is, several methods perform better than the best methods from CASP12 when tested on CASP13 EMA targets. Some progress is driven by applying deep learning and residue‐residue contacts to model accuracy prediction. We show that the best EMA methods select better models than the best servers in CASP13, but that there exists a great potential to improve this further. Also, according to the evaluation criteria based on local similarities, such as lDDT and CAD, it is now clear that single model accuracy methods perform relatively better than consensus‐based methods.

  • 15. Chicharro, Daniel
    Ledberg, Anders
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Centrum för socialvetenskaplig alkohol- och drogforskning (SoRAD). Universitat Pompeu Fabra, Spain.
    Framework to study dynamic dependencies in networks of interacting processes. 2012. In: Physical Review E. Statistical, Nonlinear, and Soft Matter Physics, ISSN 1539-3755, E-ISSN 1550-2376, Vol. 86, no. 4, article ID 041901. Journal article (Refereed)
    Abstract [en]

    The analysis of dynamic dependencies in complex systems such as the brain helps to understand how emerging properties arise from interactions. Here we propose an information-theoretic framework to analyze the dynamic dependencies in multivariate time-evolving systems. This framework constitutes a fully multivariate extension and unification of previous approaches based on bivariate or conditional mutual information and Granger causality or transfer entropy. We define multi-information measures that allow us to study the global statistical structure of the system as a whole, the total dependence between subsystems, and the temporal statistical structure of each subsystem. We develop a stationary and a nonstationary formulation of the framework. We then examine different decompositions of these multi-information measures. The transfer entropy naturally appears as a term in some of these decompositions. This allows us to examine its properties not as an isolated measure of interdependence but in the context of the complete framework. More generally we use causal graphs to study the specificity and sensitivity of all the measures appearing in these decompositions to different sources of statistical dependence arising from the causal connections between the subsystems. We illustrate that there is no straightforward relation between the strength of specific connections and specific terms in the decompositions. Furthermore, causal and noncausal statistical dependencies are not separable. In particular, the transfer entropy can be nonmonotonic in dependence on the connectivity strength between subsystems and is also sensitive to internal changes of the subsystems, so it should not be interpreted as a measure of connectivity strength. 
    Altogether, in comparison to an analysis based on single isolated measures of interdependence, this framework is more powerful to analyze emergent properties in multivariate systems and to characterize functionally relevant changes in the dynamics.

  • 16.
    Colding, Johan
    Stockholms universitet.
    Local institutions, biological conservation and management of ecosystem dynamics. 2001. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    This thesis analyzes local institutions and management practices related to natural resources and ecosystem dynamics, with an emphasis on "traditional ecological knowledge" systems. Papers I, II and III analyze ‘resource and habitat taboos’ (RHTs) with the objective to synthesize knowledge about informal institutions behind resource management. Papers IV and V focus on resource management practices and social mechanisms with a capacity to confer resilience in ecosystems. Ecological resilience is the buffering capacity of ecosystems to incorporate disturbance and yet continue to provide biodiversity and ecological services critical to societal development. Cases for the synthesis were mainly derived from the literature. Examples of RHTs could be grouped in six different categories depending on their potential management and conservation functions. These included both use-taboos and non-use taboos. The former regulates access to, and methods and withdrawal of subsistence resources. These appear to be closely related to traditional ecological knowledge, as it is defined in this thesis. The latter prohibits human use of species and habitats, and is closely related to religious and cosmological belief systems. As discussed, both groups of taboos can be comparable to ethics of academic conservation biology, although rationales behind such ethics differ. RHTs have effects that may contribute to the conservation of habitats, local subsistence resources, and ‘threatened’, ‘endemic’ and ‘keystone’ species, although some may run contrary to conservation and notions of sustainability. It is asserted that under certain circumstances, RHTs, and possibly other types of informal institutions may offer advantages relative to formal measures of conservation. These benefits include non-costly, voluntary compliance features. Results of papers IV and V revealed that there exists a diversity of traditional practices for ecosystem management.
These include multiple species management, resource rotation, ecological monitoring, succession management, landscape patchiness management, and practices of responding to and managing pulses and ecological surprises. Social mechanisms behind these practices included a number of adaptations for the generation, accumulation, and transmission of knowledge; dynamics of institutions; mechanisms for cultural internalization of traditional practices; and the development of appropriate world views and cultural values. These traditional systems had certain similarities to adaptive management with its emphasis on feedback learning, and its treatment of uncertainty and unpredictability to ecosystems. Furthermore, there existed practices that seem to reduce social-ecological crises in the events of large-scale natural disturbance. These included practices that create small-scale ecosystem renewal cycles, practices that spread risks, and practices for nurturing sources of ecosystem renewal. These practices are linked to social mechanisms such as flexible user rights and land tenure. It is concluded that ecological monitoring appears to be a key element in the development of many of the practices. Management practices in local communities are framed by a social context, with informal institutions and other social mechanisms, and supported by a worldview that does not de-couple people from their dependence on natural systems. Since management of ecosystems is associated with uncertainty about their spatial and temporal dynamics and due to incomplete knowledge about such dynamics, these practices may provide useful ‘rules of thumb’ for resource management with an ability to confer resilience and tighten environmental feedbacks of resource exploitation to local levels. To link local institutions in cross-scale polycentric co-management arrangements may be a viable option for improving current resource management systems.

  • 17. Corcoran, Martin M.
    et al.
    Phad, Ganesh E.
    Bernat, Nestor Vazquez
    Stahl-Hennig, Christiane
    Sumida, Noriyuki
    Persson, Mats A. A.
    Martin, Marcel
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Hedestam, Gunilla B. Karlsson
    Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity. 2016. In: nature communications, ISSN 2041-1723, Vol. 7, article id 13642. Article in journal (Refereed)
    Abstract [en]

    Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, result in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross-library comparisons, and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool.

  • 18.
    Daume, Stefan
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Stockholm Resilience Centre. Georg-August-University Göttingen, Germany; Swedish Museum of Natural History, Sweden.
    Galaz, Victor
    Stockholms universitet, Naturvetenskapliga fakulteten, Stockholm Resilience Centre.
    Anyone Know What Species This Is? - Twitter Conversations as Embryonic Citizen Science Communities. 2016. In: PLOS ONE, E-ISSN 1932-6203, Vol. 11, no. 3, article id e0151387. Article in journal (Refereed)
    Abstract [en]

    Social media like blogs, micro-blogs or social networks are increasingly being investigated and employed to detect and predict trends for not only social and physical phenomena, but also to capture environmental information. Here we argue that opportunistic biodiversity observations published through Twitter represent one promising and until now unexplored example of such data mining. As we elaborate, it can contribute to real-time information to traditional ecological monitoring programmes including those sourced via citizen science activities. Using Twitter data collected for a generic assessment of social media data in ecological monitoring we investigated a sample of what we denote biodiversity observations with species determination requests (N = 191). These entail images posted as messages on the micro-blog service Twitter. As we show, these frequently trigger conversations leading to taxonomic determinations of those observations. All analysed Tweets were posted with species determination requests, which generated replies for 64% of Tweets, 86% of those contained at least one suggested determination, of which 76% were assessed as correct. All posted observations included or linked to images with the overall image quality categorised as satisfactory or better for 81% of the sample and leading to taxonomic determinations at the species level in 71% of provided determinations. We claim that the original message authors and conversation participants can be viewed as implicit or embryonic citizen science communities which have to offer valuable contributions both as an opportunistic data source in ecological monitoring as well as potential active contributors to citizen science programmes.

  • 19. Didion, John P.
    et al.
    Martin, Marcel
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Collins, Francis S.
    Atropos: specific, sensitive, and speedy trimming of sequencing reads. 2017. In: PeerJ, E-ISSN 2167-8359, Vol. 5, article id e3720. Article in journal (Refereed)
    Abstract [en]

    A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves significant increases in trimming accuracy while remaining competitive in execution times. Furthermore, Atropos maintains high accuracy even when trimming data with elevated rates of sequencing errors. The accuracy, high performance, and broad feature set offered by Atropos make it an appropriate choice for the pre-processing of Illumina, ABI SOLiD, and other current-generation short-read sequencing datasets. Atropos is open source and free software written in Python (3.3+) and available at https://github.com/jdidion/atropos.

  • 20. Ekim, Baris
    et al.
    Sahlin, Kristoffer
    Stockholms universitet, Naturvetenskapliga fakulteten, Matematiska institutionen. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Medvedev, Paul
    Berger, Bonnie
    Chikhi, Rayan
    Efficient mapping of accurate long reads in minimizer space with mapquik. 2023. In: Genome Research, ISSN 1088-9051, E-ISSN 1549-5469, Vol. 33, no. 7, pp. 1188-1197. Article in journal (Refereed)
    Abstract [en]

    DNA sequencing data continue to progress toward longer reads with increasingly lower sequencing error rates. We focus on the critical problem of mapping, or aligning, low-divergence sequences from long reads (e.g., Pacific Biosciences [PacBio] HiFi) to a reference genome, which poses challenges in terms of accuracy and computational resources when using cutting-edge read mapping approaches that are designed for all types of alignments. A natural idea would be to optimize efficiency with longer seeds to reduce the probability of extraneous matches; however, contiguous exact seeds quickly reach a sensitivity limit. We introduce mapquik, a novel strategy that creates accurate longer seeds by anchoring alignments through matches of k consecutively sampled minimizers (k-min-mers) and only indexing k-min-mers that occur once in the reference genome, thereby unlocking ultrafast mapping while retaining high sensitivity. We show that mapquik significantly accelerates the seeding and chaining steps (fundamental bottlenecks to read mapping) for both the human and maize genomes with >96% sensitivity and near-perfect specificity. On the human genome, for both real and simulated reads, mapquik achieves a 37x speedup over the state-of-the-art tool minimap2, and on the maize genome, mapquik achieves a 410x speedup over minimap2, making mapquik the fastest mapper to date. These accelerations are enabled not only by minimizer-space seeding but also by a novel heuristic O(n) pseudochaining algorithm, which improves upon the long-standing O(n log n) bound. Minimizer-space computation builds the foundation for achieving real-time analysis of long-read sequencing data.
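
    The seeding idea in this abstract (anchors built from k consecutive minimizers, with only reference-unique k-min-mers indexed) can be illustrated with a toy sketch. This is not mapquik's implementation: the function names are hypothetical, the parameters `k`, `l` and `w` are illustrative, and classic window minimizers under lexicographic order are used as a stand-in for whatever sampling scheme the tool itself applies.

```python
from collections import Counter


def minimizers(seq, l=3, w=4):
    """Window minimizers: the leftmost lexicographically smallest l-mer
    in each window of w consecutive l-mers (a stand-in sampling scheme)."""
    lmers = [seq[i:i + l] for i in range(len(seq) - l + 1)]
    picked = []
    for start in range(len(lmers) - w + 1):
        window = lmers[start:start + w]
        offset = min(range(w), key=lambda j: (window[j], j))
        pos = start + offset
        # Consecutive windows often share a minimizer; record it once.
        if not picked or picked[-1][0] != pos:
            picked.append((pos, lmers[pos]))
    return picked


def k_min_mers(seq, k=2, l=3, w=4):
    """Each run of k consecutive minimizers, keyed by the position of the first."""
    mins = minimizers(seq, l, w)
    return [(mins[i][0], tuple(m for _, m in mins[i:i + k]))
            for i in range(len(mins) - k + 1)]


def unique_index(reference, k=2, l=3, w=4):
    """Index only the k-min-mers that occur exactly once in the reference."""
    kmms = k_min_mers(reference, k, l, w)
    counts = Counter(key for _, key in kmms)
    return {key: pos for pos, key in kmms if counts[key] == 1}


def anchor_offsets(read, index, k=2, l=3, w=4):
    """Reference offsets implied by the read's k-min-mers found in the index."""
    return {index[key] - pos
            for pos, key in k_min_mers(read, k, l, w) if key in index}
```

    For an error-free read taken from the reference, every anchor found this way implies the same offset, namely the read's start position. Repetitive reference regions contribute no index entries at all.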

  • 21.
    Forslund, Kristoffer
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    The relationship between orthology, protein domain architecture and protein function. 2011. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Lacking experimental data, protein function is often predicted from evolutionary and protein structure theory. Under the 'domain grammar' hypothesis the function of a protein follows from the domains it encodes. Under the 'orthology conjecture', orthologs, related through species formation, are expected to be more functionally similar than paralogs, which are homologs in the same or different species descended from a gene duplication event. However, these assumptions have not thus far been systematically evaluated.

    To test the 'domain grammar' hypothesis, we built models for predicting function from the domain combinations present in a protein, and demonstrated that multi-domain combinations imply functions that the individual domains do not. We also developed a novel gene-tree based method for reconstructing the evolutionary histories of domain architectures, to search for cases of architectures that have arisen multiple times in parallel, and found this to be more common than previously reported.

    To test the 'orthology conjecture', we first benchmarked methods for homology inference under the obfuscating influence of low-complexity regions, in order to improve the InParanoid orthology inference algorithm. InParanoid was then used to test the relative conservation of functionally relevant properties between orthologs and paralogs at various evolutionary distances, including intron positions, domain architectures, and Gene Ontology functional annotations.

    We found an increased conservation of domain architectures in orthologs relative to paralogs, in support of the 'orthology conjecture' and the 'domain grammar' hypotheses acting in tandem. However, equivalent analysis of Gene Ontology functional conservation yielded spurious results, which may be an artifact of species-specific annotation biases in functional annotation databases. I discuss possible ways of circumventing this bias so the 'orthology conjecture' can be tested more conclusively.

  • 22.
    Forslund, Kristoffer
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Henricson, Anna
    Hollich, Volker
    Sonnhammer, Erik L.L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Domain tree-based analysis of protein architecture evolution. 2008. In: Molecular biology and evolution, ISSN 0737-4038, E-ISSN 1537-1719, Vol. 25, no. 2, pp. 254-264. Article in journal (Refereed)
    Abstract [en]

    Understanding the dynamics behind domain architecture evolution is of great importance to unravel the functions of proteins. Complex architectures have been created throughout evolution by rearrangement and duplication events. An interesting question is how many times a particular architecture has been created, a form of convergent evolution or domain architecture reinvention. Previous studies have approached this issue by comparing architectures found in different species. We wanted to achieve a finer-grained analysis by reconstructing protein architectures on complete domain trees. The prevalence of domain architecture reinvention in 96 genomes was investigated with a novel domain tree-based method that uses maximum parsimony for inferring ancestral protein architectures. Domain architectures were taken from Pfam. To ensure robustness, we applied the method to bootstrap trees and only considered results with strong statistical support. We detected multiple origins for 12.4% of the scored architectures. In a much smaller data set, the subset of completely domain-assigned proteins, the figure was 5.6%. These results indicate that domain architecture reinvention is a much more common phenomenon than previously thought. We also determined which domains are most frequent in multiply created architectures and assessed whether specific functions could be attributed to them. However, no strong functional bias was found in architectures with multiple origins.

  • 23.
    Forslund, Kristoffer
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Pekkari, Isabella
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Domain architecture conservation in orthologs. 2011. In: BMC Bioinformatics, E-ISSN 1471-2105, Vol. 12, pp. 326-. Article in journal (Refereed)
    Abstract [en]

    Background. As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs.

    Results. The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent.

    Conclusions. On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance.

  • 24.
    Forslund, Kristoffer
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Swedish e-Science Research Center.
    Evolution of Protein Domain Architectures. 2012. In: Evolutionary Genomics: Statistical and Computational Methods, Vol 2 / [ed] Anisimova, M, Totowa, NJ: Humana Press, 2012, pp. 187-216. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    This chapter reviews the current research on how protein domain architectures evolve. We begin by summarizing work on the phylogenetic distribution of proteins, as this directly impacts which domain architectures can be formed in different species. Studies relating domain family size to occurrence have shown that they generally follow power law distributions, both within genomes and larger evolutionary groups. These findings were subsequently extended to multidomain architectures. Genome evolution models that have been suggested to explain the shape of these distributions are reviewed, as well as evidence for selective pressure to expand certain domain families more than others. Each domain has an intrinsic combinatorial propensity, and the effects of this have been studied using measures of domain versatility or promiscuity. Next, we study the principles of protein domain architecture evolution and how these have been inferred from distributions of extant domain arrangements. Following this, we review inferences of ancestral domain architecture and the conclusions concerning domain architecture evolution mechanisms that can be drawn from these. Finally, we examine whether all known cases of a given domain architecture can be assumed to have a single common origin (monophyly) or have evolved convergently (polyphyly).

  • 25.
    Forslund, Kristoffer
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Sonnhammer, Erik L.L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Predicting protein function from domain content. 2008. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no. 15, pp. 1681-1687. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Computational assignment of protein function may be the single most vital application of bioinformatics in the post-genome era. These assignments are made based on various protein features, where one is the presence of identifiable domains. The relationship between protein domain content and function is important to investigate, to understand how domain combinations encode complex functions.

    RESULTS: Two different models are presented on how protein domain combinations yield specific functions: one rule-based and one probabilistic. We demonstrate how these are useful for Gene Ontology annotation transfer. The first is an intuitive generalization of the Pfam2GO mapping, and detects cases of strict functional implications of sets of domains. The second uses a probabilistic model to represent the relationship between domain content and annotation terms, and was found to be better suited for incomplete training sets. We implemented these models as predictors of Gene Ontology functional annotation terms. Both predictors were more accurate than conventional best BLAST-hit annotation transfer and more sensitive than a single-domain model on a large-scale dataset. We present a number of cases where combinations of Pfam-A protein domains predict functional terms that do not follow from the individual domains.

    AVAILABILITY: Scripts and documentation are available for download at http://sonnhammer.sbc.su.se/multipfam2go_source_docs.tar

  • 26.
    Frings, Oliver
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Alexeyenko, Andrey
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    MGclus: network clustering employing shared neighbors. 2013. In: Molecular BioSystems, ISSN 1742-206X, Vol. 9, no. 7, pp. 1670-1675. Article in journal (Refereed)
    Abstract [en]

    Network analysis is an important tool for functional annotation of genes and proteins. A common approach to discern structure in a global network is to infer network clusters, or modules, and assume a functional coherence within each module, which may represent a complex or a pathway. It is however not trivial to define optimal modules. Although many methods have been proposed, it is unclear which methods perform best in general. It seems that most methods produce far from optimal results but in different ways. MGclus is a new algorithm designed to detect modules with a strongly interconnected neighborhood in large scale biological interaction networks. In our benchmarks we found MGclus to outperform other methods when applied to random graphs with varying degree of noise, and to perform equally or better when applied to biological protein interaction networks. MGclus is implemented in Java and utilizes the JGraphT graph library. It has an easy to use command-line interface and is available for download from http://sonnhammer.sbc.su.se/download/software/MGclus/.

  • 27.
    Frings, Oliver
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Mank, Judith E.
    Alexeyenko, Andrey
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Network Analysis of Functional Genomics Data: Application to Avian Sex-Biased Gene Expression. 2012. In: Scientific World Journal, E-ISSN 1537-744X, pp. 130491-. Article in journal (Refereed)
    Abstract [en]

    Gene expression analysis is often used to investigate the molecular and functional underpinnings of a phenotype. However, differential expression of individual genes is limited in that it does not consider how the genes interact with each other in networks. To address this shortcoming we propose a number of network-based analyses that give additional functional insights into the studied process. These were applied to a dataset of sex-specific gene expression in the chicken gonad and brain at different developmental stages. We first constructed a global chicken interaction network. Combining the network with the expression data showed that most sex-biased genes tend to have lower network connectivity, that is, act within local network environments, although some interesting exceptions were found. Genes of the same sex bias were generally more strongly connected with each other than expected. We further studied the fates of duplicated sex-biased genes and found that there is a significant trend to keep the same pattern of sex bias after duplication. We also identified sex-biased modules in the network, which reveal pathways or complexes involved in sex-specific processes. Altogether, this work integrates evolutionary genomics with systems biology in a novel way, offering new insights into the modular nature of sex-biased genes.

  • 28.
    Guala, Dimitri
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholm, Bioinformatics Center, Science for Life Laboratory.
    Functional association networks for disease gene prediction. 2017. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Mapping of the human genome has been instrumental in understanding diseases caused by changes in single genes. However, disease mechanisms involving multiple genes have proven to be much more elusive. Their complexity emerges from interactions of intracellular molecules and makes them immune to the traditional reductionist approach. Only by modelling this complex interaction pattern using networks is it possible to understand the emergent properties that give rise to diseases.

    The overarching term used to describe both physical and indirect interactions involved in the same functions is functional association. FunCoup is one of the most comprehensive networks of functional association. It uses a naïve Bayesian approach to integrate high-throughput experimental evidence of intracellular interactions in humans and multiple model organisms. In the first update, both the coverage and the quality of the interactions were increased and a feature for comparing interactions across species was added. The latest update involved a complete overhaul of all data sources, including a refinement of the training data and addition of new classes and sources of interactions as well as six new species.

    Disease-specific changes in genes can be identified using high-throughput genome-wide studies of patients and healthy individuals. To understand the underlying mechanisms that produce these changes, they can be mapped to collections of genes with known functions, such as pathways. BinoX was developed to map altered genes to pathways using the topology of FunCoup. This approach combined with a new random model for comparison enables BinoX to outperform traditional gene-overlap-based methods and other network-based techniques.

    Results from high-throughput experiments are challenged by noise and biases, resulting in many false positives. Statistical attempts to correct for these challenges have led to a reduction in coverage. Both limitations can be remedied using prioritisation tools such as MaxLink, which ranks genes using guilt by association in the context of a functional association network. MaxLink’s algorithm was generalised to work with any disease phenotype and its statistical foundation was strengthened. MaxLink’s predictions were validated experimentally using FRET.

    The availability of prioritisation tools without an appropriate way to compare them makes it difficult to select the correct tool for a problem domain. A benchmark to assess performance of prioritisation tools in terms of their ability to generalise to new data was developed. FunCoup was used for prioritisation while testing was done using cross-validation of terms derived from Gene Ontology. This resulted in a robust and unbiased benchmark for evaluation of current and future prioritisation tools. Surprisingly, previously superior tools based on global network structure were shown to be inferior to a local network-based tool when performance was analysed on the most relevant part of the output, i.e. the top ranked genes.

    This thesis demonstrates how a network that models the intricate biology of the cell can contribute with valuable insights for researchers that study diseases with complex genetic origins. The developed tools will help the research community to understand the underlying causes of such diseases and discover new treatment targets. The robust way to benchmark such tools will help researchers to select the proper tool for their problem domain.

  • 29.
    Guala, Dimitri
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Bernhem, Kristoffer
    Ait Blal, Hammou
    Lundberg, Emma
    Brismar, Hjalmar
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Experimental validation of predicted cancer genes using FRET. Manuscript (preprint) (Other academic)
  • 30.
    Guala, Dimitri
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholm Bioinformatics Centre, Sweden.
    Sjölund, Erik
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholm Bioinformatics Centre, Sweden.
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholm Bioinformatics Centre, Sweden; Swedish eScience Research Center, Sweden.
    MaxLink: network-based prioritization of genes tightly linked to a disease seed set. 2014. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 30, no. 18, pp. 2689-2690. Article in journal (Refereed)
    Abstract [en]

    Summary: MaxLink, a guilt-by-association network search algorithm, has been made available as a web resource and a stand-alone version. Based on a user-supplied list of query genes, MaxLink identifies and ranks genes that are tightly linked to the query list. This functionality can be used to predict potential disease genes from an initial set of genes with known association to a disease. The original algorithm, used to identify and rank novel genes potentially involved in cancer, has been updated to use a more statistically sound method for selection of candidate genes and made applicable to other areas than cancer. The algorithm has also been made faster by re-implementation in C++, and the Web site uses FunCoup 3.0 as the underlying network.
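
    The guilt-by-association idea in this abstract can be sketched as ranking every non-query gene by how many direct network links it has to the query set. This is a deliberately simplified illustration, not MaxLink's published procedure: the function name and the toy edge list are hypothetical, and the statistical candidate selection mentioned in the abstract is omitted.

```python
def rank_by_query_links(edges, query):
    """Rank non-query genes by their number of direct links to query genes.

    edges: iterable of (gene_a, gene_b) pairs from an undirected network.
    query: iterable of seed genes with known disease association.
    Returns (gene, link_count) pairs, most-linked first; ties break by name.
    """
    query = set(query)
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    ranked = []
    for gene, nbrs in neighbors.items():
        if gene in query:
            continue  # seeds are inputs, not candidates
        links = len(nbrs & query)
        if links > 0:  # keep only genes touching the query set
            ranked.append((gene, links))
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked
```

    A raw link count favours highly connected hub genes regardless of the query, which is one reason a real tool would add a statistical filter on top of it.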

  • 31.
    Guala, Dimitri
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    A large-scale benchmark of gene prioritization methods. 2017. In: Scientific Reports, E-ISSN 2045-2322, Vol. 7, article id 46598. Article in journal (Refereed)
    Abstract [en]

    In order to maximize the use of results from high-throughput experimental studies, e.g. GWAS, for identification and diagnostics of new disease-associated genes, it is important to have properly analyzed and benchmarked gene prioritization tools. While prospective benchmarks are underpowered to provide statistically significant results in their attempt to differentiate the performance of gene prioritization tools, a strategy for retrospective benchmarking has been missing, and new tools usually only provide internal validations. The Gene Ontology (GO) contains genes clustered around annotation terms. This intrinsic property of GO can be utilized in construction of robust benchmarks, objective to the problem domain. We demonstrate how this can be achieved for network-based gene prioritization tools, utilizing the FunCoup network. We use cross-validation and a set of appropriate performance measures to compare state-of-the-art gene prioritization algorithms: three based on network diffusion, NetRank and two implementations of Random Walk with Restart, and MaxLink that utilizes network neighborhood. Our benchmark suite provides a systematic and objective way to compare the multitude of available and future gene prioritization tools, enabling researchers to select the best gene prioritization tool for the task at hand, and helping to guide the development of more accurate methods.

  • 32.
    Hedlund, Johanna
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Stockholm Resilience Centre.
    Bodin, Örjan
    Stockholms universitet, Naturvetenskapliga fakulteten, Stockholm Resilience Centre.
    Nohrstedt, Daniel
    Policy issue interdependency and the formation of collaborative networks. 2021. In: People and Nature, E-ISSN 2575-8314, Vol. 3, no. 1, pp. 236-250. Journal article (Peer-reviewed)
    Abstract [en]

    1. Environmental problems often span a set of challenges that each may engage different policy actors across different policy domains. These challenges, or policy issues, nonetheless exhibit interdependencies that may constrain the ability of actors to work together towards joint solutions.

    2. Still, we have limited knowledge about whether and how policy issue interdependencies actually shape how actors collaborate.

    3. Using data derived from two venues for collaborative water governance in the Norrstrom basin, Sweden, we investigate whether and how policy issues and policy issue interdependencies influence actors' selection of collaborative partners. We test two alternative sets of propositions; one set assumes that partner selection is driven by actors' engagement in policy issues and their interdependencies, while the other set emphasises social positions and actor attributes.

    4. Our results show that in one venue, actors' choices of collaborative partner were associated with factors from both sets, but not with policy issue interdependencies specifically. In the other venue, only actor and relational attributes shaped social tie formation. These results suggest that how actors interact does not necessarily align with the policy issues and the policy issue interdependencies defined by the environmental problem they are to address.

    5. Our results provide an important step towards arriving at evidence-based recommendations for more effective collaborative efforts in addressing complex environmental problems that no actor can address alone.

  • 33.
    Hennerdal, Aron
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Elofsson, Arne
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Rapid membrane protein topology prediction. 2011. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 27, no. 9, pp. 1322-1323. Journal article (Peer-reviewed)
    Abstract [en]

    State-of-the-art methods for predicting the topology of α-helical membrane proteins rely on time-consuming multiple sequence alignments obtained from PSI-BLAST or other sources. Here, we examine whether a consensus of topology prediction methods based on single sequences can reach an accuracy similar to that of the more accurate multiple sequence-based methods. We show that TOPCONS-single performs better than any of the other topology prediction methods tested here, but ~6% worse than the best method that utilizes multiple sequence alignments. AVAILABILITY AND IMPLEMENTATION: TOPCONS-single is available as a web server from http://single.topcons.net/ and is also included for local installation from the web site. In addition, consensus-based topology predictions for the entire International Protein Index (IPI) are available from the web server and will be updated at regular intervals.

  • 34.
    Hennerdal, Aron
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Tsirigos, Konstantinos
    A guideline to α-helical membrane protein topology prediction. Manuscript (preprint) (Other academic)
    Abstract [en]

    All living organisms have a “membrane proteome” that mainly consists of α-helical membrane proteins containing one or more TM-helices. Prediction methods have been extensively used to identify as well as to classify the topology of these proteins. For current state-of-the-art methods, the prediction of correct topology of membrane proteins has been reported to be above 80%. However, this performance has only been observed in small and possibly biased datasets. Here, we add four “genome-scale” datasets, including a recent large set of experimentally validated membrane proteins with glycosylation sites. This set is also used to examine whether the qualities of topology predictions hold and if any prediction methods perform consistently better than others. We find that methods utilizing multiple sequence alignments are overall superior to methods that do not. The best performance is obtained by TOPCONS, a consensus method which combines several of the other prediction methods. Further, we show that the accuracy is most likely lower in eukaryotes than for prokaryotic proteins as the agreement between the predictors is significantly lower there. Finally, we show that three related methods, Phobius, Philius and PolyPhobius, that incorporate a specific signal peptide module are superior to all other methods at the task of distinguishing between membrane and non-membrane proteins.

  • 35.
    Henricson, Anna
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Forslund, Kristoffer
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Orthology confers intron position conservation. 2010. In: BMC Genomics, E-ISSN 1471-2164, Vol. 11:412. Journal article (Peer-reviewed)
    Abstract [en]

    Background: With the wealth of genomic data available it has become increasingly important to assign putative protein function through functional transfer between orthologs. Therefore, correct elucidation of the evolutionary relationships among genes is a critical task, and attempts should be made to further improve the phylogenetic inference by adding relevant discriminating features. It has been shown that introns can maintain their position over long evolutionary timescales. For this reason, it could be possible to use conservation of intron positions as a discriminating factor when assigning orthology. Therefore, we wanted to investigate whether orthologs have a higher degree of intron position conservation (IPC) compared to non-orthologous sequences that are equally similar in sequence.

    Results: To this end, we developed a new score for IPC and applied it to ortholog groups between human and six other species. For comparison, we also gathered the closest non-orthologs, meaning sequences close in sequence space, yet falling just outside the ortholog cluster. We found that ortholog-ortholog gene pairs on average have a significantly higher degree of IPC compared to ortholog-closest non-ortholog pairs. Also, pairs of inparalogs were found to have a higher IPC score than inparalog-closest non-inparalog pairs. We verified that these differences cannot simply be attributed to the generally higher sequence identity of the ortholog-ortholog and the inparalog-inparalog pairs. Furthermore, we analyzed the agreement between IPC score and the ortholog score assigned by the InParanoid algorithm, and found that it was consistently high for all species comparisons. In a minority of cases, the IPC and InParanoid score ranked inparalogs differently. These represent cases where sequence and intron position divergence are discordant. We further analyzed the discordant clusters to identify any possible preference for protein functions by looking for enriched GO terms and Pfam protein domains. They were enriched for functions important for multicellularity, which implies a connection between shifts in intronic structure and the origin of multicellularity.

    Conclusions: We conclude that orthologous genes tend to have more conserved intron positions compared to non-orthologous genes. As a consequence, our IPC score is useful as an additional discriminating factor when assigning orthology.

  • 36.
    Herman, Pawel Andrzej
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Numerisk analys och datalogi (NADA). Royal Institute of Technology, Sweden.
    Lundqvist, Mikael
    Stockholms universitet, Naturvetenskapliga fakulteten, Numerisk analys och datalogi (NADA). Royal Institute of Technology, Sweden.
    Lansner, Anders
    Stockholms universitet, Naturvetenskapliga fakulteten, Numerisk analys och datalogi (NADA). Royal Institute of Technology, Sweden.
    Nested theta to gamma oscillations and precise spatiotemporal firing during memory retrieval in a simulated attractor network. 2013. In: Brain Research, ISSN 0006-8993, E-ISSN 1872-6240, Vol. 1536, no. S1, pp. 68-87. Journal article (Peer-reviewed)
    Abstract [en]

    Nested oscillations, where the phase of the underlying slow rhythm modulates the power of faster oscillations, have recently attracted considerable research attention as the increased phase-coupling of cross-frequency oscillations has been shown to relate to memory processes. Here we investigate the hypothesis that reactivations of memory patterns, induced by either external stimuli or internal dynamics, are manifested as distributed cell assemblies oscillating at gamma-like frequencies with life-times on a theta scale. For this purpose, we study the spatiotemporal oscillatory dynamics of a previously developed meso-scale attractor network model as a correlate of its memory function. The focus is on a hierarchical nested organization of neural oscillations in delta/theta (2–5 Hz) and gamma frequency bands (25–35 Hz), and in some conditions even in lower alpha band (8–12 Hz), which emerge in the synthesized field potentials during attractor memory retrieval. We also examine spiking behavior of the network in close relation to oscillations. Despite highly irregular firing during memory retrieval and random connectivity within each cell assembly, we observe precise spatiotemporal firing patterns that repeat across memory activations at a rate higher than expected from random firing. In contrast to earlier studies aimed at modeling neural oscillations, our attractor memory network allows us to elaborate on the functional context of emerging rhythms and discuss their relevance. We provide support for the hypothesis that the dynamics of coherent delta/theta oscillations constitute an important aspect of the formation and replay of neuronal assemblies.

  • 37.
    Hillerton, Thomas
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    In silico modelling for refining gene regulatory network inference. 2023. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Gene regulation is at the centre of all cellular functions, regulating the cell's healthy and pathological responses. The interconnected system of regulatory interactions is known as the gene regulatory network (GRN), where genes influence each other to maintain strict and robust control. Today a large number of methods exist for inferring GRNs, which necessitates benchmarking to determine which method is most suitable for a specific goal. Paper I presents such a benchmark focusing on the effect of using known perturbations to infer GRNs. 

    A further challenge when studying GRNs is that experimental data contains high levels of noise and that artefacts may be introduced by the experiment itself. The LSCON method was developed in paper II to reduce the effect of one such artefact that can occur if the expression of a gene shows no or minimal change across most or all experiments. 

     With few fully determined biological GRNs available, it is problematic to use these to evaluate an inference method's correctness. Instead, the GRN field relies on simulated data, using a known GRN and generating the corresponding data. When simulating GRNs, capturing the topological properties of the biological GRN is vital. The FFLatt algorithm was developed in paper III to create scale-free, feed-forward loop motif-enriched GRNs, capturing two of the most prominent topological features in biological GRNs. 

     Once a high-quality GRN is obtained, the next step is to simulate gene expression data corresponding to the GRN. In paper IV, building on the FFLatt method, an open-source Python simulation tool called GeneSNAKE was developed to generate expression data for benchmarking purposes. GeneSNAKE allows the user to control a wide range of network and data properties and improves on previous tools by featuring a variety of perturbation schemes along with the ability to control noise and modify the perturbation strength.

  • 38.
    Hillerton, Thomas
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Zhivkoplias, Erik K.
    Stockholms universitet, Naturvetenskapliga fakulteten, Stockholm Resilience Centre.
    Garbulowski, Mateusz
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Sonnhammer, Erik L. L.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    GeneSNAKE: a Python package for benchmarking and simulation of gene regulatory networks and expression data. Manuscript (preprint) (Other academic)
    Abstract [en]

    Understanding how genes interact with and regulate each other is a key challenge in systems biology. One of the primary methods to study this is through gene regulatory networks (GRNs). The field of GRN inference, however, faces many challenges, such as the complexity of gene regulation and high noise levels, which necessitate effective tools for evaluating inference methods. For this purpose, data that corresponds to a known GRN, from various conditions and experimental setups, is necessary, which is only possible to attain via simulation. Existing tools for simulating data for GRN inference have limitations either in the way networks are constructed or data is produced, and are often not flexible for adjusting the algorithm or parameters.

    To overcome these issues we present GeneSNAKE, a Python package designed to allow users to generate biologically realistic GRNs, and from a GRN simulate expression data for benchmarking purposes. GeneSNAKE allows the user to control a wide range of network and data properties. GeneSNAKE improves on previous work in the field by adding a perturbation model that allows for a greater range of perturbation schemes along with the ability to control noise and modify the perturbation strength. 

    For benchmarking, GeneSNAKE offers a number of functions both for comparing a true GRN to an inferred GRN, and to study properties in data and GRN models. These functions can in addition be used to study properties of biological data to produce simulated data with more realistic properties.  GeneSNAKE is an open-source, comprehensive simulation and benchmarking package with powerful capabilities that are not combined in any other single package, and thanks to the Python implementation it is simple to extend and modify by a user.

  • 39.
    Hosseini Ashtiani, Saman
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Omics Data Analysis of Complex Diseases and Traits. 2022. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Following the advent of the high-throughput techniques for producing massive omics data, new possibilities and challenges have also emerged in different fields of biology and medicine. Dealing with such data on different scales with different scopes such as genomics, transcriptomics, proteomics and metabolomics, demands appropriate data collection, preprocessing, statistical analysis, interpretation and visualization. The overall goal of this thesis was to conceive omics-related questions in the context of four research titles and to apply a rational choice of the mentioned methods to conduct the study plans to answer them. 

    Paper I asks whether we could propose potentially implicated genes in psoriasis; and tries to answer it using microarray transcriptomics data of psoriasis. Initially, quality control was performed on the microarray dataset and then the Differentially Expressed Genes (DEGs) were chosen for mapping to a protein-protein interaction (PPI) database to create a subnetwork of the respective PPI. Using network analysis, genes with higher scores were proposed as potentially relevant to psoriasis and finally, we evaluated the results concerning a gene-disease association database. 

    Paper II asks whether the knockout of two genes followed by a transformation in E. coli could lead to an increase in bacterial growth in two different media; and deals with it through in vitro experiments followed by an in silico analysis of E. coli RNA-seq data. Here, we calculated the pairwise correlations between each target (knockout) gene and the rest of the genes in the RNA-seq dataset. Then, the significantly anti-correlated genes were shown to mainly belong to protein biosynthesis pathways compared to all other background pathways, which might indicate an increase in protein biosynthesis-related genes' transcription levels when there is an absolute decrease (knockout) in each of the target genes. 

    Paper III asks if an anti-bone-resorption drug called Denosumab significantly affects the abundance of the metabolites extracted from blood samples during a two-year longitudinal placebo-controlled clinical trial study; and tries to address this through running statistical hypothesis testing for each metabolite in the quantification data from Liquid Chromatography-Mass Spectrometry (LC-MS). Afterwards, the patterns of metabolites' variations concerning Denosumab administration and visit times were studied using Principal Component Analysis (PCA), association studies and Hierarchical clustering. The results of this study proposed some identified metabolites for further clinical investigations. Based on our analyses, the patterns of abundance variations in some of the identified metabolites could be considered for improving the corresponding clinical studies and treatment with Denosumab. 

    Paper IV proposes potentially relevant genes in lung adenocarcinoma by constructing a genome-scale co-expression network followed by clustering. The genes in each cluster were studied using the literature knowledge. One of the most frequently reported genes in lung adenocarcinoma was EGFR. We reported all the first-neighborhood genes connected to EGFR in its corresponding module as potentially relevant to lung adenocarcinoma.

    The repertoire of the above choices, workflows and evaluations could be applicable for further follow-up studies at different levels including omics data integration, personalized omics data analysis, studies on different scales such as cellular or tissue, using other methodologies for the same questions and running benchmarks. Although four different omics-related questions were posed in this thesis, they all involved the selection or preparation of the respective omics data, choosing preprocessing strategies, choosing statistical analyses and hypothesis testing methods and finally, performing the evaluation of the results and interpretations.

  • 40.
    Hosseini Ashtiani, Saman
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Razavipour, Roya
    Akhavan Sepahi, Abbas
    Modarresi, Mohammad Hossein
    Elofsson, Arne
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik.
    Bambai, Bijan
    FDH knockout and TsFDH transformation led to enhanced growth rate of Escherichia coli. Manuscript (preprint) (Other academic)
    Abstract [en]

    Background: 

    The rise of atmospheric CO2 to over 400 ppm has prompted global climate irregularities. Reducing the CO2 released from biotechnological processes could remediate these phenomena. In this study, we sought to reduce the CO2 released into the atmosphere from bacterial growth by reducing formic acid conversion into CO2. Since E. coli is the biotechnological workhorse and its higher growth rate is desirable, another goal was to monitor the bacterial biomass after the metabolic engineering.

    Results: 

    The biochemical conversion of formic acid to CO2 is a key reaction. Therefore, we compared the growth of control strains K12 and BL21, alongside two strains (in which two different genes coding two formate dehydrogenase (FDH) subunits were deleted) in complex and simple media. Our observations demonstrated that the knockout bacteria significantly grew more efficiently than the controls in both media. TsFDH, an FDH with moderately more catalytic efficiency, in contrast to other known FDHs for converting CO2 to formate, increased the growth of both knockouts compared with the controls and the knockouts without TsFDH. This difference was more accentuated in M9+Glycerol. Through a transcriptomics-level in silico analysis of the knockout genes, RNA-seq-based correlation outcome revealed that the genes negatively correlated with the target genes (knockout genes) belong to tRNA-related pathways. 

    Conclusion: 

    Observing higher cell biomass for the knockout and transformed strains at equal concentrations of carbon source in both media indicates possible underlying mechanisms leading to reduced carbon leakage and increased carbon assimilation, which need more detailed investigations. These results may also provide a phenotypic-level clue for the inconsistency of predictions in previous metabolic models that declared glycerol as a suitable carbon source for the growth of E. coli but failed to achieve it in practice. Gene expression correlations and pathway analysis outcomes suggested possible over-expression of the genes involved in tRNA processing and charging pathways. 

  • 41. Hu, Rui-Si
    et al.
    Zhang, Xiao-Xuan
    Ma, Qiao-Ni
    Elsheikha, Hany M.
    Ehsan, Muhammad
    Zhao, Quan
    Fromm, Bastian
    Stockholms universitet, Science for Life Laboratory (SciLifeLab). Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för molekylär biovetenskap, Wenner-Grens institut.
    Zhu, Xing-Quan
    Differential expression of microRNAs and tRNA fragments mediate the adaptation of the liver fluke Fasciola gigantica to its intermediate snail and definitive mammalian hosts. 2021. In: International Journal for Parasitology, ISSN 0020-7519, E-ISSN 1879-0135, Vol. 51, no. 5, pp. 405-414. Journal article (Peer-reviewed)
    Abstract [en]

    The tropical liver fluke Fasciola gigantica affects livestock and humans in many Asian countries, large parts of Africa, and parts of Europe. Despite the public health and economic impacts of F. gigantica, the biology of this liver fluke and how its complex lifecycle is transcriptionally regulated remain unknown. Here, we tested the hypothesis that the regulatory small non-coding RNAs (sncRNAs), microRNAs (miRNAs) and tRNA-derived fragments (tRFs) play roles in the adaptation of F. gigantica to its intermediate and definitive hosts. We sequenced sncRNAs of eight lifecycle stages of F. gigantica. In total, 56 miRNAs from 33 conserved families and four Fasciola-specific miRNAs were identified. Expression analysis of miRNAs suggested clear stage-related patterns. By leveraging the existing transcriptomic data, we predicted a miRNA-based regulation of metabolism, transport, growth and developmental processes. Also, by comparing the miRNA complement of F. gigantica with that of Fasciola hepatica, we detected a high level of conservation and identified differences in some miRNAs, which can be used to distinguish the two species. Moreover, we found that tRFs at each lifecycle stage were predominantly derived from tRNA-Lys and tRNA-Gly at 5′ half sites, but relatively high expression was related to the buffalo-infecting stages. Taken together, we provided a comprehensive overview of the dynamic transcriptional changes of small RNAs that occur during the developmental stages of F. gigantica. This global analysis of F. gigantica lifecycle stages revealed new roles of miRNAs and tRFs in parasite development and will facilitate future research into understanding of fasciolosis pathobiology.

  • 42. Höjer, Pontus
    et al.
    Frick, Tobias
    Siga, Humam
    Pourbozorgi, Parham
    Aghelpasand, Hooman
    Martin, Marcel
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Ahmadian, Afshin
    BLR: a flexible pipeline for haplotype analysis of multiple linked-read technologies. 2023. In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 51, no. 22, article ID e114. Journal article (Peer-reviewed)
    Abstract [en]

    Linked-read sequencing promises a one-method approach for genome-wide insights including single nucleotide variants (SNVs), structural variants, and haplotyping. We introduce Barcode Linked Reads (BLR), an open-source haplotyping pipeline capable of handling millions of barcodes and data from multiple linked-read technologies including DBS, 10× Genomics, TELL-seq and stLFR. Running BLR on DBS linked-reads yielded megabase-scale phasing with low (<0.2%) switch error rates. Of 13616 protein-coding genes phased in the GIAB benchmark set (v4.2.1), 98.6% matched the BLR phasing. In addition, large structural variants showed concordance with HPRC-HG002 reference assembly calls. Compared to diploid assembly with PacBio HiFi reads, BLR phasing was more continuous when considering switch errors. We further show that integrating long reads at low coverage (∼10×) can improve phasing contiguity and reduce switch errors in tandem repeats. When compared to Long Ranger on 10× Genomics data, BLR showed an increase in phase block N50 with low switch-error rates. For TELL-Seq and stLFR linked reads, BLR generated longer or similar phase block lengths and low switch error rates compared to results presented in the original publications. In conclusion, BLR provides a flexible workflow for comprehensive haplotype analysis of linked reads from multiple platforms.
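
    The two phasing metrics quoted in this abstract, phase block N50 and switch error rate, can be illustrated with a short sketch. These are generic definitions written for illustration, not code from the BLR pipeline; the function names, the 0/1 haplotype-string encoding and the example numbers are assumptions.

```python
def phase_block_n50(block_lengths):
    """Return the N50 of phase block lengths (e.g. in base pairs).

    N50 is the length L such that blocks of length >= L together
    cover at least half of the total phased length.
    """
    if not block_lengths:
        raise ValueError("need at least one phase block")
    total = sum(block_lengths)
    covered = 0
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if 2 * covered >= total:
            return length


def switch_error_rate(truth, predicted):
    """Fraction of adjacent heterozygous-site pairs with wrong relative phase.

    Both arguments are equal-length strings of '0'/'1' giving the allele
    placed on haplotype 1 at each heterozygous site of one phase block.
    """
    if len(truth) != len(predicted):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        return 0.0
    agree = [t == p for t, p in zip(truth, predicted)]
    # A switch occurs wherever agreement with the truth flips between neighbours.
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(truth) - 1)


n50 = phase_block_n50([10_000, 20_000, 30_000, 40_000, 50_000])
rate = switch_error_rate("000000", "000111")
print(n50, rate)
```

    Because only relative phase matters, a prediction that is the exact complement of the truth counts zero switch errors in this sketch.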

  • 43.
    Kang, Wenjing
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för molekylär biovetenskap, Wenner-Grens institut.
    microRNAs: from biogenesis to organismal tracing. 2020. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    MicroRNAs (miRNAs) are short noncoding RNAs of around 22 nucleotides in length, which help to shape the expression of most mRNAs. Perturbation of miRNA expression has revealed a variety of defects in development, cell specification, physiology and behavior. This thesis focuses on two topics of miRNA: identification of structural features that influence miRNA biogenesis (Paper I) and application of taxonomical marker miRNAs to resolve organismal origin of samples (Paper II and III).

    The current model of miRNA hairpin biogenesis has limited information content and appears to be incomplete. In paper I, we apply a novel high-throughput screening method to profile the optimal structure of miRNA hairpins for efficient and precise miRNA biogenesis. The optimal structure consists of tight and loose local structures across the hairpin, which reflects the constraints of biogenesis proteins. We find that miRNA hairpins with a stable lower basal stem are more efficiently processed and have a higher expression level in tissues of 20 animal species. We argue that the structural features, which have been largely neglected in the current model, are in fact as important as the well-known sequence motifs.

    New miRNAs are continuously added over evolutionary time and are rarely secondarily lost, making them ideal taxonomical markers. In paper II, we demonstrate as a proof-of-principle that miRNAs can be used to trace a biological sample back to the lineage or even species of origin. Based on the marker miRNAs, we develop miRTrace, the first software to accurately trace miRNA sequences back to their taxonomical origin. The method can sensitively identify the origin of single cells and detect parasitic nematode RNA in mammalian host blood samples. In paper III, we apply miRNA tracing to address a controversial question about the origin of the exogenous plant miRNAs (xenomiRs) that are found in human samples and have been proposed to regulate human gene expression. Our computational and experimental results provide evidence that xenomiRs are derived from technical artifacts rather than dietary intake.

  • 44. Kang, Yanlei
    et al.
    Elofsson, Arne
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Jiang, Yunliang
    Huang, Weihong
    Yu, Minzhe
    Li, Zhong
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab). Huzhou University, China; Zhejiang Sci-Tech University, China.
    AFTGAN: prediction of multi-type PPI based on attention free transformer and graph attention network. 2023. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 39, no. 2, article ID btad052. Journal article (Peer-reviewed)
    Abstract [en]

    Motivation: Protein–protein interaction (PPI) networks and transcriptional regulatory networks are critical in regulating cells and their signaling. A thorough understanding of PPIs can provide more insights into cellular physiology at normal and disease states. Although numerous methods have been proposed to predict PPIs, it is still challenging for interaction prediction between unknown proteins. In this study, a novel neural network named AFTGAN was constructed to predict multi-type PPIs. Regarding feature input, ESM-1b embedding containing much biological information for proteins was added as a protein sequence feature besides amino acid co-occurrence similarity and one-hot coding. An ensemble network was also constructed based on a transformer encoder containing an AFT module (performing the weight operation on vital protein sequence feature information) and graph attention network (extracting the relational features of protein pairs) for the part of the network framework.

    Results: The experimental results showed that the Micro-F1 of the AFTGAN based on three partitioning schemes (BFS, DFS and the random mode) on the SHS27K and SHS148K datasets was 0.685, 0.711 and 0.867, as well as 0.745, 0.819 and 0.920, respectively, all higher than that of other popular methods. In addition, the experimental comparisons confirmed the performance superiority of the proposed model for predicting PPIs of unknown proteins on the STRING dataset.

    Availability and implementation: The source code is publicly available at https://github.com/1075793472/AFTGAN.

    Supplementary information: Supplementary data are available at Bioinformatics online.
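
    The Micro-F1 metric reported in the Results of this abstract pools true positives, false positives and false negatives over all protein pairs and interaction types before computing F1. The following is a minimal sketch of that standard definition, not the AFTGAN evaluation code; the function name and the interaction-type labels in the example are made up for illustration.

```python
def micro_f1(true_label_sets, predicted_label_sets):
    """Micro-averaged F1 for multi-label predictions.

    Each argument is a list with one set of labels per sample
    (here: the interaction types assigned to one protein pair).
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_label_sets, predicted_label_sets):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not true
        fn += len(truth - pred)   # true labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Three hypothetical protein pairs with illustrative interaction types.
truth = [{"binding", "activation"}, {"inhibition"}, {"catalysis"}]
pred = [{"binding"}, {"inhibition", "catalysis"}, {"catalysis"}]
score = micro_f1(truth, pred)
print(score)
```

    Pooling the counts before averaging means that frequent interaction types weigh more heavily in the score than rare ones.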

  • 45.
    Karlsson, Dave
    Stockholms universitet, Naturvetenskapliga fakulteten, Zoologiska institutionen.
    Charting Insect Diversity. 2024. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Background: Despite Sweden's rich legacy in entomology, a significant portion of its insect fauna remains poorly studied. Addressing this and other biodiversity knowledge gaps, the Swedish government unveiled the Swedish Taxonomy Initiative (STI) in 2002, with the ambitious goal of documenting and scientifically describing all multicellular species in the country. One of the largest projects funded by STI is the Swedish Malaise Trap Project (SMTP). The SMTP project, the data resulting from it, and the analyses of that data constitute the core of the current thesis.

    Methods and Results: The SMTP deployed 73 Malaise traps across 55 diverse habitats from 2003 to 2009, capturing an estimated 20 million insects. The catch has been sorted to over 300 taxonomic fractions suitable for further processing by taxonomic experts. The sorted material has been studied by over 100 taxonomists, identifying 4,000 species in about 1% of the total material. A third of these were previously unrecorded in Sweden, including nearly 700 potentially new to science. The SMTP represents a significant community effort and we describe the history, organization, logistics and methodology of the SMTP project, with a focus on the lessons learned along the way and the optimized workflows that resulted in the end. The SMTP output was used to estimate the species richness and composition of the Swedish insect fauna. This included expert assessments, analysis of new species discovery rates, and statistical extrapolations from abundance and incidence data, including a novel non-parametric estimator. These methods converged on an estimate of 33,000 species, 26% of which were unknown at the inventory’s start, and 15% of which still await discovery. To improve the speed and accuracy of the analysis of Malaise trap samples, we introduced morphotype barcoding, combining manual sorting into morphospecies with individual DNA barcoding of representative specimens. Morphotype barcoding is shown to offer more accurate abundance estimates than metabarcoding. In contrast to metabarcoding, it also provides material that is directly suitable for enhancing barcode reference libraries. At the same time, it is shown to be significantly cheaper and to require fewer consumables than megabarcoding (specimen-level barcoding of all specimens in the sample).

    Conclusion: The SMTP exemplifies the successful application of community science to biodiversity research, leveraging volunteer efforts alongside professional expertise, a model that has proven to be effective in gathering extensive biodiversity data. The thesis thus offers valuable insights into planning and executing large-scale biodiversity inventories. The analyses of SMTP data suggest that a significant portion of the diversity remains undiscovered or undocumented within one of Europe's most well-studied insect faunas. The thesis highlights critical taxonomic and ecological biases in our current understanding, evidenced by the predominance of Hymenoptera and Diptera species, and decomposers and parasitoids, among the newly discovered species. These findings are pivotal in reshaping our understanding of global biodiversity and the specific ecological roles of insects. The study also emphasizes the need for a more inclusive taxonomic scope in biodiversity inventories, a challenge heightened by the urgency suggested by recent reports of alarming global declines in insect populations.

  • 46.
    Karlsson, Dave
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Zoologiska institutionen.
    Forshage, Mattias
    Holston, Kevin
    Ronquist, Fredrik
    The data of the Swedish Malaise Trap Project, a countrywide inventory of Sweden's insect fauna. 2020. In: Biodiversity Data Journal, ISSN 1314-2836, E-ISSN 1314-2828, Vol. 8. Journal article (Refereed)
  • 47.
    Karlsson, Dave
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Zoologiska institutionen. Station Linné.
    Klinth, Mårten
    Havnås, Harald
    Johansson, Håkan
    Granqvist, Emma
    Iwaszkiewicz-Eggebrecht, Elzbieta
    Ronquist, Fredrik
    Morphotype barcoding: a hybrid approach combining morphospecies sorting with specimen-level barcoding of insects from Malaise trap sample. Manuscript (preprint) (Other academic)
  • 48. Ke, Rongqin
    et al.
    Mignardi, Marco
    Hauling, Thomas
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Nilsson, Mats
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
    Fourth Generation of Next-Generation Sequencing Technologies: Promise and Consequences. 2016. In: Human Mutation, ISSN 1059-7794, E-ISSN 1098-1004, Vol. 37, no. 12, pp. 1363-1367. Article, review/survey (Refereed)
    Abstract [en]

    In this review, we discuss the emergence of fourth-generation sequencing technologies that preserve the spatial coordinates of RNA and DNA sequences with up to subcellular resolution, thus enabling back mapping of sequencing reads to the original histological context. This information is used, for example, in two current large-scale projects that aim to unravel the function of the brain. In cancer research, too, fourth-generation sequencing has the potential to revolutionize the field. Cancer Research UK has named "Mapping the molecular and cellular tumor microenvironment in order to define new targets for therapy and prognosis" one of the grand challenges in tumor biology. We discuss the advantages of sequencing nucleic acids directly in fixed cells over traditional next-generation sequencing (NGS) methods, the limitations and challenges that these new methods have to face to become broadly applicable, and the impact that the information generated by the combination of in situ sequencing and NGS methods will have in research and diagnostics.

  • 49. Kenah, Eben
    et al.
    Britton, Tom
    Stockholms universitet, Naturvetenskapliga fakulteten, Matematiska institutionen.
    Halloran, M. Elizabeth
    Longini, Ira M.
    Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees. 2016. In: PLoS Computational Biology, ISSN 1553-734X, E-ISSN 1553-7358, Vol. 12, no. 4. Journal article (Refereed)
    Abstract [en]

    Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology.

  • 50.
    Kim, Sea-Yong
    Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för ekologi, miljö och botanik.
    The neurotoxin β-N-methylamino-L-alanine (BMAA) and 2,4-diaminobutyric acid (DAB): possible risk of human exposure, and the effect and function in diatoms. 2022. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    The toxic secondary metabolites β-N-methylamino-L-alanine (BMAA) and 2,4-diaminobutyric acid (DAB) produced by phytoplankton groups such as cyanobacteria, diatoms and dinoflagellates are known to cause neurotoxicity in vertebrates. BMAA has been linked to development of the neurodegenerative diseases amyotrophic lateral sclerosis/Parkinsonism dementia complex (ALS/PDC) and Alzheimer's disease. Despite these risks, previous studies have focused mostly on food webs in aquatic ecosystems as a possible source of human exposure to BMAA and DAB. Moreover, most studies in regard to the producer of BMAA and DAB are biased towards cyanobacteria.

    The first aim of this thesis was to investigate the possible risk of human exposure to BMAA via the agro-aqua cycle that artificially interconnects agriculture and aquaculture. Two groups of commercial chickens, fed on either standard feed or standard feed mixed with blue mussel meat, were investigated. The results show that BMAA can be transferred to and accumulated in the chickens through the mixed fodder. It has been suggested that the consumption of chicken may cause a risk of human exposure to BMAA if the chickens are fed with the fodder mixed with mussel meat (Paper I).

    The second aim was to assess the effect of biotic stresses (i.e. predation, competition) as possible causative factors regulating the production of BMAA and/or DAB in diatoms, and to assess the toxic effect of BMAA and/or DAB on predator and competitor (if specific production patterns occur for either toxin). The production of DAB was specifically regulated only in the diatom T. pseudonana in response to predation and competition. The toxic effect of DAB was significant on the population growth of the copepod Tigriopus sp. as predator, and on the growth of cell numbers in T. pseudonana as competitor. However, given the environmental relevance of the DAB effect, the results suggest that DAB may play an important role in the defense mechanisms of the diatom T. pseudonana (Papers II and III).

    The last aim was to study the effect and function of BMAA in the diatom Phaeodactylum tricornutum. P. tricornutum was exposed to different concentrations of BMAA. The results showed concentration-dependent responses to BMAA. The following were observed when the growth (i.e. cell number) of P. tricornutum was arrested by exogenous BMAA: oxidative stress, reduced carbon fixation, an increase in intracellular Chl a, alterations in GS-GOGAT, and a suppressed urea cycle. The results suggest that BMAA represents a toxic secondary metabolite capable of controlling the growth of P. tricornutum via oxidative stress and alterations in the activity of photosynthesis and nitrogen metabolism (Paper IV).
