Publications (10 of 22)
Sullivan, A. R., Eldfjell, Y., Schiffthaler, B., Delhomme, N., Asp, T., Hebelstrup, K. H., . . . Wang, X.-R. (2020). The Mitogenome of Norway Spruce and a Reappraisal of Mitochondrial Recombination in Plants. Genome Biology and Evolution, 12(1), 3586-3598
2020 (English). In: Genome Biology and Evolution, E-ISSN 1759-6653, Vol. 12, no 1, p. 3586-3598. Article in journal (Refereed), Published
Abstract [en]

Plant mitogenomes can be difficult to assemble because they are structurally dynamic and prone to intergenomic DNA transfers, leading to the unusual situation where an organelle genome is far outnumbered by its nuclear counterparts. As a result, comparative mitogenome studies are in their infancy and some key aspects of genome evolution are still known mainly from pregenomic, qualitative methods. To help address these limitations, we combined machine learning and in silico enrichment of mitochondrial-like long reads to assemble the bacterial-sized mitogenome of Norway spruce (Pinaceae: Picea abies). We conducted comparative analyses of repeat abundance, intergenomic transfers, substitution and rearrangement rates, and estimated repeat-by-repeat homologous recombination rates. Prompted by our discovery of highly recombinogenic small repeats in P. abies, we assessed the genomic support for the prevailing hypothesis that intramolecular recombination is predominantly driven by repeat length, with larger repeats facilitating DNA exchange more readily. Overall, we found mixed support for this view: Recombination dynamics were heterogeneous across vascular plants and highly active small repeats (ca. 200 bp) were present in about one-third of studied mitogenomes. As in previous studies, we did not observe any robust relationships among commonly studied genome attributes, but we identify variation in recombination rates as an underinvestigated source of plant mitogenome diversity.

Keywords
mitogenome, repeats, recombination, rearrangement rates, structural variation
National Category
Biological Sciences
Identifiers
urn:nbn:se:su:diva-181386 (URN) 10.1093/gbe/evz263 (DOI) 000522860800005 () 31774499 (PubMedID)
Available from: 2020-05-07 Created: 2020-05-07 Last updated: 2024-07-04 Bibliographically approved
Klinter, S., Bulone, V. & Arvestad, L. (2019). Diversity and evolution of chitin synthases in oomycetes (Straminipila: Oomycota). Molecular Phylogenetics and Evolution, 139, Article ID 106558.
2019 (English). In: Molecular Phylogenetics and Evolution, ISSN 1055-7903, E-ISSN 1095-9513, Vol. 139, article id 106558. Article in journal (Refereed), Published
Abstract [en]

The oomycetes are filamentous eukaryotic microorganisms, distinct from true fungi, many of which act as crop or fish pathogens that cause devastating losses in agriculture and aquaculture. Chitin is present in all true fungi, but it occurs in only small amounts in some Saprolegniomycetes and it is absent in Peronosporomycetes. However, the growth of several oomycetes is severely impacted by competitive chitin synthase (CHS) inhibitors. Here, we shed light on the diversity, evolution and function of oomycete CHS proteins. We show by phylogenetic analysis of 93 putative CHSs from 48 highly diverse oomycetes, including the early diverging Eurychasma dicksonii, that all available oomycete genomes contain at least one putative CHS gene. All gene products contain conserved CHS motifs essential for enzymatic activity and form two Peronosporomycete-specific and six Saprolegniale-specific clades. Proteins of all clades, except one, contain an N-terminal microtubule interacting and trafficking (MIT) domain as predicted by protein domain databases or manual analysis, which is supported by homology modelling and comparison of conserved structural features from sequence logos. We identified at least three groups of CHSs conserved among all oomycete lineages and used phylogenetic reconciliation analysis to infer the dynamic evolution of CHSs in oomycetes. The evolutionary aspects of CHS diversity in modern-day oomycetes are discussed. In addition, we observed hyphal tip rupture in Phytophthora infestans upon treatment with the CHS inhibitor nikkomycin Z. Combining data on phylogeny, gene expression, and response to CHS inhibitors, we propose the association of different CHS clades with certain developmental stages.

Keywords
Chitin synthase, Evolution, Growth inhibition, Microtubule interacting and trafficking (MIT) domain, Oomycete, Phylogeny
National Category
Biological Sciences
Identifiers
urn:nbn:se:su:diva-175032 (URN) 10.1016/j.ympev.2019.106558 (DOI) 000485041900042 () 31288106 (PubMedID)
Available from: 2019-10-30 Created: 2019-10-30 Last updated: 2022-02-26 Bibliographically approved
Arvestad, L. (2018). alv: a console-based viewer for molecular sequence alignments. Journal of Open Source Software, 3(31), Article ID 955.
2018 (English). In: Journal of Open Source Software, E-ISSN 2475-9066, Vol. 3, no 31, article id 955. Article in journal (Refereed), Published
National Category
Bioinformatics (Computational Biology)
Research subject
Computer Science; Molecular Biology
Identifiers
urn:nbn:se:su:diva-164794 (URN) 10.21105/joss.00955 (DOI)
Available from: 2019-01-18 Created: 2019-01-18 Last updated: 2022-09-15 Bibliographically approved
Duchemin, W., Gence, G., Chifolleau, A.-M. A., Arvestad, L., Bansal, M. S., Berry, V., . . . Daubin, V. (2018). RecPhyloXML: a format for reconciled gene trees. Bioinformatics, 34(21), 3646-3652
2018 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 34, no 21, p. 3646-3652. Article in journal (Refereed), Published
Abstract [en]

Motivation: A reconciliation is an annotation of the nodes of a gene tree with evolutionary events (for example, speciation, gene duplication, transfer, or loss), along with a mapping onto a species tree. Many algorithms and software tools produce or use reconciliations, but often with different reconciliation formats, which differ in the type of events considered or in whether the species tree is dated. This complicates comparison and communication between different programs. Results: Here, we gather a consortium of software developers in gene tree species tree reconciliation to propose and endorse a format that aims to promote an integrative, albeit flexible, specification of phylogenetic reconciliations. This format, named recPhyloXML, is accompanied by several tools such as a reconciled tree visualizer and conversion utilities.

National Category
Biological Sciences Environmental Biotechnology Computer and Information Sciences Mathematics
Identifiers
urn:nbn:se:su:diva-162980 (URN) 10.1093/bioinformatics/bty389 (DOI) 000450038900007 () 29762653 (PubMedID)
Available from: 2018-12-13 Created: 2018-12-13 Last updated: 2022-03-23 Bibliographically approved
Ali, R. H., Bark, M., Miró, J., Muhammad, S. A., Sjöstrand, J., Zubair, S. M., . . . Arvestad, L. (2017). VMCMC: a graphical and statistical analysis tool for Markov chain Monte Carlo traces. BMC Bioinformatics, 18, Article ID 97.
2017 (English). In: BMC Bioinformatics, E-ISSN 1471-2105, Vol. 18, article id 97. Article in journal (Refereed), Published
Abstract [en]

Background: MCMC-based methods are important for Bayesian inference of phylogeny and related parameters. Although computationally expensive, MCMC yields estimates of posterior distributions that are useful for estimating parameter values and are easy to use in subsequent analysis. There are, however, sometimes practical difficulties with MCMC, relating to convergence assessment and determining burn-in, especially in large-scale analyses. Currently, multiple software packages are required to perform, e.g., convergence assessment, mixing analysis, and interactive exploration of both continuous and tree parameters.

Results: We have written a software tool called VMCMC to simplify post-processing of MCMC traces with, for example, automatic burn-in estimation. VMCMC can be used both as a GUI-based application, supporting interactive exploration, and as a command-line tool suitable for automated pipelines.

Conclusions: VMCMC is free software available under the New BSD License. Executable jar files, a tutorial manual, and source code can be downloaded from https://bitbucket.org/rhali/visualmcmc/.

Keywords
Convergence, Markov chain Monte Carlo, Metropolis-Hastings, Phylogenetics, Software, Visualization
National Category
Biological Sciences Environmental Biotechnology
Identifiers
urn:nbn:se:su:diva-142512 (URN) 10.1186/s12859-017-1505-3 (DOI) 000397489700003 () 28187712 (PubMedID)
Available from: 2017-05-10 Created: 2017-05-10 Last updated: 2024-01-17 Bibliographically approved
Sahlin, K., Chikhi, R. & Arvestad, L. (2016). Assembly scaffolding with PE-contaminated mate-pair libraries. Bioinformatics, 32(13), 1925-1932
2016 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 32, no 13, p. 1925-1932. Article in journal (Refereed), Published
Abstract [en]

Motivation: Scaffolding is often an essential step in a genome assembly process, in which contigs are ordered and oriented using read pairs from a combination of paired-end libraries and longer-range mate-pair libraries. Although a simple idea, scaffolding is unfortunately hard to get right in practice. One source of problems is so-called PE-contamination in mate-pair libraries, in which a non-negligible fraction of the read pairs get the wrong orientation and a much smaller insert size than what is expected. This contamination has been discussed before, in relation to integrated scaffolders, but solutions rely on the orientation being observable, e.g. by finding the junction adapter sequence in the reads. This is not always possible, making orientation and insert size of a read pair stochastic. To our knowledge, there is neither previous work on modeling PE-contamination, nor a study on the effect PE-contamination has on scaffolding quality. Results: We have addressed PE-contamination in an update to our scaffolder BESST. We formulate the problem as an integer linear program which is solved using an efficient heuristic. The new method shows significant improvement over both integrated and stand-alone scaffolders in our experiments. The impact of modeling PE-contamination is quantified by comparing with the previous BESST model. We also show how other scaffolders are vulnerable to PE-contaminated libraries, resulting in an increased number of misassemblies, more conservative scaffolding and inflated assembly sizes.

National Category
Biological Sciences Environmental Biotechnology Computer and Information Sciences Mathematics
Identifiers
urn:nbn:se:su:diva-132540 (URN) 10.1093/bioinformatics/btw064 (DOI) 000379761500002 () 27153683 (PubMedID)
Available from: 2016-08-25 Created: 2016-08-15 Last updated: 2022-03-23 Bibliographically approved
Ali, R. H., Muhammad, S. A. & Arvestad, L. (2016). GenFamClust: an accurate, synteny-aware and reliable homology inference algorithm. BMC Evolutionary Biology, 16, Article ID 120.
2016 (English). In: BMC Evolutionary Biology, E-ISSN 1471-2148, Vol. 16, article id 120. Article in journal (Refereed), Published
Abstract [en]

Background: Homology inference is pivotal to evolutionary biology and is primarily based on significant sequence similarity, which, in general, is a good indicator of homology. Algorithms have also been designed to utilize conservation in gene order as an indication of homologous regions. We have developed GenFamClust, a method based on quantification of both gene order conservation and sequence similarity. Results: In this study, we validate GenFamClust by comparing it to well-known homology inference algorithms on a synthetic dataset. We applied several popular clustering algorithms to homologs inferred by GenFamClust and other algorithms on a metazoan dataset and studied the outcomes. Accuracy, similarity, dependence, and other characteristics were investigated for gene families yielded by the clustering algorithms. GenFamClust was also applied to genes from a set of complete fungal genomes and gene families were inferred using clustering. The resulting gene families were compared with a manually curated gold standard of pillars from the Yeast Gene Order Browser. We found that the gene-order component of GenFamClust is simple, yet biologically realistic, and captures local synteny information for homologs. Conclusions: The study shows that GenFamClust is a more accurate, informed, and comprehensive pipeline to infer homologs and gene families than other commonly used homology and gene-family inference methods.

Keywords
Homology inference, Gene synteny, Gene similarity, Gene family, Clustering, Gene order conservation
National Category
Biological Sciences
Identifiers
urn:nbn:se:su:diva-131920 (URN) 10.1186/s12862-016-0684-2 (DOI) 000377161400002 () 27260514 (PubMedID)
Available from: 2016-07-06 Created: 2016-07-04 Last updated: 2024-01-17 Bibliographically approved
Khan, M. A., Mahmudi, O., Ullah, I., Arvestad, L. & Lagergren, J. (2016). Probabilistic inference of lateral gene transfer events. Paper presented at 14th Annual Research in Computational Molecular Biology (RECOMB), Montreal, Canada, 11-14 October 2016. BMC Bioinformatics, 17(Suppl 14), Article ID 431.
2016 (English). In: BMC Bioinformatics, E-ISSN 1471-2105, Vol. 17, no Suppl 14, article id 431. Article in journal (Refereed), Published
Abstract [en]

Background: Lateral gene transfer (LGT) is an evolutionary process that has an important role in biology. It challenges the traditional binary tree-like evolution of species and is attracting increasing attention from molecular biologists due to its involvement in antibiotic resistance. A number of attempts have been made to model LGT in the presence of gene duplication and loss, but reliably placing LGT events in the species tree has remained a challenge.

Results: In this paper, we propose probabilistic methods that sample reconciliations of the gene tree with a dated species tree and compute maximum a posteriori probabilities. The MCMC-based method uses the probabilistic model DLTRS, which integrates LGT, gene duplication, gene loss, and sequence evolution under a relaxed molecular clock for substitution rates. We can estimate posterior distributions on gene trees and, in contrast to previous work, the actual placement of potential LGT, which can be used to, e.g., identify highways of LGT.

Conclusions: Based on a simulation study, we conclude that the method is able to infer the true LGT events on the gene tree and reconcile it to the correct edges on the species tree in most cases. Applied to two biological datasets, containing gene families from Cyanobacteria and Mollicutes, we find potential LGT highways that corroborate other studies as well as previously undetected examples.

Keywords
Evolution, Bayesian inference, Phylogeny, Lateral gene transfer
National Category
Biological Sciences Environmental Biotechnology
Identifiers
urn:nbn:se:su:diva-140273 (URN) 10.1186/s12859-016-1268-2 (DOI) 000392515100009 () 28185583 (PubMedID)
Conference
14th Annual Research in Computational Molecular Biology (RECOMB), Montreal, Canada, 11-14 October 2016
Available from: 2017-03-22 Created: 2017-03-22 Last updated: 2024-01-17 Bibliographically approved
Mahmudi, O., Sennblad, B., Arvestad, L., Nowick, K. & Lagergren, J. (2015). Gene-pseudogene evolution: a probabilistic approach. Paper presented at 13th Annual Research in Computational Molecular Biology (RECOMB) Satellite Workshop on Comparative Genomics, Frankfurt, Germany, October 04-07, 2015. BMC Genomics, 16, Article ID S12.
2015 (English). In: BMC Genomics, E-ISSN 1471-2164, Vol. 16, article id S12. Article in journal (Refereed), Published
Abstract [en]

Over the last decade, methods have been developed for the reconstruction of gene trees that take into account the species tree. Many of these methods have been based on the probabilistic duplication-loss model, which describes how a gene tree evolves over a species tree with respect to duplications and losses, as well as extensions of this model, e.g., the DLRS (Duplication, Loss, Rate and Sequence evolution) model, which also includes sequence evolution under a relaxed molecular clock. A disjoint, almost as recent, and very important line of research has focused on non-protein-coding, yet functional, DNA. For instance, DNA sequences that are pseudogenes in the sense that they are not translated may still be transcribed, and the RNA thereby produced may be functional. We extend the DLRS model by including pseudogenization events and devise an MCMC framework for analyzing extended gene families consisting of genes and pseudogenes with respect to this model, i.e., reconstructing gene trees and identifying pseudogenization events in the reconstructed gene trees. By applying the MCMC framework to biologically realistic synthetic data, we show that gene trees as well as pseudogenization points can be inferred well. We also apply our MCMC framework to extended gene families belonging to the Olfactory Receptor and Zinc Finger superfamilies. The analysis indicates that both of these superfamilies contain very old pseudogenes, perhaps so old that it is reasonable to suspect that some are functional. In our analysis, the subfamilies of the Olfactory Receptors contain only lineage-specific pseudogenes, while the subfamilies of the Zinc Fingers contain pseudogene lineages common to several species.

National Category
Environmental Biotechnology Biological Sciences
Identifiers
urn:nbn:se:su:diva-132028 (URN) 10.1186/1471-2164-16-S10-S12 (DOI) 000377308200012 () 26449131 (PubMedID)
Conference
13th Annual Research in Computational Molecular Biology (RECOMB) Satellite Workshop on Comparative Genomics, Frankfurt, Germany, October 04-07, 2015
Available from: 2016-08-15 Created: 2016-07-05 Last updated: 2024-01-17 Bibliographically approved
Sjöstrand, J., Tofigh, A., Daubin, V., Arvestad, L., Sennblad, B. & Lagergren, J. (2014). A Bayesian Method for Analyzing Lateral Gene Transfer. Systematic Biology, 63(3), 409-420
2014 (English). In: Systematic Biology, ISSN 1063-5157, E-ISSN 1076-836X, Vol. 63, no 3, p. 409-420. Article in journal (Refereed), Published
Abstract [en]

Lateral gene transfer (LGT), which transfers DNA between two non-vertically related individuals belonging to the same or different species, is recognized as a major force in prokaryotic evolution, and evidence of its impact on eukaryotic evolution is ever increasing. LGT has attracted much public attention for its potential to transfer pathogenic elements and antibiotic resistance in bacteria, and to transfer pesticide resistance from genetically modified crops to other plants. In a wider perspective, there is a growing body of studies highlighting the role of LGT in enabling organisms to occupy new niches or adapt to environmental changes. The challenge LGT poses to the standard tree-based conception of evolution is also being debated. Studies of LGT have, however, been severely limited by a lack of computational tools. The best currently available LGT algorithms are parsimony-based phylogenetic methods, which require a pre-computed gene tree and cannot choose between sometimes wildly differing most parsimonious solutions. Moreover, in many studies, simple heuristics are applied that can only handle putative orthologs and completely disregard gene duplications (GDs). Consequently, proposed LGT among specific gene families, and the rate of LGT in general, remain debated. We present a Bayesian Markov-chain Monte Carlo-based method that integrates GD, gene loss, LGT, and sequence evolution, and apply the method in a genome-wide analysis of two groups of bacteria: Mollicutes and Cyanobacteria. Our analyses show that although the LGT rate between distant species is high, the net combined rate of duplication and close-species LGT is on average higher. We also show that the common practice of disregarding reconcilability in gene tree inference overestimates the number of LGT and duplication events.

Keywords
Bayesian, gene duplication, gene loss, horizontal gene transfer, lateral gene transfer, MCMC, phylogenetics
National Category
Developmental Biology Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:su:diva-104137 (URN) 10.1093/sysbio/syu007 (DOI) 000334752600010 ()
Available from: 2014-06-04 Created: 2014-06-03 Last updated: 2022-02-23 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-5341-1733