GenFamClust: an accurate, synteny-aware and reliable homology inference algorithm
Stockholm University, Faculty of Science, Numerical Analysis and Computer Science (NADA). Stockholm University, Science for Life Laboratory (SciLifeLab). Swedish e-Science Research Centre, Sweden.
Number of authors: 3
2016 (English)
In: BMC Evolutionary Biology, ISSN 1471-2148, E-ISSN 1471-2148, Vol. 16, article id 120
Article in journal (Refereed). Published
Abstract [en]

Background: Homology inference is pivotal to evolutionary biology and is primarily based on significant sequence similarity, which, in general, is a good indicator of homology. Algorithms have also been designed to utilize conservation in gene order as an indication of homologous regions. We have developed GenFamClust, a method based on quantification of both gene order conservation and sequence similarity.

Results: In this study, we validate GenFamClust by comparing it to well-known homology inference algorithms on a synthetic dataset. We applied several popular clustering algorithms to homologs inferred by GenFamClust and other algorithms on a metazoan dataset and studied the outcomes. Accuracy, similarity, dependence, and other characteristics were investigated for gene families yielded by the clustering algorithms. GenFamClust was also applied to genes from a set of complete fungal genomes and gene families were inferred using clustering. The resulting gene families were compared with a manually curated gold standard of pillars from the Yeast Gene Order Browser. We found that the gene-order component of GenFamClust is simple, yet biologically realistic, and captures local synteny information for homologs.

Conclusions: The study shows that GenFamClust is a more accurate, informed, and comprehensive pipeline to infer homologs and gene families than other commonly used homology and gene-family inference methods.
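The general idea in the abstract, combining sequence similarity with gene order conservation and then clustering, can be illustrated with a toy sketch. This is not the published GenFamClust scoring or pipeline. The neighborhood measure, the weight `w`, the window `k`, the `threshold`, the single-linkage clustering, and all gene names and similarity values below are assumptions made up for illustration.

```python
def neighbors(order, gene, k):
    """Genes within k positions of `gene` in its genome order (gene excluded)."""
    i = order.index(gene)
    return order[max(0, i - k):i] + order[i + 1:i + 1 + k]


def synteny(a, b, order_of, sim, k):
    """Hypothetical measure: fraction of a's neighbors that have at least
    one similar gene among b's neighbors. Not the published measure."""
    na = neighbors(order_of[a], a, k)
    nb = neighbors(order_of[b], b, k)
    if not na:
        return 0.0
    hits = sum(any(frozenset((x, y)) in sim for y in nb) for x in na)
    return hits / len(na)


def families(genomes, sim, k=1, w=0.5, threshold=0.5):
    """Link gene pairs whose combined score passes `threshold`, then return
    the connected components (single linkage) as gene families."""
    order_of = {g: order for order in genomes.values() for g in order}
    parent = {g: g for g in order_of}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for pair, s in sim.items():
        a, b = sorted(pair)
        score = w * s + (1 - w) * synteny(a, b, order_of, sim, k)
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in parent:
        groups.setdefault(find(g), set()).add(g)
    return {frozenset(members) for members in groups.values()}


# Toy data (invented): three genomes given as ordered gene lists, plus
# pairwise similarity scores in [0, 1] keyed by unordered gene pair.
genomes = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2"],
}
sim = {
    frozenset(("a1", "b1")): 0.9,
    frozenset(("a2", "b2")): 0.4,  # weak similarity, conserved neighborhood
    frozenset(("a2", "c1")): 0.4,  # weak similarity, no conserved neighborhood
    frozenset(("a3", "b3")): 0.9,
}
result = families(genomes, sim)
```

In this toy data, `a2` has the same similarity to `b2` and to `c1`, but only the `a2`/`b2` pair is flanked by similar genes, so only that pair is joined into a family. The measure as written is asymmetric in its two arguments; the sketch fixes the order by sorting each pair.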

Place, publisher, year, edition, pages
2016. Vol. 16, article id 120
Keywords [en]
Homology inference, Gene synteny, Gene similarity, Gene family, Clustering, Gene order conservation
National subject category
Biological Sciences
Identifiers
URN: urn:nbn:se:su:diva-131920
DOI: 10.1186/s12862-016-0684-2
ISI: 000377161400002
PubMedID: 27260514
OAI: oai:DiVA.org:su-131920
DiVA, id: diva2:946946
Available from: 2016-07-06. Created: 2016-07-04. Last updated: 2017-11-28. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Search further in DiVA

By author/editor
Arvestad, Lars
By organisation
Numerical Analysis and Computer Science (NADA); Science for Life Laboratory (SciLifeLab)
In the same journal
BMC Evolutionary Biology
Biological Sciences
