Effective sequence similarity detection with strobemers
Stockholm University, Faculty of Science, Department of Mathematics. Stockholm University, Science for Life Laboratory (SciLifeLab). ORCID iD: 0000-0001-7378-2320
2021 (English). In: Genome Research, ISSN 1088-9051, E-ISSN 1549-5469, Vol. 31, no. 11, pp. 2080-2094. Journal article (peer-reviewed). Published
Abstract [en]

k-mer-based methods are widely used in bioinformatics for various types of sequence comparisons. However, a single mutation will mutate k consecutive k-mers and make most k-mer-based applications for sequence comparison sensitive to variable mutation rates. Many techniques have been studied to overcome this sensitivity, for example, spaced k-mers and k-mer permutation techniques, but these techniques do not handle indels well. For indels, pairs or groups of small k-mers are commonly used, but these methods first produce k-mer matches, and only in a second step, a pairing or grouping of k-mers is performed. Such techniques produce many redundant k-mer matches owing to the size of k. Here, we propose strobemers as an alternative to k-mers for sequence comparison. Intuitively, strobemers consist of two or more linked shorter k-mers, where the combination of linked k-mers is decided by a hash function. We use simulated data to show that strobemers provide more evenly distributed sequence matches and are less sensitive to different mutation rates than k-mers and spaced k-mers. Strobemers also produce higher match coverage across sequences. We further implement a proof-of-concept sequence-matching tool StrobeMap and use synthetic and biological Oxford Nanopore sequencing data to show the utility of using strobemers for sequence comparison in different contexts such as sequence clustering and alignment scenarios.
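
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm and not StrobeMap's code: the function names (`kmers`, `linked_kmers`), the choice of BLAKE2b as the hash, the window parameters `w_min` and `w_max`, and the example sequence are all assumptions made for illustration. The first function shows why one substitution changes k consecutive k-mers; the second links each k-mer to one downstream k-mer selected by a hash, which is the intuition the abstract gives for strobemers.

```python
import hashlib


def _h(s: str) -> int:
    """Deterministic 64-bit hash of a string (illustrative choice of hash)."""
    return int.from_bytes(hashlib.blake2b(s.encode(), digest_size=8).digest(), "big")


def kmers(seq: str, k: int) -> list[str]:
    """All overlapping substrings of length k, in order of start position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def linked_kmers(seq: str, k: int, w_min: int, w_max: int) -> list[tuple[int, int, str]]:
    """Link the k-mer at each start i with one k-mer from a downstream window.

    The second k-mer starts in [i + w_min, i + w_max] and is the candidate
    whose combination with the first k-mer has the smallest hash value.
    This selection rule is an assumption, not the rule used in the paper.
    """
    out = []
    last_start = len(seq) - k
    for i in range(last_start + 1):
        first = seq[i:i + k]
        lo = i + w_min
        hi = min(i + w_max, last_start)
        if lo > hi:  # no room left for a second k-mer
            break
        j = min(range(lo, hi + 1), key=lambda p: _h(first + seq[p:p + k]))
        out.append((i, j, first + seq[j:j + k]))
    return out


if __name__ == "__main__":
    seq = "ACGTACGGTCAGTCAGGCTA"           # made-up example sequence
    mut = seq[:10] + "T" + seq[11:]        # one substitution at position 10
    k = 4
    changed = sum(a != b for a, b in zip(kmers(seq, k), kmers(mut, k)))
    print(changed)                         # 4, i.e. k consecutive k-mers differ
    for i, j, s in linked_kmers(seq, k, w_min=5, w_max=8)[:3]:
        print(i, j, s)
```

Because the spacing between the two linked k-mers varies with the hash, neighbouring linked pairs do not all cover the same positions, which is the property the abstract credits for the more evenly distributed matches.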

Place, publisher, year, edition, pages
2021. Vol. 31, no. 11, pp. 2080-2094
Identifiers
URN: urn:nbn:se:su:diva-199170
DOI: 10.1101/gr.275648.121
ISI: 000713666300010
PubMed ID: 34667119
OAI: oai:DiVA.org:su-199170
DiVA id: diva2:1614620
Available from: 2021-11-26. Created: 2021-11-26. Last updated: 2021-12-13. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Person

Sahlin, Kristoffer
