Assembly scaffolding with PE-contaminated mate-pair libraries
Stockholm University, Science for Life Laboratory (SciLifeLab). Stockholm University, Faculty of Science, Numerical Analysis and Computer Science (NADA). Swedish e-Science Research Centre, Sweden. ORCID iD: 0000-0001-5341-1733
Number of authors: 3. 2016 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 32, no. 13, pp. 1925-1932. Article in journal (Refereed). Published
Abstract [en]

Motivation: Scaffolding is often an essential step in a genome assembly process, in which contigs are ordered and oriented using read pairs from a combination of paired-end libraries and longer-range mate-pair libraries. Although a simple idea, scaffolding is unfortunately hard to get right in practice. One source of problems is so-called PE-contamination in mate-pair libraries, in which a non-negligible fraction of the read pairs get the wrong orientation and a much smaller insert size than what is expected. This contamination has been discussed before, in relation to integrated scaffolders, but solutions rely on the orientation being observable, e.g. by finding the junction adapter sequence in the reads. This is not always possible, making orientation and insert size of a read pair stochastic. To our knowledge, there is neither previous work on modeling PE-contamination, nor a study on the effect PE-contamination has on scaffolding quality.

Results: We have addressed PE-contamination in an update to our scaffolder BESST. We formulate the problem as an integer linear program which is solved using an efficient heuristic. The new method shows significant improvement over both integrated and stand-alone scaffolders in our experiments. The impact of modeling PE-contamination is quantified by comparing with the previous BESST model. We also show how other scaffolders are vulnerable to PE-contaminated libraries, resulting in an increased number of misassemblies, more conservative scaffolding and inflated assembly sizes.

Place, publisher, year, edition, pages
2016. Vol. 32, no. 13, pp. 1925-1932
National subject category
Biological Sciences; Environmental Biotechnology; Computer and Information Sciences; Mathematics
Identifiers
URN: urn:nbn:se:su:diva-132540
DOI: 10.1093/bioinformatics/btw064
ISI: 000379761500002
PubMed ID: 27153683
OAI: oai:DiVA.org:su-132540
DiVA, id: diva2:955280
Available from: 2016-08-25. Created: 2016-08-15. Last updated: 2020-03-04. Bibliographically approved

Open Access in DiVA

No full text in DiVA

By the author/editor
Arvestad, Lars