Ändra sökning
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
De Novo Assembly of Two Swedish Genomes Reveals Missing Segments from the Human GRCh38 Reference and Improves Variant Calling of Population-Scale Sequencing Data
Stockholms universitet, Naturvetenskapliga fakulteten, Institutionen för biokemi och biofysik. Stockholms universitet, Science for Life Laboratory (SciLifeLab).
Visa övriga samt affilieringar
Antal upphovsmän: 122018 (Engelska)Ingår i: Genes, ISSN 2073-4425, E-ISSN 2073-4425, Vol. 9, nr 10, artikel-id 486Artikel i tidskrift (Refereegranskat) Published
Abstract [en]

The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields > 75,000 putative novel single nucleotide variants (SNVs) and removes > 10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.

Ort, förlag, år, upplaga, sidor
2018. Vol. 9, nr 10, artikel-id 486
Nyckelord [en]
de novo assembly, SMRT sequencing, GRCh38, human reference genome, human whole-genome sequencing, population sequencing, Swedish population
Nationell ämneskategori
Biologiska vetenskaper
Identifikatorer
URN: urn:nbn:se:su:diva-162937DOI: 10.3390/genes9100486ISI: 000448656700024PubMedID: 30304863OAI: oai:DiVA.org:su-162937DiVA, id: diva2:1271569
Tillgänglig från: 2018-12-17 Skapad: 2018-12-17 Senast uppdaterad: 2018-12-17Bibliografiskt granskad

Open Access i DiVA

Fulltext saknas i DiVA

Övriga länkar

Förlagets fulltextPubMed

Sök vidare i DiVA

Av författaren/redaktören
Ameur, AdamMartin, MarcelHöijer, IdaVezzi, Francesco
Av organisationen
Institutionen för biokemi och biofysikScience for Life Laboratory (SciLifeLab)
I samma tidskrift
Genes
Biologiska vetenskaper

Sök vidare utanför DiVA

GoogleGoogle Scholar

doi
pubmed
urn-nbn

Altmetricpoäng

doi
pubmed
urn-nbn
Totalt: 256 träffar
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf