Sherwood, Ellen
Publications (5 of 5)
Barcala, M. E., van der Valk, T., Chen, Z., Funda, T., Chaudhary, R., Klingberg, A., . . . Wu, H. X. (2024). Whole-genome resequencing facilitates the development of a 50K single nucleotide polymorphism genotyping array for Scots pine (Pinus sylvestris L.) and its transferability to other pine species. The Plant Journal, 117(3), 944-955
2024 (English). In: The Plant Journal, ISSN 0960-7412, E-ISSN 1365-313X, Vol. 117, no 3, p. 944-955. Article in journal (Refereed). Published
Abstract [en]

Scots pine (Pinus sylvestris L.) is one of the most widespread and economically important conifer species in the world. Applications like genomic selection and association studies, which could help accelerate breeding cycles, are challenging in Scots pine because of its large and repetitive genome. For this reason, genotyping tools for conifer species, and in particular for Scots pine, are commonly based on transcribed regions of the genome. In this article, we present the Axiom Psyl50K array, the first single nucleotide polymorphism (SNP) genotyping array for Scots pine based on whole-genome resequencing, that represents both genic and intergenic regions. This array was designed following a two-step procedure: first, 192 trees were sequenced, and a 430K SNP screening array was constructed. Then, 480 samples, including haploid megagametophytes, full-sib family trios, breeding population, and range-wide individuals from across Eurasia were genotyped with the screening array. The best 50K SNPs were selected based on quality, replicability, distribution across the draft genome assembly, balance between genic and intergenic regions, and genotype–environment and genotype–phenotype associations. Of the final 49 877 probes tiled in the array, 20 372 (40.84%) occur inside gene models, while the rest lie in intergenic regions. We also show that the Psyl50K array can yield enough high-confidence SNPs for genetic studies in pine species from North America and Eurasia. This new genotyping tool will be a valuable resource for high-throughput fundamental and applied research of Scots pine and other pine species.

Keywords
genome resequencing, SNP array, Pinus sylvestris, pines, genomic selection, genome-wide association studies
National Category
Genetics and Genomics; Forest Science
Identifiers
urn:nbn:se:su:diva-225081 (URN); 10.1111/tpj.16535 (DOI); 001103179900001 (); 37947292 (PubMedID); 2-s2.0-85176314433 (Scopus ID)
Available from: 2024-01-08 Created: 2024-01-08 Last updated: 2025-02-01. Bibliographically approved
Alexeyenko, A., Nystedt, B., Vezzi, F., Sherwood, E., Ye, R., Knudsen, B., . . . Lundeberg, J. (2014). Efficient de novo assembly of large and complex genomes by massively parallel sequencing of Fosmid pools. BMC Genomics, 15, 439
2014 (English). In: BMC Genomics, E-ISSN 1471-2164, Vol. 15, p. 439. Article in journal (Refereed). Published
Abstract [en]

Background: Sampling genomes with Fosmid vectors and sequencing of pooled Fosmid libraries on the Illumina platform for massive parallel sequencing is a novel and promising approach to optimizing the trade-off between sequencing costs and assembly quality. Results: In order to sequence the genome of Norway spruce, which is of great size and complexity, we developed and applied a new technology based on the massive production, sequencing, and assembly of Fosmid pools (FP). The spruce chromosomes were sampled with approximately 40,000 bp Fosmid inserts to obtain around two-fold genome coverage, in parallel with traditional whole genome shotgun sequencing (WGS) of haploid and diploid genomes. Compared to the WGS results, the contiguity and quality of the FP assemblies were high, and they allowed us to fill WGS gaps resulting from repeats, low coverage, and allelic differences. The FP contig sets were further merged with WGS data using a novel software package GAM-NGS. Conclusions: By exploiting FP technology, the first published assembly of a conifer genome was sequenced entirely with massively parallel sequencing. Here we provide a comprehensive report on the different features of the approach and the optimization of the process. We have made public the input data (FASTQ format) for the set of pools used in this study: ftp://congenie.org/congenie/Nystedt_2013/Assembly/ProcessedData/FosmidPools/ (alternatively accessible via http://congenie.org/downloads). The software used for running the assembly process is available at http://research.scilifelab.se/andrej_alexeyenko/downloads/fpools/.

National Category
Bioinformatics and Computational Biology; Genetics and Genomics
Identifiers
urn:nbn:se:su:diva-106343 (URN); 10.1186/1471-2164-15-439 (DOI); 000338258700001 ()
Note

AuthorCount:11;

Available from: 2014-08-08 Created: 2014-08-04 Last updated: 2025-11-05. Bibliographically approved
Brodin, J., Mild, M., Hedskog, C., Sherwood, E., Leitner, T., Andersson, B. & Albert, J. (2013). PCR-Induced Transitions Are the Major Source of Error in Cleaned Ultra-Deep Pyrosequencing Data. PLOS ONE, 8(7), Article ID e70388.
2013 (English). In: PLOS ONE, E-ISSN 1932-6203, Vol. 8, no 7, article id e70388. Article in journal (Refereed). Published
Abstract [en]

Background: Ultra-deep pyrosequencing (UDPS) is used to identify rare sequence variants. The sequence depth is influenced by several factors including the error frequency of PCR and UDPS. This study investigated the characteristics and source of errors in raw and cleaned UDPS data. Results: UDPS of a 167-nucleotide fragment of the HIV-1 SG3Δenv plasmid was performed on the Roche/454 platform. The plasmid was diluted to one copy, PCR amplified and subjected to bidirectional UDPS on three occasions. The dataset consisted of 47,693 UDPS reads. Raw UDPS data had an average error frequency of 0.30% per nucleotide site. Most errors were insertions and deletions in homopolymeric regions. We used a cleaning strategy that removed almost all indel errors, but had little effect on substitution errors, which reduced the error frequency to 0.056% per nucleotide. In cleaned data the error frequency was similar in homopolymeric and non-homopolymeric regions, but varied considerably across sites. These site-specific error frequencies were moderately, but still significantly, correlated between runs (r = 0.15-0.65) and between forward and reverse sequencing directions within runs (r = 0.33-0.65). Furthermore, transition errors were 48 times more common than transversion errors (0.052% vs. 0.001%; p < 0.0001). Collectively the results indicate that a considerable proportion of the sequencing errors that remained after data cleaning were generated during the PCR that preceded UDPS. Conclusions: A majority of the sequencing errors that remained after data cleaning were introduced by PCR prior to sequencing, which means that they will be independent of platform used for next-generation sequencing. The transition vs. transversion error bias in cleaned UDPS data will influence the detection limits of rare mutations and sequence variants.

National Category
Biological Sciences
Identifiers
urn:nbn:se:su:diva-96125 (URN); 10.1371/journal.pone.0070388 (DOI); 000325211000207 ()
Note

AuthorCount:7;

Available from: 2013-11-13 Created: 2013-11-11 Last updated: 2022-03-23. Bibliographically approved
Nystedt, B., Sherwood, E., Arvestad, L. & Jansson, S. (2013). The Norway spruce genome sequence and conifer genome evolution. Nature, 497(7451), 579-584
2013 (English). In: Nature, ISSN 0028-0836, E-ISSN 1476-4687, Vol. 497, no 7451, p. 579-584. Article in journal (Refereed). Published
Abstract [en]

Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long-terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding.

Keywords
Plant sciences
National Category
Biological Sciences
Identifiers
urn:nbn:se:su:diva-91836 (URN); 10.1038/nature12211 (DOI); 000319556100035 ()
Funder
Knut and Alice Wallenberg Foundation; Swedish Research Council; Swedish Research Council Formas; Swedish Foundation for Strategic Research; Science for Life Laboratory - a national resource center for high-throughput molecular bioscience
Available from: 2013-07-05 Created: 2013-07-04 Last updated: 2022-03-23. Bibliographically approved
Stranneheim, H., Werne, B., Sherwood, E. & Lundeberg, J. (2011). Scalable Transcriptome Preparation for Massive Parallel Sequencing. PLOS ONE, 6(7), e21910
2011 (English). In: PLOS ONE, E-ISSN 1932-6203, Vol. 6, no 7, p. e21910. Article in journal (Refereed). Published
Abstract [en]

Background: The tremendous output of massive parallel sequencing technologies requires automated robust and scalable sample preparation methods to fully exploit the new sequence capacity. Methodology: In this study, a method for automated library preparation of RNA prior to massively parallel sequencing is presented. The automated protocol uses precipitation onto carboxylic acid paramagnetic beads for purification and size selection of both RNA and DNA. The automated sample preparation was compared to the standard manual sample preparation. Conclusion/Significance: The automated procedure was used to generate libraries for gene expression profiling on the Illumina HiSeq 2000 platform with the capacity of 12 samples per preparation with a significantly improved throughput compared to the standard manual preparation. The data analysis shows consistent gene expression profiles in terms of sensitivity and quantification of gene expression between the two library preparation methods.

National Category
Natural Sciences
Identifiers
urn:nbn:se:su:diva-66578 (URN); 10.1371/journal.pone.0021910 (DOI); 000292655400026 ()
Note

AuthorCount:4;

Available from: 2011-12-21 Created: 2011-12-20 Last updated: 2022-02-24. Bibliographically approved