Does genome size matter?: Comparative (meta)genomics to investigate the ecological meaning of genome size in aquatic prokaryotes
2024 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]
The Streamlining Theory proposes that the success of certain prokaryotic species in marine environments might be linked to their highly reduced genomes and minimal nutritional requirements. Still, the ecological implications of genome size variability remain understudied. In this thesis, I explore how genome size information can be used to understand different ecological patterns in prokaryotes.
The first part of my thesis explores patterns of genome size variability across ecosystems. In Chapter I, we investigate prokaryotic genome size across 17,834 species-clusters (ANI > 95%) retrieved from three major biomes (host-associated, terrestrial and aquatic), and 8,267 species-clusters obtained from laboratory-grown isolates. We found that host-associated and aquatic prokaryotes (averaging 3.0 Mbp and 3.1 Mbp, respectively) have smaller genomes than those retrieved from soils (averaging 3.7 Mbp) and laboratory-grown isolates (averaging 4.3 Mbp). Moreover, only a minority of the species-clusters had been previously grown in laboratory cultures (~3.03%). In Chapter II, we observed that differences in genome size also occur between and within aquatic environments: MAGs retrieved from the water column of brackish and marine environments (averaging 2.97 Mbp and 3.10 Mbp, respectively) have smaller genomes than those retrieved from pelagic freshwaters (averaging 3.48 Mbp). Differences in genome size are also observed between benthic and pelagic prokaryotes in the Baltic Sea (averaging 3.47 Mbp and 2.97 Mbp, respectively). Interestingly, we found that genome size in both brackish environments correlated negatively with the metabolic potential for numerous functions, the only exception being genes involved in the nitrogen cycle.
In the second part of my thesis work, we focused on freshwater prokaryotes to explore genome size variability. In Chapter III, we sampled and sequenced 17 new metagenomes collected from eight different freshwater lakes in the Stockholm region, with a particular focus on Lake Mälaren. This project compiles a total of 2,378 MAGs (completeness >30% and contamination <10%) grouped into 514 species-clusters from 19 different phyla. These data, together with other re-binned MAGs and publicly available genomes, were compiled to study the relation between genome size, prevalence and relative abundance in Chapter IV. In this project, we included 80,561 genomes of medium-to-high quality (>50% completeness and contamination <5%) retrieved from ~590 publicly available BioProjects and research articles, which were clustered into 24,050 species-clusters. After competitive mapping against a dataset of 636 globally distributed freshwater metagenomes, we detected 9,028 species-clusters in at least one metagenomic sample. Our results show that the estimated genome size correlates negatively with prevalence and average relative abundance, indicating that prokaryotes with reduced genomes are present in a larger number of metagenomes and at a higher relative abundance. Furthermore, we found that species-clusters with reduced genomes have a higher tendency to co-occur with other prokaryotes, probably owing to strong metabolic dependencies. Lastly, in Chapter V, we selected the genus Rhodoferax (phylum Pseudomonadota) as a case study to investigate the ecological implications of intragenus genome size variability. After subsetting the results from the previous chapter, we compiled 345 high-quality genomes (>90% completeness and contamination <5%) classified as Rhodoferax. These genomes clustered into 96 species-clusters, of which 80 were detected in at least one freshwater metagenome. We found that intragenus genome size ranged from 2.41 Mbp to 6.92 Mbp, and that its variability was linked to the number and length of genomic islands, the metabolic potential, and the depth stratification of lakes.
The projects presented combine newly generated metagenomic data with the re-use of publicly archived data to provide new insights into the ecological implications of prokaryotic genome size variability. We used comparative (meta)genomics to analyze and compare genomes from numerous species-clusters and from various environments, with a specific focus on aquatic environments.
Place, publisher, year, edition, pages
Stockholm: Department of Ecology, Environment and Plant Sciences, Stockholm University, 2024, p. 108
Keywords [en]
genome size, metagenomics, bacteria, archaea, microbial genomics, microbiome, freshwaters, Baltic Sea, Rhodoferax
National Category
Microbiology; Ecology; Bioinformatics and Systems Biology; Genetics
Research subject
Ecology and Evolution
Identifiers
URN: urn:nbn:se:su:diva-234929
ISBN: 978-91-8014-995-2 (print)
ISBN: 978-91-8014-996-9 (electronic)
OAI: oai:DiVA.org:su-234929
DiVA, id: diva2:1908567
Public defence
2024-12-13, Vivi Täckholmsalen (Q-salen), NPQ-huset, Svante Arrhenius väg 20 and online via Zoom, public link will be available a week before the event, Stockholm, 09:00 (English)
Opponent
Supervisors
2024-11-20 2024-10-28 2024-11-11 Bibliographically approved
List of papers