FRESH-MAP dataset: study on the ecological success of streamlined aquatic microorganisms
We release the FRESH-MAP dataset, a compilation of 9028 prokaryotic species-clusters (ANI >95%) detected in a set of 636 globally distributed freshwater metagenomes after competitive mapping. The main goal of our study was to provide the first systematic evaluation of the ‘Black Queen Hypothesis’ on a global scale based on aquatic metagenomic datasets. In this repository, we provide the supplementary tables and the full set of genomes. We also include all 9028 representative genomes (the FRESH-MAP dataset) as a zip file (FreshMap_dataset.zip). The 12 supplementary tables contain the following information:
- Table S1: Genomic statistics of all 80561 medium-to-high quality genomes (>50% completeness and <5% contamination) used in the study. We include the publication reference, completeness, contamination, GTDB-Tk taxonomy, type of genome, environment of origin, etc.
- Table S2: Genomic information of all 24050 species-clusters (ANI >95%), including the best representative genome. We include the origin of each species-cluster (freshwater, non-freshwater or mixed) and the type of genome (omics, isolate or both). The genome statistics included here refer to the best representative genome of each species-cluster.
- Table S3: List of all 636 freshwater metagenomes used for competitive mapping. We include accession numbers, publication references and metadata.
- Table S4: Mapping statistics after trimming. We include the total number of reads, number of mapped reads, average length of the reads (bp), and GC content (%).
- Table S5: Relative abundance results for each of the 24050 species-clusters (ANI >95%) across the freshwater metagenomes.
- Table S6: List of the 1202 representative genomes that are part of the co-occurrence network, including information on the degree of connectedness and cohort.
- Table S7: Recruitment of species-clusters per phylum across the different cohorts. Yellow indicates phyla linked to a specific cohort.
- Table S8: Completeness (ranging from 0 to 1) of all KEGG modules involved in the biosynthesis of amino acids, nucleotides and vitamins for each of the 9028 species-clusters (ANI >95%) detected in at least one freshwater metagenome.
- Table S9: Number of copies of each KEGG KO associated with flagella, sigma factors, two-component systems, carbon fixation, the nitrogen cycle, and the sulfur cycle for each of the 9028 species-clusters (ANI >95%) detected in at least one freshwater metagenome.
- Table S10: Metadata of the two newly sequenced metagenomic samples from Stadsträdgården, Uppsala (Sweden), including accession numbers.
- Table S11: Metagenomic samples used for re-binning from the StratfreshDB (Buck et al., 2021), including ENA accession numbers.
- Table S12: Genomic statistics, including completeness and contamination, of the re-binned (n = 11146) and original (n = 7837) MAGs.
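
As an illustration of the quality criteria stated for Table S1 (>50% completeness and <5% contamination), the following minimal Python sketch filters a tab-delimited table by those two thresholds. The column names (`genome`, `completeness`, `contamination`), the tab delimiter, and the toy rows are assumptions made for this example only; they are not taken from the actual tables and should be adapted to the real headers.

```python
import csv
import io

MIN_COMPLETENESS = 50.0  # percent, threshold quoted in the Table S1 description
MAX_CONTAMINATION = 5.0  # percent, threshold quoted in the Table S1 description


def passes_quality(completeness, contamination):
    """Return True for genomes with >50% completeness and <5% contamination."""
    return completeness > MIN_COMPLETENESS and contamination < MAX_CONTAMINATION


def filter_genomes(handle, delimiter="\t"):
    """Yield rows of a delimited table that pass both quality thresholds.

    The column names 'completeness' and 'contamination' are hypothetical;
    adjust them to the actual headers of the table being read.
    """
    for row in csv.DictReader(handle, delimiter=delimiter):
        if passes_quality(float(row["completeness"]), float(row["contamination"])):
            yield row


# Toy input standing in for an exported table (all values are invented).
example = io.StringIO(
    "genome\tcompleteness\tcontamination\n"
    "toy_A\t97.3\t0.8\n"
    "toy_B\t48.0\t1.2\n"
    "toy_C\t75.5\t6.1\n"
)
kept = [row["genome"] for row in filter_genomes(example)]
print(kept)
```

With real data, the `io.StringIO` object would be replaced by an open file handle for a tab-delimited export of the table; both thresholds are strict inequalities, matching the wording of the Table S1 description.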