
Metagenomic dataset from Swedish urban lakes

posted on 2023-11-02, 12:04 authored by Sarahi Garcia, Justyna J. Hampel

We release metagenomic data from seven urban, eutrophic Swedish lakes that have been extensively studied and characterized in terms of biogeochemistry. Here we provide the supplementary tables and the full set of metagenome-assembled genomes from 17 metagenomic samples.

- We have 10 metagenomic samples from Lake Mälaren, which is the third largest lake in Sweden and serves as the main drinking water supply for ~1.5 million residents of Stockholm and neighboring cities. This dataset offers the first insight into the bacterial dynamics of Lake Mälaren along the summer seasonal gradient (early July to late August 2021).

- We sampled two additional lakes (Trehoringen and Langsjön) in Uppsala to compare summer microbial communities in lakes in Central Sweden.

- Additionally, we sequenced, assembled and binned microbial communities in five Swedish lakes in the urban Stockholm-Uppsala region from 2002 that were previously characterized using older techniques (Eiler and Bertilsson, 2004). These five highly eutrophic lakes have a long history of seasonal cyanobacterial blooms in the summer (Eiler and Bertilsson, 2004).

Our goal was to collect and sequence data that could be used to generate hypotheses and a preliminary view of the microbial communities of urban lakes around Stockholm, with special emphasis on Lake Mälaren, which provides drinking water to millions of people. We release 17 shotgun metagenomes in the European Nucleotide Archive (ENA), accessible under project number PRJEB54817. Moreover, in this repository we provide the 2378 MAGs, which include the 514 species representative genomes (SRGs; calculated <95% average nucleotide identity, ANI).
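As a convenience for readers who want to locate the raw reads, the following is a minimal sketch of how the runs under project PRJEB54817 could be listed through the public ENA Portal API file report. It is an illustration, not part of the dataset: the helper names (`filereport_url`, `parse_filereport`, `fetch_runs`) and the chosen report fields are assumptions, and the set of available fields should be checked against the ENA documentation.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# ENA Portal API file report endpoint.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession, fields=("run_accession", "sample_accession", "fastq_ftp")):
    """Build a file report URL for a project accession (hypothetical helper)."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",        # one row per sequencing run
        "fields": ",".join(fields),  # assumed field selection
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text):
    """Parse the tab-separated report into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def fetch_runs(accession):
    """Download and parse the report; requires network access."""
    with urlopen(filereport_url(accession)) as response:
        return parse_filereport(response.read().decode("utf-8"))
```

Calling `fetch_runs("PRJEB54817")` would return one dictionary per run, from which the FASTQ locations can be read off the `fastq_ftp` column.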

Finally, you can also find here the four supplementary tables, which contain the following information:

· Table S1. Sample metadata

· Table S2. Metagenomic assembly statistics (QUAST)

· Table S3. List of metagenome-assembled genomes (MAGs), CheckM statistics and GTDB taxonomic classification. Dereplicated species representative genomes are marked with an x.

· Table S4. Species representative genomes (SRGs) abundance in copies per million reads


Wenner-Gren UPD2020-0040

Wenner-Gren UPD2021-0051

SciLifeLab Fellowship

Albert and Maria Bergstrom Foundation



Stockholm University


    Science for Life Laboratory