SciLifeLab
CO1_asv_counts_SE.tsv.gz (43.73 MB)
CO1_asv_counts_MG.tsv.gz (25.1 MB)
CO1_asv_seqs_SE.fasta.gz (89.45 MB)
CO1_asv_seqs_MG.fasta.gz (82.73 MB)
samples_metadata_malaise_SE.tsv (620.08 kB)
samples_metadata_malaise_MG.tsv (304.31 kB)
samples_metadata_soil_litter_SE.tsv (32.81 kB)
samples_metadata_litter_MG.tsv (15.89 kB)
sites_metadata_SE.tsv (22.66 kB)
sites_metadata_MG.tsv (5.21 kB)
biological_spikes_taxonomy_MG.tsv (0.8 kB)
biomass_count_IBA.tsv (0.62 kB)
biomass_count_SIIP.tsv (59.71 kB)
soil_chemistry_SE.tsv (22.6 kB)
soil_chemistry_MG.tsv (40.43 kB)
stand_characteristics_MG.tsv (139.23 kB)
CO1_sequencing_metadata_MG.tsv (477.89 kB)
CO1_sequencing_metadata_SE.tsv (1.26 MB)
synthetic_spikes_info.tsv (1.02 kB)
biological_spikes_taxonomy_SE.tsv (0.92 kB)
23 files

Amplicon sequence variants from the Insect Biome Atlas project

Version 6 2024-11-25, 19:00
Version 5 2024-10-24, 20:45
Version 4 2024-10-21, 14:44
Version 3 2024-10-15, 08:31
Version 2 2024-05-17, 07:34
Version 1 2024-05-06, 08:32
dataset
posted on 2024-11-25, 19:00 authored by Andreia Miraldo, Elzbieta Iwaszkiewicz-Eggebrecht, John Sundh, Lokeshwaran Manoharan, Emma Granqvist, Anders Andersson, Piotr Łukasik, Tomas Roslin, Ayco J. M. Tack, Fredrik Ronquist

General information

The Insect Biome Atlas project was supported by the Knut and Alice Wallenberg Foundation (dnr 2017.0088). The project analyzed the insect faunas of Sweden and Madagascar, and their associated microbiomes, mainly using DNA metabarcoding of Malaise trap samples collected in 2019 (Sweden) or 2019–2020 (Madagascar).

Please cite this version of the dataset as: Miraldo A, Iwaszkiewicz-Eggebrecht E, Sundh J, Lokeshwaran M, Granqvist E, Andersson AF, Lukasik P, Roslin T, Tack A, Ronquist F. 2024. Dataset of amplicon sequence variants (ASVs) from the Insect Biome Atlas Project, version 5. https://doi.org/10.17044/scilifelab.25480681

Dataset description

This dataset contains amplicon sequence variants (ASVs) generated from high-throughput sequencing of the cytochrome c oxidase subunit I (COI) gene from Malaise trap samples (lysates, homogenates and preservative ethanol) and soil and litter samples. It includes ASV sequences and abundance information (number of reads) as well as metadata files that are needed to interpret and analyse the data further. Future versions of the dataset will include additional data. NB! All ASV files include ASVs that represent biological and synthetic spike-ins.

Methods

Samples were sequenced using Illumina technology. Raw data are available at the European Nucleotide Archive (ENA) under project PRJEB61109. The raw sequence data were preprocessed using a Snakemake workflow. The preprocessed reads were then used as input to the AmpliSeq Nextflow pipeline (v.2.1.0) to generate ASVs.

Available data

Two types of files are provided: ASV files and metadata files. Files marked with 'SE' and 'MG' contain data from Sweden and Madagascar, respectively.

The file shasum.txt contains checksums for each of the files.
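
After downloading, the checksums can be used to confirm that the files arrived intact. The following is a minimal sketch, not part of the dataset itself. It assumes that shasum.txt uses the common `<hex digest>  <filename>` layout produced by tools such as shasum, and it guesses the hash algorithm from the digest length; both points are assumptions that should be checked against the actual file. The function names are illustrative.

```python
import hashlib
from pathlib import Path

# Assumption: the algorithm can be inferred from the hex digest length.
_ALGO_BY_HEX_LENGTH = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not loaded into memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_checksums(checksum_file, data_dir="."):
    """Return {filename: True/False/None}.

    None means the file is missing locally or the digest length
    is not one of the recognised lengths.
    """
    results = {}
    for line in Path(checksum_file).read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected = parts[0].lower()
        name = parts[1].strip().lstrip("*")  # '*' marks binary mode in some tools
        algorithm = _ALGO_BY_HEX_LENGTH.get(len(expected))
        target = Path(data_dir) / name
        if algorithm is None or not target.is_file():
            results[name] = None
            continue
        results[name] = file_digest(target, algorithm) == expected
    return results
```

Calling `verify_checksums("shasum.txt", "path/to/downloads")` returns one entry per listed file; any `False` value indicates a file that should be downloaded again.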

ASV files

ASV sequences in FASTA format are found in the files CO1_asv_seqs_SE.fasta.gz and CO1_asv_seqs_MG.fasta.gz. Counts of ASVs in each sample are in CO1_asv_counts_SE.tsv.gz and CO1_asv_counts_MG.tsv.gz. The Swedish dataset contains 821,559 ASVs in 6,169 samples. The Madagascar dataset contains 701,769 ASVs in 2,286 samples.
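
Because the ASV files are large, it can help to stream them rather than load them whole. The sketch below uses only the Python standard library. The table layout it relies on (a header row, ASV identifiers in the first column, one integer read-count column per sample) is an assumption that should be checked against README.txt, and the function names are illustrative.

```python
import csv
import gzip


def count_fasta_records(path):
    """Count sequences in a gzipped FASTA file by counting header lines."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def total_reads_per_asv(path):
    """Stream a gzipped counts table and yield (asv_id, total_reads).

    Assumed layout: header row, ASV id in the first column, and one
    integer read-count column per sample. Empty cells are ignored.
    """
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row with sample names
        for row in reader:
            if not row:
                continue
            yield row[0], sum(int(value) for value in row[1:] if value)
```

For example, the result of `count_fasta_records("CO1_asv_seqs_SE.fasta.gz")` can be compared with the number of ASVs stated above for the Swedish dataset.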

Metadata files

Four types of metadata files are included:

  1. sequencing_metadata files with information about samples that were processed in the lab and sequenced.
  2. samples_metadata files with information about samples that were collected in the field.
  3. sites_metadata files with information about sites where samples were collected.
  4. spike-ins metadata files with information about spike-ins added to each Malaise trap sample at the time of sample processing in the lab.

Sequencing metadata files

The two sequencing metadata files CO1_sequencing_metadata_SE.tsv and CO1_sequencing_metadata_MG.tsv contain information about samples that were sequenced. For details on the columns of these files, see the README.txt file.

Samples metadata files

Four samples_metadata files are included in this dataset, with information about each sample that was collected in the field. For samples collected with Malaise traps there are two files, one for each country: samples_metadata_malaise_SE.tsv and samples_metadata_malaise_MG.tsv. See the README.txt file for details about the columns of these files.

For arthropod samples collected from litter and soil we have two files, one for each country: samples_metadata_soil_litter_SE.tsv and samples_metadata_litter_MG.tsv. Note that for Madagascar we did not collect arthropod samples from soil. Also note that for Madagascar we collected four leaf litter samples at each trap location, one sample in each direction of the Malaise trap (front, back, left and right); whilst for Sweden we collected only one sample at each trap location. For details on the columns of these files, see the README.txt file.

Sites metadata files

There are two files that contain information about sampling sites, one for each country: sites_metadata_SE.tsv and sites_metadata_MG.tsv. See the README.txt file for more information.

Spike-ins metadata files

We provide three files with information about spike-ins used when processing samples in the lab: biological_spikes_taxonomy_SE.tsv and biological_spikes_taxonomy_MG.tsv contain taxonomic information on biological spike-ins, while synthetic_spikes_info.tsv has information on synthetic spike-ins. See README.txt for more information.

Other complementary data files

We present complementary data on soil chemistry collected at each sampling location in both Sweden and Madagascar, stand characteristics collected at each sampling location in Madagascar, and biomass/count data for a selected number of Malaise trap samples from the Insect Biome Atlas project (n=24) and the Swedish Insect Inventory Project (n=224).

Soil chemistry data

We provide two datasets on soil chemistry, one for each country (soil_chemistry_SE.tsv and soil_chemistry_MG.tsv), with information on soil nutrients from soil samples collected at the same sampling sites as the arthropod communities. In both Sweden and Madagascar, topsoil (0–20 cm) was sampled at five locations around each Malaise trap: one soil core (6 cm diameter) at the center of the trap and one soil core on each of the four “sides” of the trap, five meters away from it. The soil sample for each site is a composite of the cores from these five locations. Soil samples collected in Sweden were analysed at Eurofins in Sweden, and those collected in Madagascar were analysed at the Laboratoire des Radioisotopes in Madagascar. Because the samples from the two countries were analysed at different laboratories, the soil nutrient variables differ slightly between the two datasets. See the README.txt file for more information on the columns of each of these files.

Stand characteristics data

Stand characteristics were measured only in Madagascar, because extensive data on landscape composition and vegetation structure at the sampling sites in Sweden had already been compiled as part of the National Inventory of Landscapes in Sweden (NILS), and those data are publicly available.

The file stand_characteristics_MG.tsv contains a set of stand characteristics from Madagascar related to tree density (DBH, shading, etc.). Information about the columns in this file is found in the README.txt file.

Biomass and count data

To allow an assessment of how the biomass of a Malaise trap sample translates to the number of specimens, we provide two files describing samples from Sweden, for which we measured the biomass and also counted all the specimens in the sample. The first set comprises 24 samples from the IBA field campaign (biomass_count_IBA.tsv), and the second set comprises 224 samples from a separate Swedish Malaise trapping campaign (Swedish Insect Inventory Project) in 2018–2019 (biomass_count_SIIP.tsv). For the latter dataset, we provide the site and sample metadata in the same file. Details about columns in these files are found in the README.txt file.
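
One simple way to explore the biomass-to-count relationship is an ordinary least squares fit. The sketch below is only an illustration and not the authors' analysis. The column names are passed in by the caller because the actual names are documented in README.txt and are not assumed here; `read_pairs` and `linear_fit` are hypothetical helper names.

```python
import csv
import math


def read_pairs(path, x_column, y_column):
    """Read two numeric columns from a TSV file.

    Rows where either value is missing or non-numeric are skipped.
    """
    pairs = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            try:
                pairs.append((float(row[x_column]), float(row[y_column])))
            except (TypeError, ValueError):
                continue
    return pairs


def linear_fit(pairs):
    """Least squares fit y = intercept + slope * x.

    Returns (slope, intercept, pearson_r). Requires at least two
    points with non-constant x and y values.
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)
```

With the real column names substituted, `linear_fit(read_pairs("biomass_count_IBA.tsv", biomass_column, count_column))` gives a first impression of how strongly specimen counts track biomass; a log transformation of both variables may be more appropriate if the relationship is not linear.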

Funding

Insect Biome Atlas

Knut and Alice Wallenberg Foundation

National Bioinformatics Infrastructure Sweden (NBIS)

Swedish Research Council

Publisher

Naturhistoriska riksmuseet
