
Workflow for detection of microbial contamination coordinates in eukaryotic reference genomes

Version 3 2025-03-17, 15:56
Version 2 2025-03-11, 08:51
Version 1 2025-03-03, 12:41
workflow
posted on 2025-03-17, 15:56 authored by Nikolay Oskolkov, Samantha Lopez Clinton, Chenyu Jin, Tom van der Valk

This is a computational workflow for detecting the coordinates of microbial-like sequences in eukaryotic reference genomes. The workflow accepts a reference genome in FASTA format and outputs the coordinates of microbial-like regions in BED format. It builds a Bowtie2 (https://bowtie-bio.sourceforge.net/bowtie2/index.shtml) index of the eukaryotic reference genome and aligns pre-computed microbial GTDB v.214 (https://gtdb.ecogenomic.org/) pseudo-reads to the reference. Custom scripts then detect the positions of the covered regions and quantify the most abundant microbial contaminants.
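The workflow's custom scripts are not reproduced on this page. As a minimal sketch of what the final step could look like, the Python below assumes the pseudo-read alignments have already been reduced to (contig, start, end, organism) tuples with 0-based, half-open coordinates, as used in BED. The function names merge_covered_regions, to_bed and top_contaminants, the max_gap parameter and the example records are all hypothetical and serve only as illustration.

```python
from collections import Counter, defaultdict


def merge_covered_regions(alignments, max_gap=0):
    """Merge aligned intervals per contig into non-overlapping covered regions.

    `alignments` holds (contig, start, end, organism) tuples; this tuple layout
    and the `max_gap` parameter are assumptions made for this sketch.
    """
    by_contig = defaultdict(list)
    for contig, start, end, _organism in alignments:
        by_contig[contig].append((start, end))

    regions = []
    for contig in sorted(by_contig):
        merged = []
        for start, end in sorted(by_contig[contig]):
            # Extend the previous region when the intervals overlap or touch.
            if merged and start <= merged[-1][1] + max_gap:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions.extend((contig, s, e) for s, e in merged)
    return regions


def to_bed(regions):
    """Format regions as tab-separated BED lines (contig, start, end)."""
    return "\n".join(f"{contig}\t{start}\t{end}" for contig, start, end in regions)


def top_contaminants(alignments, n=10):
    """Count aligned pseudo-reads per organism and return the n most frequent."""
    return Counter(organism for *_coords, organism in alignments).most_common(n)


# Hypothetical alignment records, for illustration only.
example = [
    ("chr1", 100, 160, "organism_A"),
    ("chr1", 150, 210, "organism_A"),
    ("chr1", 500, 560, "organism_B"),
    ("chr2", 10, 70, "organism_A"),
]

regions = merge_covered_regions(example)
bed_text = to_bed(regions)
top = top_contaminants(example, n=2)
print(bed_text)
print(top)
```

Merging overlapping alignments before writing BED keeps each contaminated stretch as a single interval, which is usually more convenient for downstream masking than one line per pseudo-read.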

The workflow was developed by Nikolay Oskolkov, Lund University, Sweden, within the NBIS SciLifeLab long-term support project, PI Tom van der Valk, Centre for Palaeogenetics, Stockholm, Sweden.

If you use the workflow for your research, please cite our manuscript:

Nikolay Oskolkov, Chenyu Jin, Samantha López Clinton, Flore Wijnands, Ernst Johnson, Benjamin Guinet, Verena Kutschera, Cormac Kinsella, Peter D. Heintzman and Tom van der Valk, Disinfecting eukaryotic reference genomes to improve taxonomic inference from environmental ancient metagenomic data, manuscript in preparation

Questions regarding the workflow should be sent to nikolay.oskolkov@scilifelab.se

Funding

National Bioinformatics Infrastructure Sweden (NBIS)

Swedish Research Council



Publisher

Stockholm University

SciLifeLab acknowledgement

  • Bioinformatics platform (NBIS)
