SciLifeLab

MRSA case study example data

educational resource
posted on 2023-03-10, 12:39, authored by John Sundh
<p>### Dataset description<br> <br> This dataset contains fastq files for three Illumina HiSeq runs of an<br> RNA-seq analysis (see Osmundson J et al., PLoS One, 2013;8(10):e76572).<br> <br> The data are used as a case study for the Tools in Reproducible Research<br> course. We previously used `fastq-dump` from the `sra-tools` package<br> to download a subsampled set of sequences from the Sequence Read Archive<br> (SRA). However, `sra-tools` has recently become very unreliable because of a<br> certificate/security issue when downloading from the National Center for<br> Biotechnology Information (NCBI). We have therefore created this dataset to<br> use as an alternative starting point for the course case study.<br> <br> All three files were generated on the Rackham compute cluster by installing<br> `sra-tools` (v.3.0.3) from the bioconda channel:<br> <br> ```<br> mamba create -n sra-tools -c bioconda sra-tools<br> conda activate sra-tools<br> ```<br> <br> then running:<br> <br> ```<br> fastq-dump SRR935090 -X 100000 --gzip -Z > SRR935090.fastq.gz<br> fastq-dump SRR935091 -X 100000 --gzip -Z > SRR935091.fastq.gz<br> fastq-dump SRR935092 -X 100000 --gzip -Z > SRR935092.fastq.gz<br> ```<br> <br> Thus, each file contains a subset of 100,000 reads per sample, taken<br> from the original data in the SRA. The original runs contain<br> between 76.3 and 176.6 million reads each. The idea is to let students download<br> these subsampled files directly or as part of the bioinformatic workflows taught<br> during the course.</p>
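
As a quick sanity check after downloading, the read count of each file can be verified against the expected 100,000. The following is a minimal sketch, not part of the original dataset description: the helper name `count_reads` is hypothetical, and it assumes the three files sit in the current directory and that every FASTQ record spans exactly four lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count reads in a gzip-compressed FASTQ file.
# Assumes each record occupies exactly four lines (header, sequence, '+', qualities).
count_reads() {
    gzip -cd "$1" | awk 'END { print NR / 4 }'
}

# Report the read count for each subsampled file found in the current directory.
for sample in SRR935090 SRR935091 SRR935092; do
    file="${sample}.fastq.gz"
    if [ -f "$file" ]; then
        echo "${sample}: $(count_reads "$file") reads"
    else
        echo "${sample}: ${file} not found, skipping" >&2
    fi
done
```

If the files are intact, each should report 100000 reads; a lower or fractional number would suggest a truncated or corrupted download.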

Funding

National Bioinformatics Infrastructure Sweden (NBIS)

Swedish Research Council

Publisher

National Bioinformatics Infrastructure Sweden (NBIS)
