If you use this data, please cite:
Kutsenko, A., Svensson, T., Nystedt, B. et al. The Chironomus tentans genome sequence and the organization of the Balbiani ring genes. BMC Genomics 15, 819 (2014). https://doi.org/10.1186/1471-2164-15-819

The dipteran Chironomus tentans (C. tentans) and its Balbiani ring (BR) genes serve as a model system for eukaryotic gene expression studies. Kutsenko et al. (2014) report the first draft genome of C. tentans, characterizing its gene expression machinery and the genomic architecture of its BR genes.

In brief, genomic DNA was extracted and sequenced, resulting in an assembly size of 213 Mb, which is likely an overestimate due to allelic variants. The estimated genome size is around 200 Mb, and both the GC content (31%) and the repeat fraction (15%) are low compared to other dipterans. Phylogenetic analysis places C. tentans as a sister clade to mosquitoes, with the two lineages diverging 150-250 million years ago. The assembled genome is relatively fragmented (scaffold NG50 = 65 Kbp), but was still found to be reasonably complete with regard to gene content, with 97% of 248 highly conserved core eukaryotic genes represented.
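
For readers unfamiliar with the NG50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: scaffolds are sorted from longest to shortest, and NG50 is the length of the scaffold at which the running total first reaches half of the estimated genome size (N50 uses half of the assembly size instead). The function name and the toy scaffold lengths are illustrative assumptions and are not taken from the published analysis.

```python
def ng50(scaffold_lengths, genome_size):
    """Return the NG50 of an assembly, or None if the scaffolds
    sum to less than half of the estimated genome size."""
    target = genome_size / 2
    running_total = 0
    # Walk through scaffolds from longest to shortest.
    for length in sorted(scaffold_lengths, reverse=True):
        running_total += length
        if running_total >= target:
            return length
    return None


# Toy example with made-up lengths (not C. tentans data).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]   # assembly size: 300
print(ng50(toy_lengths, genome_size=400))    # half of 400 is reached at the 50-long scaffold
print(ng50(toy_lengths, genome_size=300))    # using the assembly size gives N50 instead
```

Because NG50 is normalized by the estimated genome size rather than the assembly size, it is less affected by an inflated assembly, such as one that contains separately assembled allelic variants.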

For transcriptome sequencing and genome annotation, poly(A)+ RNA was extracted from various tissues and developmental stages. These data were used as evidence for ab initio predictions of gene models and alternative splice variants, resulting in a draft annotation of 15,120 predicted genes.

The C. tentans draft genome assembly can be downloaded here or from NCBI:

GenBank accession number: CBTT000000000.1

https://www.ncbi.nlm.nih.gov/assembly/GCA_000786525.1/

The draft genome annotation and the corresponding longest predicted protein for each gene locus are provided here for download. Note that these preliminary annotations are provided as is; incomplete, missing, or incorrect gene models are to be expected to some extent.
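
As an illustration of what "longest predicted protein for each gene locus" means, the sketch below keeps one protein per gene from a list of predicted isoforms. It is only an assumption about the general approach, not the pipeline used to produce the files; the record layout (gene ID, protein ID, sequence) and all identifiers are hypothetical.

```python
def longest_protein_per_gene(records):
    """Keep the longest protein for each gene.

    `records` is an iterable of (gene_id, protein_id, sequence) tuples
    (a hypothetical layout). On ties, the first protein seen is kept.
    """
    best = {}
    for gene_id, protein_id, sequence in records:
        current = best.get(gene_id)
        if current is None or len(sequence) > len(current[1]):
            best[gene_id] = (protein_id, sequence)
    return best


# Made-up identifiers and sequences for demonstration only.
example = [
    ("geneA", "geneA.t1", "MKT"),
    ("geneA", "geneA.t2", "MKTLLV"),
    ("geneB", "geneB.t1", "MSS"),
]
for gene, (protein, seq) in longest_protein_per_gene(example).items():
    print(gene, protein, len(seq))
```

Selecting a single representative per locus avoids counting alternative splice variants as separate genes in downstream comparisons.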

Acknowledgements

We acknowledge the Science for Life Laboratory and the National Genomics Infrastructure (NGI) for sequencing service. Computations were mainly performed on resources provided by SNIC through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX). Microscopy was performed at IFSU, Stockholm University. Ann-Charlotte Sonnhammer at BILS is acknowledged for assistance concerning the initial bioinformatics analysis. We thank Magnus Bjursell for initial support in the project. This work was financed by grants from The Knut and Alice Wallenberg Foundation through The Center for Metagenomic Sequence analysis (CMS), The Granholm’s Foundation, The Carl Trygger’s Foundation and The Swedish Research Council (VR).