SciLifeLab

3. Comparative population transcriptomics in krill: orthogroups (FASTA, TSV files)

Dataset posted on 2023-10-19, 13:31, authored by Andreas Wallberg

This item contains a gzipped archive with ~13,000 orthogroups used to study molecular evolution in this project.

Archive:

krill.orthogroups.tar.gz

Contents of archive:

  1. krill.proteinortho.tsv - the primary output table from Proteinortho. It describes which protein sequences from which species belong to the same orthogroup (OG) and follows the program's standard output format.
  2. krill.proteinortho.tsv.seqs.csv - a processed table that also contains the actual sequences, one sequence per line (see below).
  3. the alignments directory, which contains every OG as unaligned and aligned FASTA files (see below).
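
Before unpacking, it can be useful to list what the archive holds. The following is a minimal Python sketch using only the standard library. The function name `list_archive_members` is hypothetical, and the demo builds a tiny in-memory archive with placeholder content, because the internal layout of krill.orthogroups.tar.gz (for example, whether files sit under a top-level folder) is not described here.

```python
import io
import tarfile


def list_archive_members(fileobj):
    """Return the names of all regular files in a gzipped tar archive."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]


# Demo on a small in-memory archive (placeholder content, not real data).
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    payload = b"placeholder\n"
    info = tarfile.TarInfo(name="krill.proteinortho.tsv")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
buffer.seek(0)

demo_members = list_archive_members(buffer)
print(demo_members)
```

For the downloaded file, the same function can be called on a handle opened with `open("krill.orthogroups.tar.gz", "rb")`.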

Format of the krill.proteinortho.tsv.seqs.csv table

The fields are:

  1. NR = orthogroup number
  2. ORTHO_GROUP = orthogroup ID
  3. N_SPECIES = the number of species
  4. N_GENES = the number of genes/sequences in this orthogroup
  5. N_MATCHING[o] = number of sequences matching outgroup species for this orthogroup
  6. N_NON_MATCHING = number of sequences matching ingroup species for this orthogroup
  7. HEADER = the name of this particular sequence
  8. SEQ = the protein sequence
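
The eight fields above can be read into per-orthogroup lists with a short Python sketch like the one below. Several points are assumptions rather than documented facts: the delimiter (tab is used as the default, since the table is derived from a TSV file, but a comma may be correct), the possible presence of a header row beginning with `NR`, and the function name `read_orthogroup_table`. The demo rows, including the IDs and sequences, are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Column order as listed in the field description above.
FIELDS = ["NR", "ORTHO_GROUP", "N_SPECIES", "N_GENES",
          "N_MATCHING", "N_NON_MATCHING", "HEADER", "SEQ"]


def read_orthogroup_table(handle, delimiter="\t"):
    """Group (HEADER, SEQ) pairs by ORTHO_GROUP.

    Assumes one sequence per row and eight columns in the documented order.
    The delimiter is an assumption; change it to match the actual file.
    """
    groups = defaultdict(list)
    for row in csv.reader(handle, delimiter=delimiter):
        # Skip blank lines, comment lines, and a header row if one exists.
        if not row or row[0] == "NR" or row[0].startswith("#"):
            continue
        if len(row) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(row)}")
        record = dict(zip(FIELDS, row))
        groups[record["ORTHO_GROUP"]].append((record["HEADER"], record["SEQ"]))
    return dict(groups)


# Invented example rows (not taken from the dataset).
demo = io.StringIO(
    "1\tOG0000001\t2\t2\t1\t1\tspeciesA_gene1\tMKV\n"
    "1\tOG0000001\t2\t2\t1\t1\tspeciesB_gene7\tMKI\n"
    "2\tOG0000002\t1\t1\t0\t1\tspeciesA_gene9\tMST\n"
)
demo_groups = read_orthogroup_table(demo)
print({og: len(seqs) for og, seqs in demo_groups.items()})
```

Columns are matched by position rather than by name, so the sketch does not depend on how the header row (if any) spells each field.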

Contents of the alignments directory

Each orthogroup is represented by up to four FASTA files:

  1. OG*.cds.ginsi.fasta.orig = the original, unaligned and unfiltered sequences
  2. OG*.cds.ginsi.fasta = the aligned and filtered sequences
  3. OG*.cds.ginsi.fasta.without_cold_euphausia.fasta = the aligned and filtered sequences after removing cold-associated Euphausia species
  4. OG*.cds.ginsi.fasta.without_cold_thysanoessa.fasta = the aligned and filtered sequences after removing cold-associated Thysanoessa species
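
Since each orthogroup has up to four files that differ only in their suffix, the file names can be sorted into the four categories above by suffix matching. The Python sketch below illustrates this. The labels (`unaligned`, `aligned`, and so on), the function names, and the example ID `OG0000001` are hypothetical; only the suffixes come from the list above.

```python
from collections import defaultdict

BASE = ".cds.ginsi.fasta"

# (suffix, label) pairs, following the four file types listed above.
SUFFIXES = [
    (BASE + ".without_cold_euphausia.fasta", "without_cold_euphausia"),
    (BASE + ".without_cold_thysanoessa.fasta", "without_cold_thysanoessa"),
    (BASE + ".orig", "unaligned"),
    (BASE, "aligned"),
]


def classify(filename):
    """Return (orthogroup_prefix, label), or None for unrecognized names."""
    for suffix, label in SUFFIXES:
        if filename.endswith(suffix):
            return filename[:-len(suffix)], label
    return None


def group_by_orthogroup(filenames):
    """Map each orthogroup prefix to a {label: filename} dictionary."""
    groups = defaultdict(dict)
    for name in filenames:
        hit = classify(name)
        if hit is not None:
            orthogroup, label = hit
            groups[orthogroup][label] = name
    return dict(groups)


# Invented file names that follow the documented pattern.
demo_files = [
    "OG0000001.cds.ginsi.fasta.orig",
    "OG0000001.cds.ginsi.fasta",
    "OG0000001.cds.ginsi.fasta.without_cold_euphausia.fasta",
    "README.txt",
]
demo_index = group_by_orthogroup(demo_files)
print(demo_index)
```

On an unpacked archive, the input list could come from `os.listdir()` on the alignments directory. Orthogroups with fewer than four files simply end up with fewer labels.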

Funding

Local adaptation and genome evolution in crustacean zooplankton: how does size matter?

Swedish Research Council

Publisher

Uppsala University

Access request email

andreas.wallberg@imbim.uu.se
