<p dir="ltr">This item contains a gzipped archive with ~13,000 orthogroups used to study molecular evolution in this project.</p><p dir="ltr"><b>Archive:</b></p><p dir="ltr">krill.orthogroups.tar.gz</p><p dir="ltr"><b>Contents of archive:</b></p><ol><li><b>krill.proteinortho.tsv</b> - the primary output table from Proteinortho. It describes which protein sequences from which species belong to the same orthogroup. The format follows the program's standard output.</li><li><b>krill.proteinortho.tsv.seqs.csv</b> - a processed table that also contains the sequences themselves, one per line (see below).</li><li>The <b>alignments</b> directory, which contains unaligned and aligned FASTA files for all orthogroups (OGs) (see below).</li></ol><p dir="ltr"><b>Format of the krill.proteinortho.tsv.seqs.csv table</b></p><p dir="ltr">Each line describes one sequence. The fields are:</p><ol><li>NR = orthogroup number</li><li>ORTHO_GROUP = orthogroup ID</li><li>N_SPECIES = number of species in this orthogroup</li><li>N_GENES = number of genes/sequences in this orthogroup</li><li>N_MATCHING[o] = number of sequences matching outgroup species for this orthogroup</li><li>N_NON_MATCHING = number of sequences matching ingroup species for this orthogroup</li><li>HEADER = the name of this particular sequence</li><li>SEQ = the protein sequence</li></ol><p dir="ltr"><b>Contents of the alignments directory</b></p><p dir="ltr">Each orthogroup is represented by up to four FASTA files:</p><ol><li>OG*.cds.ginsi.fasta.orig = the original, unaligned and unfiltered sequences</li><li>OG*.cds.ginsi.fasta = the aligned and filtered sequences</li><li>OG*.cds.ginsi.fasta.without_cold_euphausia.fasta = the aligned and filtered sequences after removing cold-associated <i>Euphausia</i> species</li><li>OG*.cds.ginsi.fasta.without_cold_thysanoessa.fasta = the aligned and filtered sequences after removing cold-associated <i>Thysanoessa</i> species</li></ol>
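<p dir="ltr">The file naming scheme of the alignments directory can be used to collect the files that belong to each orthogroup. The following is a minimal Python sketch, not part of the archive. The function names, the variant labels and the example ID <code>OG0001</code> are hypothetical; only the four file name suffixes are taken from the list above.</p>

```python
from collections import defaultdict
from pathlib import Path

# File name suffixes as listed in the description of the alignments
# directory. The labels on the right are hypothetical names chosen here.
SUFFIXES = [
    (".cds.ginsi.fasta.without_cold_euphausia.fasta", "aligned_without_cold_euphausia"),
    (".cds.ginsi.fasta.without_cold_thysanoessa.fasta", "aligned_without_cold_thysanoessa"),
    (".cds.ginsi.fasta.orig", "unaligned"),
    (".cds.ginsi.fasta", "aligned"),
]


def classify(filename):
    """Return (orthogroup ID, variant label) for a file name, or None."""
    if not filename.startswith("OG"):
        return None
    for suffix, label in SUFFIXES:
        if filename.endswith(suffix):
            return filename[: -len(suffix)], label
    return None


def group_alignments(directory):
    """Map each orthogroup ID to a dict of {variant label: file path}."""
    groups = defaultdict(dict)
    for path in sorted(Path(directory).iterdir()):
        hit = classify(path.name)
        if hit is not None:
            og_id, label = hit
            groups[og_id][label] = path
    return dict(groups)
```

<p dir="ltr">Calling <code>group_alignments("alignments")</code> on the extracted directory would return one entry per orthogroup. Because each orthogroup has up to four files, some entries may lack one or more of the variants.</p>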
Funding
Local adaptation and genome evolution in crustacean zooplankton: how does size matter?