analyses.tar.gz (548.93 MB)

Genomic annotations and comparative analysis

Workflow posted on 2022-10-06, 07:56, authored by Karl Dyrhage and Christian Seeger

Scripts and additional data necessary for the analyses performed in the paper "Genome Evolution of a Symbiont Population for Pathogen Defence in Honeybees". Information about specific analyses can be found in the corresponding README file, when applicable.


Directories are named after the analysis in the paper:

fig02_phylogeny

Contains a pipeline written in Snakemake that takes a set of genome annotations in GenBank format, groups all protein sequences with OrthoMCL, filters out genes predicted to be recombinant by PhiPack, creates a trimmed concatenation of the remaining single-copy panorthologs, and reconstructs a phylogeny using IQ-TREE. The concatenated protein sequences used in the paper are included (singlecopypanorthologs.fasta).
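The selection of single-copy panorthologs can be illustrated with a minimal Python sketch. It is not taken from the pipeline in this directory: the function names are hypothetical, and the input is assumed to follow the common OrthoMCL group layout of one group per line, written as "group_id: taxon|gene taxon|gene ...". A group is kept only if every genome contributes exactly one gene.

```python
from collections import Counter


def parse_groups(lines):
    """Parse lines of the assumed form 'group_id: taxon|gene taxon|gene ...'."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        # Each member is split into a [taxon, gene] pair.
        groups[group_id.strip()] = [m.split("|", 1) for m in members.split()]
    return groups


def single_copy_panorthologs(groups, genomes):
    """Return IDs of groups with exactly one gene in every listed genome."""
    genomes = set(genomes)
    selected = []
    for group_id, members in groups.items():
        counts = Counter(taxon for taxon, _gene in members)
        if set(counts) == genomes and all(n == 1 for n in counts.values()):
            selected.append(group_id)
    return selected


# Toy input (made up for illustration): three genomes A, B and C.
example = [
    "OG1: A|a1 B|b1 C|c1",        # one gene per genome: kept
    "OG2: A|a2 A|a3 B|b2 C|c2",   # paralogs in A: dropped
    "OG3: A|a4 B|b3",             # missing from C: dropped
]
groups = parse_groups(example)
selected = single_copy_panorthologs(groups, ["A", "B", "C"])
print(selected)
```

In the pipeline described above, the recombination filter and the alignment trimming would come after a step like this one; neither is sketched here.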

fig03_pangenome

Contains a script for analysing the species pangenome, using the OrthoMCL clustering created in fig02_phylogeny.
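As a rough illustration of what a pangenome summary based on an orthogroup clustering can look like, the sketch below counts core, accessory and unique orthogroups. The function name, the category definitions and the toy data are assumptions made for this example and are not taken from the script in this directory.

```python
def classify_pangenome(group_to_genomes, n_genomes):
    """Count orthogroups found in all genomes (core), in exactly one
    genome (unique), or in some but not all genomes (accessory)."""
    summary = {"core": 0, "accessory": 0, "unique": 0}
    for genomes in group_to_genomes.values():
        n_present = len(set(genomes))
        if n_present == n_genomes:
            summary["core"] += 1
        elif n_present == 1:
            summary["unique"] += 1
        else:
            summary["accessory"] += 1
    return summary


# Toy data (made up): which genomes carry at least one gene of each group.
toy_groups = {
    "OG1": {"A", "B", "C"},
    "OG2": {"A", "B"},
    "OG3": {"C"},
}
summary = classify_pangenome(toy_groups, n_genomes=3)
print(summary)
```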


fig04_Sfig03_transposons

Contains a script for plotting the location of transposons within A. kunkeei genomes.

fig05_ExEs

Contains scripts showing the workflow that was used to select plasmid assemblies in the study.

fig06_LPxTG

Contains a script for plotting the presence/absence patterns of genes containing cell surface-binding LPxTG motifs in A. kunkeei strains.

fig07_ExEs_growth

Contains a script for plotting the presence/absence of extrachromosomal elements in A. kunkeei strains.

phageplasmid_classification

Contains a script showing the workflow used to classify two phage-plasmids present in the A. kunkeei population, using data from Pfeifer et al. (2021), https://doi.org/10.1093/nar/gkab064.

prokka_annotations

Contains a script that runs Prokka twice, once with the standard databases and once using a manually curated A. kunkeei annotation, then combines the results.

tableS4_orthogroups

Contains a script that compiles the results of the analyses described above for each orthogroup (OrthoMCL results from fig02_phylogeny) and writes them to a table.

tableS5_ANI

Contains a script for calculating average nucleotide identities between A. kunkeei genomes.

Sfig02_growth

Contains a script for plotting growth curves for A. kunkeei strain H3B2-03M.

Sfig04_defence

Contains a script for plotting the location of a genomic defence island in A. kunkeei, which contains either a CRISPR-Cas system or a restriction-modification system.

table_16S

Contains a script for calculating 16S identities between A. kunkeei strains.
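For orientation, percent identity between two aligned sequences can be computed as in the sketch below. This is only an assumed approach: the script in this directory may use a different aligner, gap treatment or denominator. Here the sequences are assumed to be already aligned to equal length, and columns with a gap ("-") in either sequence are ignored.

```python
def pairwise_identity(seq_a, seq_b):
    """Percent identity over aligned columns without gaps in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (made up), not real 16S sequences.
identity_mismatch = pairwise_identity("ACGT", "ACGA")
identity_gap = pairwise_identity("AC-T", "ACGT")
print(identity_mismatch, identity_gap)
```

Whether gapped columns count as mismatches changes the result, so the convention should be checked against the actual script before comparing values.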

data

Contains results from EggNOG and Phaster, as well as some of the output from the analyses above that is used as input in other analyses: the annotations from prokka_annotations in GenBank format, and the OrthoMCL clustering and the phylogeny (in Nexus format) from fig02_phylogeny.

Publisher

Uppsala University
