This repository provides a comprehensive Nextstrain analysis of Enterovirus A71. You can choose to perform either a VP1 run (>=600 base pairs), a P1 run (>=2000 base pairs) or a whole genome run (>=6400 base pairs).
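As a rough illustration of the length thresholds above, the sketch below lists the runs a sequence of a given length would qualify for. The names MIN_LENGTHS and eligible_runs are hypothetical and not part of this repository; the pipeline applies its own filters. The check uses length alone and does not test whether the sequence actually covers the region.

```python
# Minimum sequence lengths (in base pairs) per run, as stated above.
# MIN_LENGTHS and eligible_runs are illustrative names, not repository code.
MIN_LENGTHS = {"vp1": 600, "P1": 2000, "whole_genome": 6400}


def eligible_runs(sequence_length):
    """Return the runs whose minimum length the sequence meets."""
    return [
        run
        for run, min_length in MIN_LENGTHS.items()
        if sequence_length >= min_length
    ]


if __name__ == "__main__":
    # Example lengths chosen only for illustration.
    for length in (550, 900, 2600, 7400):
        print(length, eligible_runs(length))
```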
For those unfamiliar with Nextstrain or needing installation guidance, please refer to the Nextstrain documentation.
This analysis would benefit from additional metadata, such as patient age, spatial data, and clinical outcomes. If you have relevant data and are willing to share, please contact us.
The data for this analysis is available from NCBI Virus. Instructions for downloading sequences are provided under Sequences.
This repository includes the following directories and files:
- scripts: Custom Python scripts called by the snakefile.
- snakefile: The entire computational pipeline, managed using Snakemake (see the Snakemake documentation).
- ingest: Contains Python scripts and the snakefile for automatic downloading of EV-A71 sequences and metadata.
- vp1: Sequences and configuration files for the VP1 run.
- P1: Sequences and configuration files for the P1 run.
- whole_genome: Sequences and configuration files for the whole genome run.
The config, vp1/config, and whole_genome/config directories contain necessary configuration files:
- colors.tsv: Color scheme
- geo_regions.tsv: Geographical locations
- lat_longs.tsv: Latitude and longitude data
- exclude.txt: Dropped strains
- include.txt: Included strains
- clades_genome.tsv: Virus clade assignments
- reference_sequence.gb: Reference sequence
- auspice_config.json: Auspice configuration file
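To inspect which strains a build drops or forces in, a strain list such as exclude.txt or include.txt can be read with a few lines of Python. This is a minimal sketch that assumes one strain name per line, with blank lines and "#" comments ignored; read_strain_list is a hypothetical helper, not one of the repository's scripts.

```python
from pathlib import Path


def read_strain_list(path):
    """Read strain names from a text file, one name per line.

    Assumption: blank lines are skipped and anything after '#' is a comment.
    """
    strains = []
    for line in Path(path).read_text().splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            strains.append(name)
    return strains


if __name__ == "__main__":
    # Path taken from the directory layout described above.
    for strain in read_strain_list("vp1/config/exclude.txt"):
        print(strain)
```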
The reference sequence used is BrCr, accession number U22521, sampled in 1970.
Install the Nextstrain environment by following the installation instructions in the Nextstrain documentation.
Activate the Nextstrain environment:
    micromamba activate nextstrain
To perform a build, run:
    snakemake all --cores 9
For specific builds:
- VP1 build:
    snakemake auspice/enterovirus_A71_vp1.json --cores 9
For tanglegrams, the build can be run on sub-alignments of the whole genome alignment, either for the specific genes or for the proteins P1, P2, and P3.
- Gene build:
    snakemake all_genes --cores 9
- Protein build (P1, P2, P3):
    snakemake all_proteins --cores 9
Note
Version of augur: augur 30.0.1
Version of auspice: auspice 2.62.0
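The build targets listed above can also be launched from Python, for example when scripting several builds in a row. The sketch below is hypothetical (TARGETS, snakemake_command and run_build are not part of this repository) and assumes it is run from the repository root with the Nextstrain environment active.

```python
import subprocess

# Snakemake targets named in the build instructions above.
# The short keys on the left are illustrative labels only.
TARGETS = {
    "all": "all",
    "vp1": "auspice/enterovirus_A71_vp1.json",
    "genes": "all_genes",
    "proteins": "all_proteins",
}


def snakemake_command(build, cores=9):
    """Return the snakemake command line for a named build as a list."""
    if build not in TARGETS:
        raise ValueError(f"unknown build {build!r}; choose from {sorted(TARGETS)}")
    return ["snakemake", TARGETS[build], "--cores", str(cores)]


def run_build(build, cores=9):
    """Run the build in the current directory; raises if snakemake fails."""
    subprocess.run(snakemake_command(build, cores), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(snakemake_command("vp1")))
```

Calling run_build("genes") would then be equivalent to typing the gene build command by hand.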
For more information on how to run the ingest, please refer to the README in the ingest folder.
To visualize the build, use Auspice:
    auspice view --datasetDir auspice
To run two visualizations simultaneously, you may need to set the port:
    export PORT=4001
Sequences can be downloaded manually or automatically.
- Manual Download: Visit NCBI Virus, search for EV-A71 or Taxid 39054, and download the sequences.
- Automated Download: The ingest functionality, included in the main snakefile, handles automatic downloading.
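For a quick check of how many records exist for the taxon before downloading, one option is to query NCBI's E-utilities ESearch endpoint. This is not how the ingest workflow obtains its data; it is a separate, minimal sketch, and the function name esearch_url and the default retmax value are assumptions made for illustration. The code only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(taxid=39054, retmax=20):
    """Build an ESearch URL for nucleotide records under the given taxon ID."""
    params = {
        "db": "nuccore",
        "term": f"txid{taxid}[Organism:exp]",
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Open the printed URL in a browser or fetch it with a tool of your choice.
    print(esearch_url())
```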
For questions or comments, contact me via GitHub or nadia.neuner-jehle@swisstph.ch