Create a reference genome sequence with "injected" SNPs/SNVs from other mouse strains.
- mouse reference genome
- mouse annotation (GTF)
- SNP file
The SNPsplit tool is installed through a Conda environment defined in workflow/envs/snpslit.yml.
CellRanger is proprietary software and cannot be installed via Conda. It needs to be manually downloaded and the path
to its executable stored in workflow/config.yaml.
On the DKFZ Compute Cluster, CellRanger is installed as a module and is available at:
/software/cellranger/6.0.0/bin/cellranger mkref
The configuration in workflow/config.yaml contains:
- path to reference genome
- path to gene annotation
- path to cellranger mkref
- list of strains from the Mouse Genomes Project (MGP)
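As a rough illustration of the configuration entries listed above, the following Python sketch checks that a loaded configuration contains all four of them. The key names (`reference_genome`, `annotation_gtf`, `cellranger`, `strains`), the placeholder paths, and the example strain are assumptions made for this sketch. The actual keys in workflow/config.yaml may differ.

```python
# Hypothetical key names: the real keys in workflow/config.yaml may differ.
REQUIRED_KEYS = ("reference_genome", "annotation_gtf", "cellranger", "strains")


def missing_config_keys(config):
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Example configuration as it might look after loading the YAML file.
example_config = {
    "reference_genome": "/path/to/genome.fa",      # placeholder path
    "annotation_gtf": "/path/to/annotation.gtf",   # placeholder path
    "cellranger": "/software/cellranger/6.0.0/bin/cellranger",
    "strains": ["CAST_EiJ"],                       # example MGP strain name
}

missing = missing_config_keys(example_config)
if missing:
    raise SystemExit(f"config is missing entries: {', '.join(missing)}")
```

A check of this kind could run at the top of the Snakefile, so that a missing entry is reported before any job starts.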
The workflow can be run by executing:
snakemake -c1 --configfile config.yaml --use-conda
To use the LSF job scheduler on the DKFZ cluster, run:
snakemake --cluster "bsub -n4 -q verylong -R rusage[mem=100GB]" -p -j2 -c4 --configfile config.yaml --use-conda
Output files of the workflow are stored in the output/ subdirectory.
- Add a download function for the SNP data from
  ftp.sanger.ac.uk/pub/REL-1505-SNPs_Indels/mgp.v5.merged.snps_all.dbSNP142.vcf.gz
  This is currently not activated due to proxy issues at the DKFZ.
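A download step of this kind might look roughly like the sketch below. It is not part of the workflow. The function name `download_snp_file`, the `resources` output directory, and the `ftp://` scheme prepended to the address are assumptions made for this sketch. Python's `urllib` reads proxy settings from the environment (for example `ftp_proxy`), but whether that resolves the proxy issues on the cluster is untested.

```python
import os
import shutil
import urllib.request

# Address from the item above; the "ftp://" scheme is an assumption.
SNP_URL = (
    "ftp://ftp.sanger.ac.uk/pub/REL-1505-SNPs_Indels/"
    "mgp.v5.merged.snps_all.dbSNP142.vcf.gz"
)


def download_snp_file(url=SNP_URL, out_dir="resources"):
    """Download `url` into `out_dir` unless the file already exists.

    Returns the path of the local file.
    """
    os.makedirs(out_dir, exist_ok=True)
    target = os.path.join(out_dir, os.path.basename(url))
    if os.path.exists(target):
        return target  # already downloaded, nothing to do

    # Write to a temporary name first so that an interrupted transfer
    # never leaves a truncated file under the final name.
    partial = target + ".part"
    with urllib.request.urlopen(url) as response, open(partial, "wb") as fh:
        shutil.copyfileobj(response, fh)
    os.replace(partial, target)
    return target
```

Because the function returns early when the file is present, a manually downloaded copy placed in the output directory would be picked up without any network access.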