
DEPICT

The following description explains how to download DEPICT, test it on the example files, and run it on your own GWAS summary statistics.

Installation via Docker (Recommended)

DEPICT can be run fully containerized using Docker. This avoids all local dependency issues (Python 2.7, Java, PLINK 1.9) and guarantees a fully reproducible environment.

1. Pull the prebuilt image

Install Docker and pull the prebuilt image:

docker pull avanhilten/depict:py2

2. Run DEPICT with mounted working directory

docker run --rm -it \
  -v $(pwd):/data \
  avanhilten/depict:py2

This mounts your current directory into the container at /data. All result files generated by DEPICT will appear in your local working directory.

Note

Apple Silicon (ARM) Users: If you are on an Apple M1/M2/M3 system, run:

docker run --rm -it \
 --platform linux/amd64 \
 -v $(pwd):/data \
 avanhilten/depict:py2

3. Run the example analysis

Inside the container, run:

/opt/depict/src/python/depict.py /opt/depict/example/ldl_teslovich_nature2010.cfg

Upon completion, output files such as the following should appear in your local directory:

  • *_geneprioritization.txt
  • *_genesetenrichment.txt
  • *_tissueenrichment.txt
  • *_loci.txt

See the Wiki for a description of the output format. For this example, the files are:

  • DEPICT loci ldl_teslovich_nature2010_loci.txt
  • Gene prioritization results ldl_teslovich_nature2010_geneprioritization.txt
  • Gene set enrichment results ldl_teslovich_nature2010_genesetenrichment.txt
  • Tissue enrichment results ldl_teslovich_nature2010_tissueenrichment.txt
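
As a quick check after the example run, the sketch below tests whether the four output files listed above exist for a given label. It is a minimal illustration, not part of DEPICT: the helper name `missing_outputs` is hypothetical, and it assumes the files sit directly in the folder you pass in and are named by appending the suffixes above to the label.

```python
from pathlib import Path

# Output suffixes named in this README.
EXPECTED_SUFFIXES = (
    "_loci.txt",
    "_geneprioritization.txt",
    "_genesetenrichment.txt",
    "_tissueenrichment.txt",
)


def missing_outputs(folder, label):
    """Return the expected output file names that are absent from folder.

    Hypothetical helper: assumes each file is named <label><suffix>.
    """
    folder = Path(folder)
    return [
        label + suffix
        for suffix in EXPECTED_SUFFIXES
        if not (folder / (label + suffix)).is_file()
    ]


if __name__ == "__main__":
    # Check the current directory for the LDL example outputs.
    missing = missing_outputs(".", "ldl_teslovich_nature2010")
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("all outputs present")
```

Run it from the directory you mounted at /data, since that is where the results appear.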

Warning

No files are written inside the container itself. The folder /data inside the container is your local working directory. All configuration files, GWAS inputs, and results should live in this folder.

Run DEPICT based on your GWAS

The following steps allow you to run DEPICT on your GWAS summary statistics. We advise you to run the LDL cholesterol example above first, to make sure that all dependencies needed to run DEPICT are in place.

  1. Make sure that you use hg19 genomic SNP positions
  2. Make an 'analysis folder' in which your trait-specific DEPICT analysis will be stored
  3. Copy the template config file from src/python/template.cfg to your analysis folder and give the config file a more meaningful name
  4. Edit your config file
  • Point analysis_path to your analysis folder. This is the directory to which output files will be written
  • Point gwas_summary_statistics_file to your GWAS summary statistics file. This file can be either in plain text or gzip format (i.e. having the .gz extension)
  • Specify the GWAS association p value cutoff (association_pvalue_cutoff). We recommend using 5e-8 or 1e-5
  • Specify the label, which DEPICT uses to name all output files (label_for_output_files)
  • Specify the name of the association p value column in your GWAS summary statistics file (pvalue_col_name)
  • Specify the name of the marker column (marker_col_name). Format: chr:pos, i.e. '6:2321'. If this column does not exist, leave it empty; chr_col_name and pos_col_name will then be used instead
  • Specify the name of the chromosome column (chr_col_name). Leave empty if the above marker_col_name is set
  • Specify the name of the position column (pos_col_name). Leave empty if the above marker_col_name is set. Please make sure that your SNP positions use human genome build GRCh37 (hg19)
  • Specify the separator used in the GWAS summary statistics file (separator). Options are
    • tab
    • comma
    • semicolon
    • space
  5. Run DEPICT
  • <path to DEPICT>/src/python/depict.py <path to your config file>
  6. Investigate the results, which have been written to your analysis folder. See the Wiki for details on the output format
  • Associated loci in file ending with _loci.txt
  • Gene prioritization results in file ending with _geneprioritization.txt
  • Gene set enrichment results in file ending with _genesetenrichment.txt
  • Tissue enrichment results in file ending with _tissueenrichment.txt
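
Before running DEPICT on your own data, it can help to check that the summary statistics file matches what you entered in the config file. The sketch below is an illustration only and does not reproduce DEPICT's own parsing. The function names `open_text` and `check_summary_statistics` are hypothetical, and so are the exact validity rules: a p value must parse as a number between 0 and 1, a marker must look like chr:pos, and a position must be an integer. The parameter names mirror the config settings described in the steps above.

```python
import csv
import gzip
import re

# Separator names accepted in the config file, per the list above.
SEPARATORS = {"tab": "\t", "comma": ",", "semicolon": ";", "space": " "}
# Assumed marker pattern: chr:pos, for example 6:2321.
MARKER_RE = re.compile(r"^[0-9A-Za-z]+:[0-9]+$")


def open_text(path):
    """Open a plain-text or gzip-compressed (.gz) file for reading as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def check_summary_statistics(lines, separator, pvalue_col_name,
                             marker_col_name="", chr_col_name="",
                             pos_col_name=""):
    """Return a list of problems; an empty list means none were detected."""
    reader = csv.DictReader(lines, delimiter=SEPARATORS[separator])
    header = reader.fieldnames or []
    # Use the marker column if given, else the chromosome/position pair.
    if marker_col_name:
        location_cols = [marker_col_name]
    else:
        location_cols = [chr_col_name, pos_col_name]
    required = [pvalue_col_name] + location_cols
    problems = ["missing column: %r" % c for c in required if c not in header]
    if problems:
        return problems
    for line_number, row in enumerate(reader, start=2):  # header is line 1
        raw_p = row[pvalue_col_name]
        try:
            p = float(raw_p)
            if not 0.0 <= p <= 1.0:  # also rejects NaN
                raise ValueError
        except (TypeError, ValueError):
            problems.append("line %d: invalid p value %r" % (line_number, raw_p))
        if marker_col_name:
            marker = row[marker_col_name] or ""
            if not MARKER_RE.match(marker):
                problems.append("line %d: marker %r is not chr:pos"
                                % (line_number, marker))
        elif not (row[pos_col_name] or "").isdigit():
            problems.append("line %d: invalid position %r"
                            % (line_number, row[pos_col_name]))
    return problems
```

A possible usage, with a hypothetical file name: `with open_text("my_gwas.txt.gz") as fh: print(check_summary_statistics(fh, "tab", "P", marker_col_name="SNP"))`. The check cannot tell whether positions are on hg19; that still has to be confirmed from the source of your data.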

Troubleshooting

Please send the log file (ending with _log.txt) with a brief description of the problem to Tune H Pers (tune.pers@sund.ku.dk).

The overall version of DEPICT follows the DEPICT publications. The current version is v1 from Pers, Nature Communications, 2015 and the release follows the number of commits of the DEPICT git repository (git log --pretty=format:'' | wc -l). The latest 1000 Genomes Project pilot phase DEPICT version is rel138, the latest 1000 Genomes Project phase 3 version is rel137.

How to cite

Pers, Nature Communications 2015

1000 Genomes Project, because DEPICT makes extensive use of their data.

Data used in these examples

LDL GWAS summary statistics from Teslovich, Nature 2010 are used as input in this example. We included all SNPs with P < 5e-8 and manually added chromosome and position columns (hg19/GRCh37).
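
The P < 5e-8 selection described above can be sketched as follows. This is not the script used to prepare the example file; the function name `filter_by_pvalue` and the column name in the usage note are hypothetical, and rows whose p value cannot be parsed are simply skipped.

```python
import csv


def filter_by_pvalue(lines, pvalue_col_name, cutoff=5e-8, delimiter="\t"):
    """Yield rows (as dicts) whose p value is strictly below cutoff.

    Hypothetical helper: rows with a missing or unparsable p value are skipped.
    """
    for row in csv.DictReader(lines, delimiter=delimiter):
        try:
            p = float(row[pvalue_col_name])
        except (KeyError, TypeError, ValueError):
            continue
        if p < cutoff:
            yield row
```

For example, `list(filter_by_pvalue(open("my_gwas.txt"), "P"))` would keep only the genome-wide significant rows of a tab-separated file with a p value column named P.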

1000 Genomes Consortium pilot release and phase 3 release data are used in DEPICT. Please remember to cite their paper if you use our tool.
