FAQ
Donghoon Lee edited this page Feb 24, 2021 · 1 revision
How can I reproduce the results in the ENCODE project?
To reproduce the results shown on the ENCODE project portal, you will need to download several BAM files along with covariate, chromosome-size, and blacklist files (see https://github.com/gersteinlab/starrpeaker). This example uses K562.
A STARR-seq experiment consists of an (input) DNA library and an (output) RNA library. You will need to download both the input and output BAM files from the ENCODE portal.
- K562 input: https://www.encodeproject.org/files/ENCFF807BAQ/
- K562 output rep 1: https://www.encodeproject.org/files/ENCFF672URE/
- K562 output rep 2: https://www.encodeproject.org/files/ENCFF503CJW/
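The three accessions above can also be fetched from the command line. The following is a minimal sketch, not an official procedure: it assumes the ENCODE portal serves each file at `/files/<accession>/@@download/<accession>.bam`, and the helper names `encode_bam_url` and `download_bam` are hypothetical. Check the download link on each file page if the pattern does not match.

```shell
#!/bin/sh
# Build the direct-download URL for an ENCODE BAM accession.
# Assumption: the portal uses the /@@download/<accession>.bam pattern.
encode_bam_url() {
  printf 'https://www.encodeproject.org/files/%s/@@download/%s.bam\n' "$1" "$1"
}

# Download one accession ($1) to a local file name ($2).
# -f fails on HTTP errors, -L follows redirects to the storage host.
download_bam() {
  curl -fL -o "$2" "$(encode_bam_url "$1")"
}

# Usage (uncomment to download; the files are large):
# download_bam ENCFF807BAQ K562_input.bam
# download_bam ENCFF672URE K562_r1.bam
# download_bam ENCFF503CJW K562_r2.bam
```

The local file names match the ones used in the rest of this example.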
Suppose you downloaded these input files:
- K562_input.bam
- K562_r1.bam
- K562_r2.bam
- hg38.chrom.sizes
- GRCh38.blacklist.bed
- cov1.bw
- cov2.bw
- cov3.bw
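Before starting a long run, it can help to confirm that every file in the list is present and non-empty. The sketch below is one way to do that; the function name `missing_files` is hypothetical and not part of starrpeaker.

```shell
#!/bin/sh
# Print the name of every argument that is missing or empty (one per line).
missing_files() {
  for f in "$@"; do
    [ -s "$f" ] || printf '%s\n' "$f"
  done
}

# Usage (run from the directory holding the downloads):
# missing_files K562_input.bam K562_r1.bam K562_r2.bam \
#   hg38.chrom.sizes GRCh38.blacklist.bed cov1.bw cov2.bw cov3.bw
```

If the command prints nothing, all files were found.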
You can then run starrpeaker once per replicate:
# rep1
starrpeaker --prefix starrpeaker-testrun_k562_r1 --chromsize hg38.chrom.sizes --blacklist GRCh38.blacklist.bed --cov cov1.bw cov2.bw cov3.bw --input K562_input.bam --output K562_r1.bam --threshold 0.05
# rep2
starrpeaker --prefix starrpeaker-testrun_k562_r2 --chromsize hg38.chrom.sizes --blacklist GRCh38.blacklist.bed --cov cov1.bw cov2.bw cov3.bw --input K562_input.bam --output K562_r2.bam --threshold 0.05
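Since the two commands differ only in the replicate name, they can be wrapped in a loop. This is a sketch under the file names assumed above. The `run_rep` function and the `DRY_RUN` variable are hypothetical conveniences, not starrpeaker features; the starrpeaker flags are the ones used in the commands above.

```shell
#!/bin/sh
# Default to a dry run that only prints each command.
# Set DRY_RUN=0 in the environment to actually execute starrpeaker.
DRY_RUN="${DRY_RUN:-1}"

# Build and run (or print) the starrpeaker command for one replicate, e.g. r1.
run_rep() {
  rep="$1"
  set -- starrpeaker \
    --prefix "starrpeaker-testrun_k562_${rep}" \
    --chromsize hg38.chrom.sizes \
    --blacklist GRCh38.blacklist.bed \
    --cov cov1.bw cov2.bw cov3.bw \
    --input K562_input.bam \
    --output "K562_${rep}.bam" \
    --threshold 0.05
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for rep in r1 r2; do
  run_rep "$rep"
done
```

Printing the commands first makes it easy to confirm the prefix and output BAM for each replicate before committing to a full run.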