This repository demonstrates a minimal working example of aligning RNA-Seq reads to a toy genome using STAR inside a Docker container.
- Dockerfile: Installs STAR and dependencies
- scripts/run_star.sh: Runs genome generation and alignment
- data/: Toy genome, GTF, and FASTQ files
- output/: Alignment results (not tracked by Git)
    docker build -t star-docker .
    docker run --rm \
      -v "$(pwd)/output:/app/output" \
      -v "$(pwd)/data:/app/data" \
      -v "$(pwd)/scripts:/app/scripts" \
      star-docker bash scripts/run_star.sh
- toy_genome.fa: A tiny artificial reference genome
- toy_genes.gtf: A GTF annotation file with a single exon
- sample.fastq: A single-end FASTQ file containing one read
After running the pipeline, the output/ directory will contain:
- Aligned.sortedByCoord.out.bam: Final aligned BAM file
- Log.out: STAR alignment log
- SJ.out.tab: Splice junction output (empty for toy example)
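
To confirm that a run produced the files listed above, a small check along these lines may help. This is a minimal sketch, not part of the repository: the function name `check_outputs` is hypothetical, and it assumes the results were written to the mounted output/ directory.

```shell
# Hypothetical helper: report which expected STAR outputs are missing.
# Usage: check_outputs [directory]   (directory defaults to "output")
check_outputs() {
  local dir=${1:-output} missing=0 f
  for f in Aligned.sortedByCoord.out.bam Log.out SJ.out.tab; do
    # Test for existence only: SJ.out.tab is expected to be empty here.
    if [ ! -e "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_outputs output` after the `docker run` step prints nothing and returns 0 when all three files are present.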
- The FASTQ file must have equal-length sequence and quality lines
- The GTF file must contain exon features with coordinates that match the FASTA reference
- STAR is compiled and run inside the container using Ubuntu 20.04
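
The first two requirements can be checked before starting the container. The sketch below is one possible approach, not something shipped with the repository: the function names `check_fastq` and `check_gtf` are hypothetical, the FASTQ check assumes plain four-line records (no wrapped sequences), and "match" is interpreted here as each exon naming a sequence present in the FASTA and lying within its length.

```shell
# Hypothetical pre-flight checks; names are illustrative only.

# Succeeds only if every 4-line FASTQ record has sequence and quality
# lines of equal length.
check_fastq() {
  awk '
    BEGIN { bad = 0 }
    NR % 4 == 2 { seq_len = length($0) }
    NR % 4 == 0 && length($0) != seq_len {
      printf "record ending at line %d: sequence/quality length mismatch\n", NR
      bad = 1
    }
    END { exit bad }
  ' "$1"
}

# Succeeds only if the GTF has at least one exon and every exon lies
# within a sequence defined in the FASTA file.
check_gtf() {
  local fasta=$1 gtf=$2
  awk -F '\t' '
    # First file (FASTA): accumulate the length of each sequence.
    NR == FNR {
      if (/^>/) { name = substr($0, 2); sub(/[ \t].*/, "", name); len[name] = 0 }
      else      { len[name] += length($0) }
      next
    }
    # Second file (GTF): skip comment lines, inspect exon rows only.
    /^#/ { next }
    $3 == "exon" {
      exons++
      if (!($1 in len)) {
        print "unknown sequence: " $1
        bad = 1
      } else if ($4 < 1 || $5 < $4 || $5 > len[$1]) {
        print "exon out of range: " $1 ":" $4 "-" $5
        bad = 1
      }
    }
    END {
      if (exons == 0) { print "no exon features found"; bad = 1 }
      exit bad
    }
  ' "$fasta" "$gtf"
}
```

For example, `check_fastq data/sample.fastq && check_gtf data/toy_genome.fa data/toy_genes.gtf` returns 0 only when both inputs pass.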
MIT License. Use and modify freely.