BCP Analysis Pipeline

A Nextflow-based pipeline for preprocessing single-cell RNA sequencing data, designed for BCP (Billion Cell Program) analysis. It covers alignment, ambient RNA removal, doublet detection, and basic quality control metrics.

Features

Single-cell alignment using STAR
Ambient RNA removal for cleaner expression profiles
Doublet detection to filter multiplets
Comprehensive QC metrics with MultiQC reporting
Embedded QC plots included in MultiQC report
Automated preprocessing with minimal manual intervention

Future Development

Parameters will be optimized and made configurable for different species and organ types to enhance pipeline flexibility and accuracy.

Environment Configuration

Current Setup

The pipeline currently requires manual conda environment configuration for testing and development purposes.

Planned Updates

Singularity container support will be implemented upon completion of the development phase to ensure better reproducibility and easier deployment across different computing environments.

Installation

```
# Clone the repository
git clone https://github.com/PhrenoVermouth/BCP_analysis.git
cd BCP_analysis

# Create conda environment
mamba env create -f bin/environment.yml
```

Usage

```
# Run the pipeline
nextflow run ~/BCP_analysis/main.nf -profile standard
```

Run modes

GeneFull only (default):

```
nextflow run ~/BCP_analysis/main.nf --run_mode genefull
```

Velocity using prior GeneFull outputs: rerun in the same project directory after a completed GeneFull run. The pipeline will reuse ${outdir}/soupx/<sample>/<sample>_corrected.h5ad by default.
```
nextflow run ~/BCP_analysis/main.nf --run_mode velocity
```
If GeneFull outputs live elsewhere, optionally add a counts_h5ad column in samples.csv to point to custom .h5ad locations.
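
The lookup rule described above (default path under `${outdir}/soupx`, with an optional `counts_h5ad` override) can be sketched in Python as follows. This is only an illustration of the documented behavior, not the pipeline's actual code; the function name `resolve_counts_h5ad` and its arguments are hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_counts_h5ad(outdir: str, sample: str,
                        counts_h5ad: Optional[str] = None) -> Path:
    """Return the .h5ad path a velocity run would read for one sample.

    Hypothetical helper: `counts_h5ad` stands for the optional
    samples.csv column; an empty or missing value means "use the default".
    """
    if counts_h5ad and counts_h5ad.strip():
        # A custom location from samples.csv takes precedence.
        return Path(counts_h5ad.strip())
    # Default layout documented above:
    # ${outdir}/soupx/<sample>/<sample>_corrected.h5ad
    return Path(outdir) / "soupx" / sample / f"{sample}_corrected.h5ad"


default_path = resolve_counts_h5ad("results", "hbm")
custom_path = resolve_counts_h5ad("results", "hbm", "/data/custom/hbm.h5ad")
```

Here `results` and `/data/custom/hbm.h5ad` are placeholder values chosen for the example.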

Scrublet threshold override

By default, Scrublet uses its automatic threshold.
To force a manual cutoff, pass --scrublet_manual_threshold when launching Nextflow; this value is forwarded to run_scrublet.py --manual_threshold.
```
nextflow run ~/BCP_analysis/main.nf --scrublet_manual_threshold 0.25
```
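
As a rough illustration of what a manual cutoff means, the sketch below flags cells whose doublet score exceeds a fixed threshold. It is plain Python and does not reproduce run_scrublet.py; the function name `flag_doublets`, the example scores, and the strict `>` comparison are assumptions made for the example.

```python
from typing import List, Sequence


def flag_doublets(scores: Sequence[float], threshold: float) -> List[bool]:
    """Mark each cell as a predicted doublet when its score is above the cutoff.

    Assumption: a score exactly equal to the threshold is not flagged.
    """
    return [score > threshold for score in scores]


# Made-up doublet scores for four cells, with the cutoff used above.
example_scores = [0.05, 0.25, 0.31, 0.80]
example_flags = flag_doublets(example_scores, threshold=0.25)
```

Raising the threshold flags fewer cells as doublets; lowering it flags more.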

Mitochondrial filtering

Global threshold (default): --max_mito sets a single mitochondrial percentage cutoff for all samples (default: 0.2).
Per-sample overrides: provide --mito_max_map /path/to/file where each line maps one or more sample IDs to a cutoff using the format sample1, sample2 = value. Blank lines and lines starting with # are ignored. Any sample not listed will fall back to --max_mito.

Example (resource/AC.mito):
```
efm, em, fatfm, fatm, fbfm, fbm, hbfm, hbm, kdfm, kdm, lvfm, lvm, mbfm, mbm = 0.2
hfm, hm, ifm, im, pcf, pcm, skfm, skm = 0.6
lf, lm, smf, smm, spm, spfm = 0.4
```
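
The map format described above can be parsed along these lines. This is a minimal sketch of the documented rules (comma-separated sample IDs, `=`, a value, with blank and `#` lines skipped and unlisted samples falling back to the global cutoff), not the pipeline's own parser; `parse_mito_map` and `cutoff_for` are hypothetical names.

```python
from typing import Dict


def parse_mito_map(text: str) -> Dict[str, float]:
    """Parse 'sample1, sample2 = value' lines into {sample_id: cutoff}."""
    cutoffs: Dict[str, float] = {}
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Blank lines and comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        names, sep, value = line.rpartition("=")
        if not sep:
            raise ValueError(f"line {line_number}: expected 'samples = value'")
        cutoff = float(value)
        for name in names.split(","):
            name = name.strip()
            if name:
                cutoffs[name] = cutoff
    return cutoffs


def cutoff_for(sample: str, cutoffs: Dict[str, float], max_mito: float = 0.2) -> float:
    """Return the per-sample cutoff, falling back to the global value."""
    return cutoffs.get(sample, max_mito)


example_map = """
# heart and intestine samples tolerate more mitochondrial reads
hfm, hm, ifm, im = 0.6

lf, lm = 0.4
"""
parsed = parse_mito_map(example_map)
```

With this input, `cutoff_for("hm", parsed)` yields the mapped value, while a sample that is not listed gets the `max_mito` default. The comment line in `example_map` is invented for the example.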

Output

The pipeline generates:

Processed single-cell count matrices
Quality control reports via MultiQC
Doublet detection results
Ambient RNA removal metrics
