DAGIP: A bias correction algorithm for cell-free DNA data

Given two groups of matched (preferentially paired) samples sequenced under different protocols, the tool explicitly learns the bias using a neural network. The approach builds on Optimal Transport theory, and exploits sample-to-sample similarities to define how to perform bias correction.

Documentations are available at jorisvermeeschlab.github.io/DAGIP/. The source code has also been archived on Zenodo: https://zenodo.org/records/14503340.

Install the tool

Install dependencies:

pip install -r requirements.txt

To reproduce the results of our manuscript, rpy2 should be installed, as well the following R packages: dplyr, GenomicRanges and dryclean.

Install the package:

python setup.py install --user

Basic usage:

from dagip.core import DomainAdapter

# Let X and Y be two numpy arrays, where the rows are (coverage, methylation, fragmentomic) profiles and columns are features (e.g., DMRs, bins). Y and X have been produced under sequencing protocols 1 and 2, respectively.

# Build a model from the matched groups
model = DomainAdapter()
X_adapted = model.fit_transform(X, Y)

# X_adapted is a numpy array of same dimensions as X

# Save the model
model.save('some/location.pt')

...

# Load the model
model.load('some/location.pt')

# Perform bias correction on new independent samples from sequencing protocol 2
X_new_adapted = model.transform(X_new)

For more advanced usage, please check the documentation.

Reproduce the results from the manuscript

To reproduce the results presented in our paper, first download the datasets from the FigShare repository (DOI: 10.6084/m9.figshare.24459304) and place its content in the data folder. The repository is available at https://figshare.com/s/5f837a93ea2719ffcaf9.

Preprocess the data:

python data/preprocess.py
python data/50kb-to1mb.py
python data/to-numpy.py

Cancer detection

python scripts/validate-cnas.py HL
python scripts/validate-cnas.py DLBCL
python scripts/validate-cnas.py MM
python scripts/validate-cnas.py OV
python scripts/validate-fragmentomics-multimodal.py

Cross-validation with paired samples

python scripts/identify-pairs.py OV-forward
python scripts/identify-pairs.py NIPT-chemistry
python scripts/identify-pairs.py NIPT-lib
python scripts/identify-pairs.py NIPT-adapter
python scripts/identify-pairs.py NIPT-hs2000
python scripts/identify-pairs.py NIPT-hs2500
python scripts/identify-pairs.py NIPT-hs4000

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
dagip		dagip
data		data
docs		docs
scripts		scripts
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DAGIP: A bias correction algorithm for cell-free DNA data

Install the tool

Basic usage:

Reproduce the results from the manuscript

Cancer detection

Cross-validation with paired samples

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

JorisVermeeschLab/DAGIP

Folders and files

Latest commit

History

Repository files navigation

DAGIP: A bias correction algorithm for cell-free DNA data

Install the tool

Basic usage:

Reproduce the results from the manuscript

Cancer detection

Cross-validation with paired samples

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages