Releases: FoxoTech/methylprep
v1.7.0 - 100% sesame/minfi compatible and works with parquet format
What's Changed
- Feature/v1.7.0 -- adds parquet support by @marcmaxson in #114
- Feature/v1.6.3 by @marcmaxson in #108
- Feature/v1.6.1 -- minor bug fixes by @marcmaxson
- Feature/v1.5.7 -- unlocks pandas (v 1.3.0+ supported now) by @marcmaxson in #96
- patch to help read_geo_processed.py detect txt csv files better by @marcmaxson in #88
- Update pOOBAH pval calculation by @notmaurox in #107
- correct both addressA and addressB for OOB probes when channel swapping by @notmaurox in #111
- Addition of negative control based pvalue calculation by @notmaurox in #109
- Bump lxml from 4.7.1 to 4.9.1 by @dependabot in #113
- Bump urllib3 from 1.26.4 to 1.26.5 by @dependabot in #86
New Contributors
- @dependabot made their first contribution in #86
- @notmaurox made their first contribution in #107
Full Changelog: v1.5.0...v1.7.0
version 1.5.0 - complete mouse array support and sesame/minfi compatibility confirmed
v1.5.0
- MAJOR refactor/overhaul of all the internal classes. This was necessary to fully support the mouse array.
- new SigSet class object that mirrors sesame's SigSet and SigDF objects.
- Combines idats, manifest, and sample sheet into one object that is inherited by SampleDataContainer
- RawDataset, MethylationDataset, ProbeSubtype all deprecated and replaced by SigSet
- SampleDataContainer class is now basically the SigSet plus all pipeline processing settings
- new mouse manifest covers all probes and matches sesame's output
- Processing will work even if a batch of IDATs has differing probe counts for the same array_type, though the differing probes may not be saved.
- unit tests confirm that the beta values output by methylprep, sesame, and minfi now match to within 1% of each other. Note that the intermediate stages of processing (after NOOB and after DYE) do not match sesame in this version. They can differ by +/- 100 intensity units, likely due to differences in the order of steps and/or the oob/mask probes used.
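As a rough illustration of the kind of check the unit tests describe, the sketch below compares two sets of beta values probe by probe. It is not the project's actual test code: the function name, the probe IDs, and the beta values are all made up, and "within 1%" is assumed here to mean an absolute difference of at most 0.01 on the beta scale.

```python
from typing import Dict


def max_abs_difference(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Return the largest absolute difference over probes present in both inputs."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("the two inputs have no probes in common")
    return max(abs(a[probe] - b[probe]) for probe in shared)


# Made-up beta values keyed by hypothetical probe IDs (not real output).
methylprep_betas = {"probe_a": 0.812, "probe_b": 0.104, "probe_c": 0.497}
sesame_betas = {"probe_a": 0.815, "probe_b": 0.101, "probe_c": 0.502}

# Assumption: "within 1%" is read as an absolute tolerance of 0.01.
TOLERANCE = 0.01

worst = max_abs_difference(methylprep_betas, sesame_betas)
within_tolerance = worst <= TOLERANCE
print(f"largest difference: {worst:.4f}, within tolerance: {within_tolerance}")
```

Only probes present in both inputs are compared, so probes that one tool masks or drops do not affect the result.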
GEO support, composite data sets, meta data, extra probes
Adds:
- Improved documentation
- option to save Control / SNP probes
- GEO: download a bunch of data sets and only save samples that match a pattern
- GEO: load and use preprocessed data
- GEO: parse MiniML meta data into dataframe for analysis later
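The idea of keeping only the samples that match a pattern can be sketched in plain Python. This is an illustration of the concept only, not methylprep's implementation: the `filter_samples` helper, the `title` field, and the sample records are hypothetical.

```python
import re
from typing import Dict, List


def filter_samples(
    samples: List[Dict[str, str]], pattern: str, field: str = "title"
) -> List[Dict[str, str]]:
    """Keep samples whose `field` contains a case-insensitive match for `pattern`."""
    regex = re.compile(pattern, flags=re.IGNORECASE)
    return [s for s in samples if regex.search(s.get(field, ""))]


# Hypothetical meta data records, standing in for parsed series meta data.
samples = [
    {"sample_id": "S1", "title": "whole blood, control"},
    {"sample_id": "S2", "title": "saliva, control"},
    {"sample_id": "S3", "title": "Whole Blood, treated"},
]

kept = filter_samples(samples, r"blood")
kept_ids = [s["sample_id"] for s in kept]
print(kept_ids)
```

Filtering on meta data before saving means only the matching samples need to be kept on disk.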
GEO/ArrayExpress data ingester, batch_size, and bug fixes
- the CLI now includes a `download` option. Supply the GEO ID or ArrayExpress ID and it will locate the files, download the idats, process them, and build a dataframe of the associated meta data. This dataframe format should be compatible with methylcheck and methylize.
- When processing large batches of raw .idat files, specify --batch_size to break the processing up into smaller batches so the computer's memory won't overload. This is off by default when using `process` but is ON when using `download` and set to batch_size of 100.
- Now includes support for older 27k arrays.
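The batching idea behind --batch_size can be sketched generically: split the list of samples into consecutive chunks and handle one chunk at a time, so only one chunk's data is in memory at once. The helper and sample names below are hypothetical and do not reflect methylprep's internal code.

```python
from typing import Iterator, List, Sequence


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical sample names standing in for a large set of .idat files.
sample_names = [f"sample_{i:03d}" for i in range(250)]

batches = list(iter_batches(sample_names, batch_size=100))
batch_lengths = [len(batch) for batch in batches]
print(batch_lengths)  # [100, 100, 50]
```

With 250 samples and a batch size of 100, the last batch holds the remaining 50 samples rather than being padded or dropped.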
Functional processing 1.0
Provides a command line interface for processing methylation data (a batch of idat files and associated sample sheet csv).