We developed an interpretable blood-based epigenetic clock to estimate DNA methylation age and identify disease-specific DNA methylation alterations. Using 8,233 Illumina methylomes from healthy controls and patients with nine age-associated diseases, we applied ridge selection to retain 4,855 CpG sites and benchmarked 20 regression architectures with Bayesian hyperparameter optimization. Tree-based boosting models performed best, and a weighted ensemble achieved a mean absolute error (MAE) of 2.54 years on held-out test data, outperforming established clocks evaluated under the same conditions. External validation in the independent EPIC cohort GSE132203 provided initial cross-platform support for transferability (MAE 3.04 years) and showed a modest but significant association with GEO-reported age acceleration (r = 0.145, p = 0.041). PaCMAP embeddings revealed structured aging trajectories and disease-enriched neighborhoods. Model interpretation highlighted loci including ELOVL2, FHL2, KLF14, CD8A, LAG3, SMAD2, and NSD1, while Enrichr identified enrichment of REST, microRNA, and histone modification programs. The clock identified departures from healthy aging across diseases, with the greatest acceleration observed in stroke patients.
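As a rough illustration of the quantities named above, the sketch below combines per-model age predictions into a weighted ensemble, scores it with MAE, and computes a per-sample age acceleration. It is not the repository's code. The function names, the inverse-validation-MAE weighting scheme, the definition of age acceleration as predicted minus chronological age, and the toy numbers are all assumptions made for this example.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between true and predicted ages, in years."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def inverse_mae_weights(val_maes):
    """Assumed weighting scheme: weight each model by 1 / validation MAE,
    then normalize so the weights sum to 1."""
    inv = 1.0 / np.asarray(val_maes, dtype=float)
    return inv / inv.sum()


def weighted_ensemble(predictions, weights):
    """Combine predictions of shape (n_models, n_samples) into (n_samples,)."""
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ predictions


def age_acceleration(chronological_age, predicted_age):
    """Assumed definition: predicted methylation age minus chronological age."""
    return np.asarray(predicted_age, dtype=float) - np.asarray(
        chronological_age, dtype=float
    )


# Toy data (hypothetical): three samples, two base models.
ages = np.array([30.0, 50.0, 70.0])
model_preds = np.array([
    [32.0, 49.0, 73.0],  # hypothetical model A
    [28.0, 54.0, 66.0],  # hypothetical model B
])
val_maes = [2.0, 4.0]  # hypothetical validation MAEs for A and B

weights = inverse_mae_weights(val_maes)
ensemble_pred = weighted_ensemble(model_preds, weights)
ensemble_mae = mean_absolute_error(ages, ensemble_pred)
acceleration = age_acceleration(ages, ensemble_pred)

print("weights:", weights)
print("ensemble MAE (years):", ensemble_mae)
print("age acceleration (years):", acceleration)
```

Under this assumed definition, a positive acceleration means a sample looks epigenetically older than its chronological age. Other definitions, such as the residual from regressing predicted age on chronological age, are also in common use and would give different values.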
rajarshi-mandal/epigenetic-clock