Skip to content

dmeth: Differential Methylation Analysis Toolkit

A fast, statistically rigorous Python framework providing a toolkit for DNA methylation analysis - from raw beta matrices to biomarkers and functional interpretation. dmeth implements the full modern differential methylation pipeline used in high-impact epigenome-wide association studies (EWAS), with performance and correctness on par with established R/bioconductor tools, all in pure Python.

Key Features

Feature Implementation Performance
Empirical Bayes moderated t-tests limma-style (Smyth 2004) with exact replication Numba-accelerated (10–100× faster)
Memory-efficient chunked analysis Automatic fallback for >1M probes <4 GB RAM typical
Cell-type deconvolution Reference-based NNLS (Houseman/Horvath-style) Parallel joblib
DMR discovery Sliding-window clustering + gap merging Vectorized
Gene annotation & pathway enrichment IntervalTree + Fisher’s exact (FDR) Sub-second on 450k/EPIC
Coordinate liftover (hg19 ↔ hg38) pyliftover integration Per-region tracking
Biomarker panel discovery & validation RF / Elastic Net + stratified CV Built-in
Robust preprocessing & QC Missingness, group representation, imputation Production-safe

Fully supports Illumina 450K, EPIC (850K), and any custom CpG × sample matrix.

Citation

If you use dmeth in your research, please cite:

@software{dmeth2025,
  author = {Afolabi, Dare},
  title = {dmeth: A comprehensive Python toolkit for differential DNA methylation analysis with empirical Bayes moderation and biomarker discovery},
  version = {0.2.0},
  year = {2025},
  publisher = {GitHub},
  doi = {10.5281/zenodo.17777501},
  url = {https://doi.org/10.5281/zenodo.17777501},
}

References

  • Smyth, G. K. (2004). Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3(1).
  • Liu, P., & Hwang, J.T.G. (2007). Quick calculation for sample size while controlling false discovery rate with application to microarray analysis. Bioinformatics, 23(6), 739–746.
  • Du, P., Zhang, X., Huang, C.-C., Jafari, N., Kibbe, W.A., Hou, L., & Lin, S. (2010). Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis. BMC Bioinformatics, 11:587.
  • Jung, S.H., Young, S.S. (2012). Power and sample size calculation for microarray studies. Journal of Biopharmaceutical Statistics, 22(1):30-42.
  • Phipson, B. et al. (2016). missMethyl: an R package for analyzing data from Illumina’s HumanMethylation450 platform. Bioinformatics, 32(2), 286-288.

Support