Aim: We examined whether variation in blood-based epigenome-wide association studies could be more completely explained by augmenting existing reference DNA methylation libraries.
Materials & methods: We compared existing and enhanced libraries in predicting variability in three publicly available 450K methylation datasets derived from whole-blood samples. Models were fit separately to each CpG site and used to estimate the additional variability explained when adjustments for cell composition were made with each library.
Results: The mean difference in the CpG-specific residual sum of squares between models, calculated for an arthritis, an aging and a metabolic syndrome dataset, indicated that an enhanced library explained significantly more variation in all three datasets (p < 10⁻³).
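The comparison described above can be illustrated with a minimal sketch. It assumes ordinary least-squares models with an intercept and estimated cell proportions as the only covariates; the abstract does not state the exact model form, covariates or significance test used, so the function names, the simulated data and the way the coarser library is constructed here are all hypothetical.

```python
import numpy as np


def per_cpg_rss(beta, proportions):
    """Residual sum of squares of an OLS fit, one value per CpG.

    beta:        (n_samples, n_cpgs) methylation values.
    proportions: (n_samples, n_cell_types) estimated cell proportions.
    Assumption: intercept plus cell proportions are the only covariates.
    """
    n_samples = beta.shape[0]
    design = np.column_stack([np.ones(n_samples), proportions])
    # lstsq tolerates the rank deficiency that arises when proportions
    # sum to one; the fitted values (and so the residuals) are unaffected.
    coef, *_ = np.linalg.lstsq(design, beta, rcond=None)
    residuals = beta - design @ coef
    return np.sum(residuals ** 2, axis=0)


def mean_rss_difference(beta, props_existing, props_enhanced):
    """Mean over CpGs of RSS(existing library) - RSS(enhanced library)."""
    rss_existing = per_cpg_rss(beta, props_existing)
    rss_enhanced = per_cpg_rss(beta, props_enhanced)
    return float(np.mean(rss_existing - rss_enhanced))


# Simulated illustration only (not the study data).
rng = np.random.default_rng(0)
n_samples, n_cpgs, n_types = 60, 200, 8

# Hypothetical "enhanced" library: eight resolved cell types.
props_enhanced = rng.dirichlet(np.ones(n_types), size=n_samples)
# Hypothetical "existing" library: the last three subtypes are merged.
props_existing = np.column_stack(
    [props_enhanced[:, :5], props_enhanced[:, 5:].sum(axis=1)]
)

# Each CpG has a cell-type-specific mean methylation plus noise.
profiles = rng.uniform(0.0, 1.0, size=(n_types, n_cpgs))
beta = props_enhanced @ profiles + rng.normal(0.0, 0.02, (n_samples, n_cpgs))

diff = mean_rss_difference(beta, props_existing, props_enhanced)
print(f"mean RSS difference (existing - enhanced): {diff:.4f}")
```

A positive mean difference indicates that adjusting with the enhanced library leaves less unexplained variation per CpG. In this simulation the existing library is a merged version of the enhanced one, so the models are nested and the difference cannot be negative; with independently estimated proportions from two real libraries that guarantee does not hold, and a formal test would be needed.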
Conclusion: Pathologically important immune cell subtypes can explain important variability in epigenome-wide association studies conducted in blood.
Keywords: 450K methylation library; DNA methylation; aging; arthritis; cell mixture deconvolution; cellular heterogeneity; confounding; differentially methylated regions; epigenome-wide association study; inflammation; lymphocytes.