Data processing choices can affect findings in differential methylation analyses: an investigation using data from the LIMIT RCT

PeerJ. 2023 Feb 3;11:e14786. doi: 10.7717/peerj.14786. eCollection 2023.

Abstract

Objective: A wide array of methods exists for processing and analysing DNA methylation (DNAm) data. We aimed to systematically compare the behaviour of these methods, using cord blood DNAm data from the LIMIT RCT, in detecting hypothesised effects of interest (intervention and pre-pregnancy maternal BMI), effects known to be spurious, and effects known to be present.

Methods: DNAm data from 645 cord blood samples, analysed using Illumina 450K BeadChip arrays, were normalised using three different methods, with probe filtering undertaken either pre- or post-normalisation. Batch effects were handled with a supervised algorithm, an unsupervised algorithm, or adjustment in the analysis model. Analysis was undertaken with and without adjustment for estimated cell type proportions. The effects estimated were intervention and BMI (the effects of interest in the original study), infant sex (an effect known to be present) and randomly assigned groups (an effect known to be spurious). Data processing and analysis methods were compared on the number and identity of differentially methylated probes, the rankings of probes by p value and log-fold-change, and the distributions of p values and log-fold-change estimates.
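
The processing choices described in the Methods can be read as a factorial grid. The following is a minimal sketch, assuming every option is crossed with every other, which the abstract does not confirm. The abstract also does not name the specific methods, so all labels below (norm_A, supervised, and so on) and the function name are hypothetical placeholders rather than the study's actual pipeline.

```python
from itertools import product

# Hypothetical placeholder labels: the abstract states how many options
# were compared at each step, but not which methods they were.
NORMALISATION = ["norm_A", "norm_B", "norm_C"]
FILTERING = ["filter_pre_normalisation", "filter_post_normalisation"]
BATCH_HANDLING = ["supervised", "unsupervised", "model_adjustment"]
CELL_TYPE_ADJUSTMENT = [True, False]


def enumerate_pipelines():
    """Return one dict per combination of processing and analysis choices.

    Assumes a full factorial crossing of the four steps.
    """
    keys = ("normalisation", "filtering", "batch", "cell_type_adjusted")
    combos = product(NORMALISATION, FILTERING, BATCH_HANDLING, CELL_TYPE_ADJUSTMENT)
    return [dict(zip(keys, combo)) for combo in combos]


pipelines = enumerate_pipelines()
print(len(pipelines))  # 3 * 2 * 3 * 2 = 36 combinations
```

Under this assumption, each of the estimated effects would be fitted once per combination, which is what makes a side-by-side comparison of probe counts and rankings possible.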

Results: Findings differed with each of the processing and analysis choices. Importantly, some combinations of data processing choices resulted in a substantial number of spurious 'significant' findings. We recommend greater emphasis on replication and greater use of sensitivity analyses.
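
One way to make the check on randomly assigned groups concrete is to count how many probes pass a multiple-testing threshold when no true effect can exist. The sketch below assumes a Benjamini-Hochberg false discovery rate procedure, which the abstract does not specify. The function name and the p values are illustrative only and are not taken from the study.

```python
def count_bh_significant(p_values, alpha=0.05):
    """Count rejections under the Benjamini-Hochberg step-up procedure.

    Finds the largest rank k such that the k-th smallest p value is at most
    (k / m) * alpha, where m is the number of tests; the k smallest p values
    are then declared significant.
    """
    m = len(p_values)
    largest_k = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank / m * alpha:
            largest_k = rank
    return largest_k


# Illustrative p values only (not data from the study).
example_p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                    0.06, 0.074, 0.205, 0.212, 0.216]
n_significant = count_bh_significant(example_p_values)
print(n_significant)  # 2
```

For a randomly assigned grouping, a count well above zero from such a procedure would suggest that the processing pipeline, rather than any real effect, is generating the 'significant' probes.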

Keywords: Bioinformatics; DNA methylation; Differential methylation; Reproducibility.

Publication types

  • Randomized Controlled Trial
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • DNA Methylation*
  • Family
  • Fetal Blood
  • Humans
  • Infant

Grants and funding

The LIMIT Randomised Trial was funded by an NHMRC grant (ID519240), awarded to Jodie M. Dodd. Funding for the DNA methylation analysis was from the Commission of the European Communities, the 7th Framework Programme, contract FP7-289346-EARLY NUTRITION. Jodie M. Dodd was also supported by NHMRC Practitioner Fellowships (ID627005 and ID1078980) and Investigator Grant (ID1196133). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.