Evaluation of three statistical prediction models for forensic age prediction based on DNA methylation

Forensic Sci Int Genet. 2018 May:34:128-133. doi: 10.1016/j.fsigen.2018.02.008. Epub 2018 Feb 9.

Abstract

DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is that prediction errors become larger with increasing age because of interindividual differences in epigenetic ageing rates. This non-constant variance, or heteroscedasticity, violates an assumption of the commonly used ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression model and a quantile regression model are proposed. Their performance was compared with that of an OLS regression model based on the same dataset. Both alternative models provided age-dependent prediction intervals that account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be the preferred method when the prediction errors are not only heteroscedastic but also not normally distributed. Ultimately, the choice of model should depend on the observed characteristics of the data.
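
As an illustration of how a WLS fit can yield prediction intervals that widen with age, the following minimal sketch may help. It is not the procedure used in the study. It assumes a single hypothetical methylation-derived predictor `x`, simulated data with made-up parameters, and a simple variance model in which the absolute OLS residuals are regressed on `x`. The interval also ignores uncertainty in the fitted coefficients. All function names are illustrative.

```python
import math
import random


def weighted_line_fit(x, y, w):
    """Fit y = a + b*x by weighted least squares; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def simulate(n=500, seed=1):
    """Toy data (assumed, not from the study): error SD grows with x."""
    rng = random.Random(seed)
    x = [rng.uniform(18, 80) for _ in range(n)]
    y = [2 + 0.95 * xi + rng.gauss(0, 0.5 + 0.08 * xi) for xi in x]
    return x, y


def fit_models(x, y):
    """Return the OLS fit, the WLS fit, and the estimated SD function."""
    ones = [1.0] * len(x)
    # Step 1: OLS is WLS with equal weights.
    a0, b0 = weighted_line_fit(x, y, ones)
    resid = [yi - (a0 + b0 * xi) for xi, yi in zip(x, y)]
    ols_sd = math.sqrt(sum(r * r for r in resid) / (len(x) - 2))
    # Step 2: model |residual| as linear in x (an assumed variance model).
    # For normal errors E|r| = sd * sqrt(2/pi), hence the rescaling below.
    c, d = weighted_line_fit(x, [abs(r) for r in resid], ones)
    scale = math.sqrt(math.pi / 2)

    def sd_at(x0):
        # Clamp to a small positive value so the weights stay finite.
        return max(scale * (c + d * x0), 1e-6)

    # Step 3: refit with weights inversely proportional to the variance.
    w = [1.0 / sd_at(xi) ** 2 for xi in x]
    a1, b1 = weighted_line_fit(x, y, w)
    return {"ols": (a0, b0, ols_sd), "wls": (a1, b1), "sd_at": sd_at}


def wls_interval(model, x0, z=1.96):
    """Approximate 95% interval whose width depends on x0."""
    a, b = model["wls"]
    pred = a + b * x0
    half = z * model["sd_at"](x0)
    return pred, pred - half, pred + half


x, y = simulate()
model = fit_models(x, y)
for x0 in (25, 70):
    pred, lo, hi = wls_interval(model, x0)
    print(f"x={x0}: predicted {pred:.1f}, interval [{lo:.1f}, {hi:.1f}]")
```

In this sketch the OLS residual standard deviation (`model["ols"][2]`) gives one interval width for all ages, whereas the WLS interval is narrower for low values of `x` and wider for high values.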

Keywords: DNA methylation; Forensic age prediction; Statistical regression modelling.

MeSH terms

  • Aging / genetics*
  • CpG Islands / genetics
  • DNA Methylation*
  • Epigenomics
  • Forensic Genetics / methods
  • Genetic Markers
  • Humans
  • Models, Statistical*
  • Sequence Analysis, DNA / methods

Substances

  • Genetic Markers