Machine Learning Approximations to Predict Epigenetic Age Acceleration in Stroke Patients

Int J Mol Sci. 2023 Feb 1;24(3):2759. doi: 10.3390/ijms24032759.

Abstract

Age acceleration (Age-A) is a useful measure that can predict a broad range of health outcomes. Estimating it requires measuring DNA methylation levels, and Age-A is known to be influenced by environmental, lifestyle, and vascular risk factors (VRF). The aim of this study is to estimate the contribution of these easily measurable factors to Age-A in patients with cerebrovascular disease (CVD), using different machine learning (ML) approximations, and to identify a more accessible model able to predict Age-A. We studied a CVD cohort of 952 patients with information about VRF, lifestyle habits, and target organ damage. We estimated Age-A using Hannum's epigenetic clock and trained six different models to predict Age-A: a conventional linear regression model, four ML models (elastic net regression (EN), K-nearest neighbors, random forest, and support vector machine models), and one deep learning approximation (a multilayer perceptron (MLP) model). The best-performing models were EN and MLP, although their predictive capability was modest (R² 0.358 and 0.378, respectively). In conclusion, our results support the influence of these factors on Age-A, although the factors were not enough to explain most of its variability.
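To make the reported metric concrete, the following is a minimal sketch of how a coefficient of determination (R²) can be computed, assuming the standard definition 1 - SS_res / SS_tot. The abstract does not state how R² was calculated or on which data split, so the function name `r_squared` and the Age-A values below are hypothetical and purely illustrative; they are not study data.

```python
from statistics import fmean


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumes the standard definition; the abstract does not say
    which variant the authors used.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_y = fmean(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variability of the observed values around their mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("observed values have no variability")
    return 1.0 - ss_res / ss_tot


# Hypothetical Age-A values in years (NOT from the study).
observed = [-4.0, -1.5, 0.0, 2.0, 3.5]
predicted = [-2.0, -1.0, 0.5, 1.0, 1.5]

score = r_squared(observed, predicted)
print(f"R^2 = {score:.3f}")
```

Under this definition, a model that always predicts the mean of the observed values scores 0 and a perfect model scores 1, so values of 0.358 and 0.378 would correspond to roughly 36% and 38% of the variability in Age-A being accounted for by the predictors.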

Keywords: aging; epigenetic clock; machine learning; stroke; vascular risk factors.

MeSH terms

  • Cerebrovascular Disorders*
  • Epigenesis, Genetic
  • Humans
  • Machine Learning
  • Neural Networks, Computer
  • Stroke*