Machine-learning-based models to predict cardiovascular risk using oculomics and clinic variables in KNHANES

BioData Min. 2024 Apr 22;17(1):12. doi: 10.1186/s13040-024-00363-3.

Abstract

Background: Recent researches have found a strong correlation between the triglyceride-glucose (TyG) index or the atherogenic index of plasma (AIP) and cardiovascular disease (CVD) risk. However, there is a lack of research on non-invasive and rapid prediction of cardiovascular risk. We aimed to develop and validate a machine-learning model for predicting cardiovascular risk based on variables encompassing clinical questionnaires and oculomics.

Methods: We collected data from the Korean National Health and Nutrition Examination Survey (KNHANES). The training dataset (80% from the year 2008 to 2011 KNHANES) was used for machine learning model development, with internal validation using the remaining 20%. An external validation dataset from the year 2012 assessed the model's predictive capacity for TyG-index or AIP in new cases. We included 32122 participants in the final dataset. Machine learning models used 25 algorithms were trained on oculomics measurements and clinical questionnaires to predict the range of TyG-index and AIP. The area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, and F1 score were used to evaluate the performance of our machine learning models.

Results: Based on large-scale cohort studies, we determined TyG-index cut-off points at 8.0, 8.75 (upper one-third values), 8.93 (upper one-fourth values), and AIP cut-offs at 0.318, 0.34. Values surpassing these thresholds indicated elevated cardiovascular risk. The best-performing algorithm revealed TyG-index cut-offs at 8.0, 8.75, and 8.93 with internal validation AUCs of 0.812, 0.873, and 0.911, respectively. External validation AUCs were 0.809, 0.863, and 0.901. For AIP at 0.34, internal and external validation achieved similar AUCs of 0.849 and 0.842. Slightly lower performance was seen for the 0.318 cut-off, with AUCs of 0.844 and 0.836. Significant gender-based variations were noted for TyG-index at 8 (male AUC=0.832, female AUC=0.790) and 8.75 (male AUC=0.874, female AUC=0.862) and AIP at 0.318 (male AUC=0.853, female AUC=0.825) and 0.34 (male AUC=0.858, female AUC=0.831). Gender similarity in AUC (male AUC=0.907 versus female AUC=0.906) was observed only when the TyG-index cut-off point equals 8.93.

Conclusion: We have established a simple and effective non-invasive machine learning model that has good clinical value for predicting cardiovascular risk in the general population.

Keywords: Atherogenic index of plasma; Cardiovascular risk; Machine learning; Oculomics; Triglyceride-glucose index.