A machine learning-based approach for low-density lipoprotein cholesterol calculation using age, and lipid parameters

Clin Chim Acta. 2022 Oct 1:535:53-60. doi: 10.1016/j.cca.2022.08.007. Epub 2022 Aug 13.

Abstract

Background: Low-density lipoprotein cholesterol (LDL-C) is a critical biomarker for cardiovascular disease. However, no consensus exists on the best method for estimating LDL-C in Chinese laboratories. This study aimed to develop a machine learning (ML) method for LDL-C estimation.

Methods: An extensive data set of 111,448 samples was randomized into five equal subsets. ML-based equations were developed from age, sex, and lipid parameters using five-fold cross-validation. The trained ML equations were then externally validated in three different data sets, and their performance was compared with that of the Friedewald, Martin/Hopkins, and Sampson equations.
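
For orientation, the sketch below illustrates two ingredients named in the Methods: the published Friedewald and Sampson comparison equations (all inputs in mg/dL) and a random split of sample indices into five folds. The function names, the seed, and the splitting scheme are illustrative assumptions, not the authors' implementation, and the abstract does not specify which ML model was trained. The Martin/Hopkins equation is omitted because it depends on a lookup table of TG-to-VLDL-C factors that is not reproduced here.

```python
import random


def friedewald(tc, hdl, tg):
    """Friedewald estimate of LDL-C; all values in mg/dL."""
    return tc - hdl - tg / 5.0


def sampson(tc, hdl, tg):
    """Sampson estimate of LDL-C; all values in mg/dL."""
    non_hdl = tc - hdl
    return (tc / 0.948 - hdl / 0.971
            - (tg / 8.56 + tg * non_hdl / 2140.0 - tg ** 2 / 16100.0)
            - 9.44)


def five_fold_indices(n_samples, seed=0):
    """Hypothetical helper: shuffle indices and deal them into five folds.

    Fold sizes differ by at most one sample when n_samples is not a
    multiple of five.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[k::5] for k in range(5)]


if __name__ == "__main__":
    # Example lipid panel (illustrative values, not study data).
    print(friedewald(200, 50, 150))
    print(round(sampson(200, 50, 150), 1))
    # Fold sizes for a data set of the size reported in the abstract.
    print([len(fold) for fold in five_fold_indices(111448)])
```

In a cross-validation loop under these assumptions, each fold would serve once as the held-out set while a model is fitted on the other four, and the held-out predictions would be compared against direct LDL-C alongside the two equations above.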

Results: The selected ML equations showed less bias relative to direct LDL-C than the other LDL-C equations in the Chinese population, including in samples with triglycerides (TG) ≥ 400 mg/dL and LDL-C < 40 mg/dL. The performance of the ML equations was also less affected by age. External validation supported the generalizability of the ML equations.

Conclusions: This study highlights the potential of integrating sex, age, and lipid parameters into the ML equations to obtain a more robust and reliable LDL-C calculation.

Keywords: Equation; Lipid; Low-density lipoprotein cholesterol; Machine learning.