Background: Low-density lipoprotein cholesterol (LDL-C) is a critical biomarker for cardiovascular disease. However, no consensus exists on the best method for estimating LDL-C in Chinese laboratories. This study aimed to develop a machine learning (ML) method for LDL-C estimation.
Methods: A large data set of 111,448 samples was randomly divided into five equal subsets. ML-based equations were developed from age, sex, and lipid parameters using five-fold cross-validation. The trained ML equations were then externally validated in three different data sets, and their performance was compared with that of the Friedewald, Martin/Hopkins, and Sampson equations.
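As a rough illustration of the Methods, the sketch below shows one way to partition sample indices into five folds, together with two of the comparator equations as they are commonly published (Friedewald, and the Sampson/NIH equation), with all concentrations in mg/dL. The function names, the shuffling scheme, and the seed are assumptions made for illustration and are not taken from the study. The Martin/Hopkins equation is omitted because it depends on a published lookup table of TG:VLDL-C ratios.

```python
import random


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and deal them into n_folds near-equal folds.

    Hypothetical helper: the study's actual randomization is not described
    in the abstract.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding deals indices round-robin, so fold sizes differ by at most 1.
    return [indices[i::n_folds] for i in range(n_folds)]


def friedewald_ldl(tc, hdl, tg):
    """Friedewald estimate (mg/dL): LDL-C = TC - HDL-C - TG / 5."""
    return tc - hdl - tg / 5.0


def sampson_ldl(tc, hdl, tg):
    """Sampson (NIH) estimate (mg/dL), using the commonly cited coefficients."""
    non_hdl = tc - hdl
    return (
        tc / 0.948
        - hdl / 0.971
        - (tg / 8.56 + tg * non_hdl / 2140.0 - tg ** 2 / 16100.0)
        - 9.44
    )
```

In a cross-validation loop, each fold would serve once as the held-out set while the remaining four are used for training. The two equations take total cholesterol, HDL-C, and TG only, whereas the ML equations described here also use age and sex.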
Results: The selected ML equations showed less bias relative to directly measured LDL-C than the other LDL-C equations in the Chinese population, including in samples with triglycerides (TG) ≥ 400 mg/dL and LDL-C < 40 mg/dL. The performance of the ML equations was also less affected by age. External validation supported the generalizability of the ML equations.
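The abstract does not define how bias was computed. A minimal sketch, assuming bias means the mean signed difference between estimated and directly measured LDL-C (mg/dL), optionally restricted to a stratum such as TG ≥ 400 mg/dL, might look like the following. The function names and the stratum argument are hypothetical.

```python
from statistics import mean


def mean_bias(estimated, direct):
    """Mean signed difference (estimated - direct), in the inputs' units.

    Assumed definition of bias; the study may have used a different metric.
    """
    if len(estimated) != len(direct) or not direct:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(e - d for e, d in zip(estimated, direct))


def mean_bias_in_stratum(estimated, direct, keep):
    """Mean bias over the samples whose entry in `keep` is truthy.

    `keep` is a per-sample flag sequence, e.g. [tg >= 400 for tg in tg_values].
    """
    pairs = [(e, d) for e, d, k in zip(estimated, direct, keep) if k]
    if not pairs:
        raise ValueError("stratum is empty")
    return mean(e - d for e, d in pairs)
```

A value closer to zero indicates closer average agreement with the direct assay. A signed mean can hide offsetting errors, so absolute error is often reported alongside it.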
Conclusions: This study highlights the potential of integrating sex, age, and lipid parameters into the ML equations to obtain a more robust and reliable LDL-C calculation.
Keywords: Equation; Lipid; Low-density lipoprotein cholesterol; Machine learning.
Copyright © 2022. Published by Elsevier B.V.