A machine learning-based approach for low-density lipoprotein cholesterol calculation using age, and lipid parameters

Gaowei Fan; Shunli Zhang; Qisheng Wu; Yan Song; Anqi Jia; Di Li; Yuhong Yue; Qingtao Wang

doi:10.1016/j.cca.2022.08.007

Clin Chim Acta. 2022 Oct 1:535:53-60. doi: 10.1016/j.cca.2022.08.007. Epub 2022 Aug 13.

Authors

Gaowei Fan¹, Shunli Zhang¹, Qisheng Wu², Yan Song³, Anqi Jia¹, Di Li⁴, Yuhong Yue¹, Qingtao Wang⁵

Affiliations

¹ Department of Clinical Laboratory, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, China.
² Division of Pathology & Laboratory Medicine, Lu Daopei Hospital, Beijing, China.
³ Department of Clinical Laboratory, Beijing Shangdi Hospital, Beijing, China.
⁴ Laboratory of Clinical Microbiology and Infectious Diseases, Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital, Beijing, China.
⁵ Department of Clinical Laboratory, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, China. Electronic address: wqt36@163.com.

PMID: 35970405
DOI: 10.1016/j.cca.2022.08.007

Abstract

Background: Low-density lipoprotein cholesterol (LDL-C) is a critical biomarker for cardiovascular disease. However, no consensus exists on the best method for estimating LDL-C in Chinese laboratories. This study aimed to develop a machine learning (ML) method for LDL-C estimation.

Methods: An extensive data set of 111,448 samples was randomized into five equal subsets. ML-based equations were developed using age, sex, and lipid parameters based on five-fold cross-validation. The trained ML equations were externally validated in three different data sets. The performance of the ML equations was compared with that of the Friedewald, Martin/Hopkins, and Sampson equations.
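
For orientation, the two closed-form comparator equations and a five-way random split can be sketched as follows. This is a minimal illustration and not the authors' code: the Friedewald and Sampson formulas are given in their commonly published mg/dL forms, the Martin/Hopkins equation is omitted because it depends on a lookup table of TG:VLDL-C ratios, and the function names, the seed, and the interleaved fold assignment are assumptions made here for illustration.

```python
import random


def friedewald_ldl(tc, hdl, tg):
    """Friedewald estimate of LDL-C; all inputs and the result are in mg/dL."""
    return tc - hdl - tg / 5.0


def sampson_ldl(tc, hdl, tg):
    """Sampson (NIH) estimate of LDL-C; all inputs and the result are in mg/dL."""
    non_hdl = tc - hdl
    return (
        tc / 0.948
        - hdl / 0.971
        - (tg / 8.56 + tg * non_hdl / 2140.0 - tg ** 2 / 16100.0)
        - 9.44
    )


def five_fold_indices(n_samples, seed=0):
    """Shuffle sample indices and deal them into five near-equal folds.

    The seed and the dealing scheme are illustrative assumptions; the
    abstract does not describe how the randomization was performed.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::5] for i in range(5)]


def train_test_splits(folds):
    """Yield (train, test) index lists, holding out each fold once."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

In such a setup, each held-out fold would be scored against the direct LDL-C measurement, both for the fitted ML model and for the closed-form equations above.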

Results: The selected ML equations showed less bias relative to direct LDL-C than the other LDL-C equations in the Chinese population, including those with triglycerides (TG) ≥ 400 mg/dL and LDL-C < 40 mg/dL. The performance of the ML equations was also less affected by age. External validation supported the generalizability of the ML equations.
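
A stratified bias comparison of this kind could be computed along the following lines. This is a sketch under stated assumptions: the abstract does not define its bias metric, so bias is taken here as the mean of (estimated minus direct) LDL-C in mg/dL, and the record keys `tg`, `estimated`, and `direct` as well as the stratum labels are hypothetical names chosen for illustration.

```python
from statistics import mean


def mean_bias(estimated, direct):
    """Mean signed difference (estimated - direct), in mg/dL."""
    return mean(e - d for e, d in zip(estimated, direct))


def bias_by_tg_stratum(records, tg_cutoff=400.0):
    """Group records by TG relative to a cutoff and report mean bias per group.

    Each record is a dict with hypothetical keys 'tg', 'estimated', 'direct'.
    Empty strata are left out of the result.
    """
    strata = {"tg_below_cutoff": [], "tg_at_or_above_cutoff": []}
    for rec in records:
        if rec["tg"] >= tg_cutoff:
            strata["tg_at_or_above_cutoff"].append(rec)
        else:
            strata["tg_below_cutoff"].append(rec)
    return {
        label: mean_bias(
            [r["estimated"] for r in group],
            [r["direct"] for r in group],
        )
        for label, group in strata.items()
        if group
    }
```

The same grouping could be repeated with an LDL-C < 40 mg/dL cutoff or with age bands to mirror the subgroups named in the results.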

Conclusions: This study highlights the potential of integrating sex, age, and lipid parameters into the ML equations to obtain a more robust and reliable LDL-C calculation.

Keywords: Equation; Lipid; Low-density lipoprotein cholesterol; Machine learning.