Machine-Learning Algorithm for Predicting Fatty Liver Disease in a Taiwanese Population

Yang-Yuan Chen; Chun-Yu Lin; Hsu-Heng Yen; Pei-Yuan Su; Ya-Huei Zeng; Siou-Ping Huang; I-Ling Liu

doi:10.3390/jpm12071026

J Pers Med. 2022 Jun 23;12(7):1026. doi: 10.3390/jpm12071026.

Authors

Yang-Yuan Chen¹ ², Chun-Yu Lin³ ⁴, Hsu-Heng Yen¹ ⁵ ⁶ ⁷ ⁸, Pei-Yuan Su¹, Ya-Huei Zeng¹, Siou-Ping Huang¹, I-Ling Liu¹

Affiliations

¹ Department of Internal Medicine, Division of Gastroenterology, Changhua Christian Hospital, Changhua 500, Taiwan.
² Department of Hospitality Management, MingDao University, Changhua 500, Taiwan.
³ Department of Family Medicine, Asia University Hospital, Taichung 400, Taiwan.
⁴ Chun-Shen Clinic, Nantou 540, Taiwan.
⁵ General Education Center, Chienkuo Technology University, Changhua 500, Taiwan.
⁶ Department of Electrical Engineering, Chung Yuan Christian University, Taoyuan 320, Taiwan.
⁷ Artificial Intelligence Development Center, Changhua Christian Hospital, Changhua 500, Taiwan.
⁸ College of Medicine, National Chung Hsing University, Taichung 400, Taiwan.

Abstract

The rising incidence of fatty liver disease (FLD) poses a health challenge, and FLD is expected to become the leading global cause of liver-related morbidity and mortality in the near future. Early case identification is crucial for disease intervention. A retrospective cross-sectional study was performed on 31,930 Taiwanese subjects (25,544 in the training set and 6386 in the testing set) who had received health check-ups and abdominal ultrasounds at Changhua Christian Hospital from January 2009 to January 2019. Clinical and laboratory factors were analyzed with different machine-learning algorithms, and the performance of these algorithms was compared with that of the fatty liver index (FLI). In total, 6658/25,544 (26.1%) and 1647/6386 (25.8%) subjects had moderate-to-severe liver disease in the training and testing sets, respectively. Five machine-learning models were examined and demonstrated exemplary performance in predicting FLD. Among these models, the xgBoost model had the highest area under the receiver operating characteristic curve (AUROC) (0.882), accuracy (0.833), F1 score (0.829), sensitivity (0.833), and specificity (0.683) compared with the neural network, logistic regression, random forest, and support vector machine models. The xgBoost, neural network, and logistic regression models had a significantly higher AUROC than the FLI. Body mass index was the most important feature for predicting FLD according to the feature ranking scores. The xgBoost model had the best overall prediction ability for diagnosing FLD in our study. Machine-learning algorithms provide considerable benefits for screening candidates with FLD.
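For readers less familiar with the evaluation measures named above, the following is a minimal sketch of how sensitivity, specificity, accuracy, and the F1 score are conventionally derived from a binary confusion matrix. The class and function names (`ConfusionCounts`, `binary_metrics`) and the counts are hypothetical and chosen only for illustration. They are not data from this study, and the abstract does not state which averaging convention the authors used for their reported values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier (hypothetical container)."""
    tp: int  # FLD present, predicted present
    fp: int  # FLD absent, predicted present
    tn: int  # FLD absent, predicted absent
    fn: int  # FLD present, predicted absent


def binary_metrics(c: ConfusionCounts) -> dict:
    """Standard textbook definitions for the positive class."""
    sensitivity = c.tp / (c.tp + c.fn)   # true-positive rate (recall)
    specificity = c.tn / (c.tn + c.fp)   # true-negative rate
    precision = c.tp / (c.tp + c.fp)     # positive predictive value
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Illustrative counts only; NOT taken from the study.
example = ConfusionCounts(tp=80, fp=30, tn=170, fn=20)
metrics = binary_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUROC is not covered by this sketch because it is computed from the ranking of predicted probabilities across all thresholds rather than from a single confusion matrix.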

Keywords: fatty liver disease; machine learning; predicting.

Grants and funding

109-CCH-IRP-008 and 111-CCH-IRP-011/Changhua Christian Hospital