Prediction of hepatocellular carcinoma risk in patients with type-2 diabetes using supervised machine learning classification model

Heliyon. 2022 Sep 29;8(10):e10772. doi: 10.1016/j.heliyon.2022.e10772. eCollection 2022 Oct.

Abstract

Background: Hepatocellular carcinoma (HCC) among type-2 diabetes (T2D) patients is an increasing burden on diabetes management. This study aims to develop and select the best machine learning (ML) classification model for predicting HCC in T2D patients, to support early HCC detection.

Methods: A case-control study was conducted utilising computerised medical records in two hepatobiliary centres. The predictors were chosen using multiple logistic regression. IBM SPSS Modeler® was used to assess the discriminative performance of support vector machine (SVM), logistic regression (LR), artificial neural network (ANN), chi-square automatic interaction detection (CHAID), and their ensembles.
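As an illustration of what "discriminative performance" means here, the following is a minimal sketch of how accuracy and the area under the ROC curve (AUC) can be computed from a classifier's outputs. The study itself used IBM SPSS Modeler®, not this code; the function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration only.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predicted labels that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration (1 = HCC case, 0 = control).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
# Assumed decision threshold of 0.5; the abstract does not state one.
y_pred = [int(s >= 0.5) for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUC      = {roc_auc(y_true, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason both are commonly reported together.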

Results: Subjects (N = 424) were split into 60% training (n = 248) and 40% testing (n = 176) groups. The independent predictors identified were race, viral hepatitis, abdominal pain/discomfort, unintentional weight loss, statins, alcohol consumption, non-alcoholic fatty liver, platelet count <150 × 10³/μL, alkaline phosphatase >129 IU/L, and alanine transaminase ≥25 IU/L. The performances of all models differed significantly (Cochran's Q-test, p = 0.001), but the ensemble and SVM models did not differ significantly from each other (McNemar test, p = 0.687). The SVM model was selected as the best model because of its simplicity, high accuracy (85.28%), and high AUC (0.914). A web-based application was developed using the best model's algorithm for HCC prediction.
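The pairwise comparison reported above can be sketched as follows. McNemar's test compares two classifiers evaluated on the same subjects by looking only at the discordant pairs, that is, subjects that one model classified correctly and the other did not. The abstract does not say whether the exact or the chi-square variant was used; this sketch assumes the exact (binomial) variant, and the function names and example counts are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_counts(
    y_true: Sequence[int], pred_a: Sequence[int], pred_b: Sequence[int]
) -> Tuple[int, int]:
    """Count subjects on which exactly one of the two models is correct."""
    b = sum(a == t and p != t for t, a, p in zip(y_true, pred_a, pred_b))
    c = sum(a != t and p == t for t, a, p in zip(y_true, pred_a, pred_b))
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: model A correct, model B incorrect
    c: model A incorrect, model B correct
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, not taken from the study.
print(f"p = {mcnemar_exact(3, 5):.4f}")
```

A large p-value, as in the reported comparison (p = 0.687), means the two models' error patterns are not distinguishable on the test set, which supports preferring the simpler of the two.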

Conclusions: If further validation studies confirm these results, application of the SVM model could augment early HCC detection in T2D patients.

Keywords: Diabetes; Hepatocellular carcinoma; Machine learning; Risk prediction; Support vector machine.