Using blood routine indicators to establish a machine learning model for predicting liver fibrosis in patients with Schistosoma japonicum

Sci Rep. 2024 May 20;14(1):11485. doi: 10.1038/s41598-024-62521-1.

Abstract

This study used the basic information and routine blood test results of schistosomiasis patients to establish a machine learning model for predicting liver fibrosis. We collected the medical records of patients with Schistosoma japonicum infection admitted to a hospital in China from June 2019 to June 2022. Key variables were screened, and six different machine learning algorithms were used to build prediction models. The models were compared on AUC, specificity, sensitivity and other indicators, and the optimal model was selected for further modeling. The model was interpreted using the SHAP package. A total of 1049 patients' medical records were collected, and 10 key variables were selected for modeling by the lasso method: red cell distribution width-standard deviation (RDW-SD), mean corpuscular hemoglobin concentration (MCHC), mean corpuscular volume (MCV), hematocrit (HCT), red blood cells, eosinophils, monocytes, lymphocytes, neutrophils, and age. Among the six algorithms, LightGBM performed best, with AUCs of 1 and 0.818 in the training set and validation set, respectively. This study established a machine learning model for predicting liver fibrosis in patients with Schistosoma japonicum. The model could help improve early diagnosis and support early intervention for schistosomiasis patients with liver fibrosis.
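The comparison indicators named in the abstract (AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 threshold, and the toy labels and scores are assumptions invented for illustration and are not data from the study. AUC is computed here as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = fibrosis, 0 = no fibrosis; scores are made-up model outputs.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

toy_auc = auc(labels, scores)
toy_sens, toy_spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(toy_auc, toy_sens, toy_spec)
```

Computing the same three indicators for each candidate model on a held-out validation set gives a like-for-like basis for choosing among algorithms. The pairwise AUC loop is quadratic in the number of cases and is meant for clarity rather than speed.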

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • Animals
  • China
  • Erythrocyte Indices
  • Female
  • Humans
  • Liver Cirrhosis* / blood
  • Liver Cirrhosis* / diagnosis
  • Liver Cirrhosis* / parasitology
  • Liver Cirrhosis* / pathology
  • Machine Learning*
  • Male
  • Middle Aged
  • Schistosoma japonicum*
  • Schistosomiasis japonica* / blood
  • Schistosomiasis japonica* / diagnosis