Application of Data Mining Technology in the Screening for Gallbladder Stones: A Cross-Sectional Retrospective Study of Chinese Adults

Yonsei Med J. 2024 Apr;65(4):210-216. doi: 10.3349/ymj.2023.0246.

Abstract

Purpose: This study aimed to use data mining methods to establish a simple and reliable predictive model, based on risk factors related to gallbladder stones (GS), to assist in diagnosing GS and to reduce medical costs.

Materials and methods: This was a retrospective cross-sectional study. A total of 4215 participants underwent annual health examinations between January 2019 and December 2019 at the Physical Examination Center of Shengjing Hospital Affiliated to China Medical University. After rigorous data screening, the records of 2105 examinees were included to build classification models using the J48, multilayer perceptron (MLP), Bayes Net, and Naïve Bayes algorithms. Ten-fold cross-validation was used to validate the recognition models and determine the best classification algorithm for GS.
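As an illustration of the ten-fold cross-validation step, the following is a minimal sketch of how records might be divided into ten folds. It is not the authors' code. The stratified (class-balanced) split, the function name `stratified_k_fold`, the random seed, and the label counts are all assumptions made for the example; the abstract does not state how the folds were formed.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs, keeping class proportions similar per fold."""
    rng = random.Random(seed)

    # Group record indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold receives a near-equal share of each class.
        for pos, idx in enumerate(indices):
            folds[(pos + offset) % k].append(idx)
        offset += len(indices)

    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, test))
    return splits


# Hypothetical labels (1 = GS present, 0 = GS absent); the counts are illustrative only.
labels = [1] * 30 + [0] * 70
splits = stratified_k_fold(labels, k=10)
```

Each record appears in exactly one test fold, so every model is evaluated once on every record while being trained only on the other nine folds.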

Results: The performance of these models was evaluated using metrics of accuracy, precision, recall, F-measure, and area under the receiver operating characteristic curve. Comparison of the F-measure for each algorithm revealed that the F-measure values for MLP and J48 (0.867 and 0.858, respectively) were not statistically significantly different (p>0.05), although they were significantly higher than the F-measure values for Bayes Net and Naïve Bayes (0.824 and 0.831, respectively; p<0.05).
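For readers unfamiliar with these metrics, the sketch below shows how accuracy, precision, recall, and F-measure follow from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the study; the abstract also does not say whether the reported F-measures are per-class or averaged across classes, so this shows only the single-class form.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Illustrative counts only (not study data).
metrics = classification_metrics(tp=80, fp=20, fn=20, tn=80)
```

Because the F-measure is a harmonic mean, it drops sharply when either precision or recall is low, which makes it a stricter summary than accuracy when the classes are imbalanced.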

Conclusion: The results of this study showed that the MLP and J48 algorithms are effective at screening individuals for the risk of GS. The key attributes identified through data mining can further promote the prevention of GS through targeted community intervention, improve the outcome of GS, and reduce the burden on the medical system.

Keywords: Gallbladder stones; data mining; decision tree; logistic regression model; neural network model.

MeSH terms

  • Adult
  • Algorithms*
  • Bayes Theorem
  • Cross-Sectional Studies
  • Data Mining / methods
  • Gallbladder*
  • Humans
  • Retrospective Studies