A hybrid stacked ensemble and Kernel SHAP-based model for intelligent cardiotocography classification and interpretability

BMC Med Inform Decis Mak. 2023 Nov 28;23(1):273. doi: 10.1186/s12911-023-02378-y.

Abstract

Background: Intelligent cardiotocography (CTG) classification can assist obstetricians in evaluating fetal health. However, high classification performance is often achieved with complex machine learning (ML) models, which raises interpretability concerns. This trade-off between accuracy and interpretability has made it difficult for most existing ML-based CTG classification models to gain acceptance in prenatal clinical practice.

Methods: To improve both CTG classification performance and prediction interpretability, a hybrid model was proposed that combines a stacked ensemble strategy using mixed features with the Kernel SHapley Additive exPlanations (SHAP) framework. First, the stacked ensemble classifier was built with support vector machines (SVM), extreme gradient boosting (XGB), and random forests (RF) as base learners and a backpropagation (BP) neural network as the meta-learner; the meta-learner's input combined the CTG features with the per-class probabilities output by the base learners. Next, a public and a private CTG dataset were used to verify the discriminative performance. Finally, Kernel SHAP was applied to estimate the contribution of each feature and its relationship to the fetal states.
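
The "mixed" meta-learner input described above can be illustrated with a minimal sketch. It only shows how one sample's CTG features might be concatenated with the class probabilities from the three base learners. The function name `build_meta_input`, the four feature values, the three-class setup, and the probability vectors are all hypothetical placeholders, not values or code from the study.

```python
from typing import List, Sequence


def build_meta_input(ctg_features: Sequence[float],
                     base_probas: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate raw CTG features with each base learner's class probabilities.

    Hypothetical helper: the abstract states that the meta-learner input mixes
    both sources, but does not specify ordering or preprocessing.
    """
    mixed = list(ctg_features)
    for probas in base_probas:
        # Each base learner is assumed to output a probability distribution.
        if abs(sum(probas) - 1.0) > 1e-6:
            raise ValueError("class probabilities must sum to 1")
        mixed.extend(probas)
    return mixed


# Hypothetical sample: 4 CTG features and 3 fetal-state classes (assumed).
features = [0.2, 0.5, 0.1, 0.7]
svm_p = [0.7, 0.2, 0.1]   # placeholder SVM class probabilities
xgb_p = [0.6, 0.3, 0.1]   # placeholder XGB class probabilities
rf_p = [0.8, 0.1, 0.1]    # placeholder RF class probabilities

meta_input = build_meta_input(features, [svm_p, xgb_p, rf_p])
print(len(meta_input))  # 4 features + 3 learners * 3 classes = 13
```

In stacking generally, the base-learner probabilities used to train the meta-learner are often produced by out-of-fold prediction to limit leakage; whether the study did this is not stated in the abstract.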

Results: Under 10-fold cross-validation, the accuracy and average F1 score were 0.9539 and 0.9249 on the public dataset and 0.9201 and 0.8926 on the private dataset, respectively. For interpretability, the explanations indicated that accelerations (AC) and the percentage of time with abnormal short-term variability (ASTV) were the key determinants. Specifically, as ASTV increased, the probability of an abnormal state rose and that of the normal state fell. In addition, the likelihood of the normal state rose as AC increased.
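
To make the notion of a per-feature contribution value concrete, the sketch below computes exact Shapley values for a toy two-feature scoring function. Kernel SHAP approximates these values by weighted regression over sampled feature coalitions and averages over a background dataset; this sketch instead enumerates every coalition against a single background point, which is only feasible for a handful of features. The function `toy_abnormal_score`, its coefficients, and the sample and background values are hypothetical and are not the study's model or data.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values of predict(x) relative to predict(background)."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take the sample's values;
        # all other features keep the background values.
        z = [x[i] if i in coalition else background[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_abnormal_score(z):
    """Hypothetical 'abnormal' score: rises with ASTV, falls with AC."""
    astv, ac = z
    return 0.4 * astv - 0.3 * ac + 0.2 * astv * (1.0 - ac)


sample = [0.8, 0.1]       # assumed scaled values: high ASTV, low AC
background = [0.5, 0.5]   # assumed reference point

phi = shapley_values(toy_abnormal_score, sample, background)
print(dict(zip(["ASTV", "AC"], phi)))
```

In this toy case both contributions are positive: high ASTV and low AC each push the hypothetical abnormal score above the baseline, and the two contributions sum to the difference between the sample's score and the background score.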

Conclusions: The proposed model has high classification performance and reasonable interpretability for intelligent fetal monitoring.

Keywords: Cardiotocography; Fetal monitoring; Kernel SHAP; Machine learning; Stacked ensemble.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cardiotocography* / methods
  • Cluster Analysis
  • Female
  • Humans
  • Machine Learning*
  • Pregnancy
  • Probability
  • Support Vector Machine