An accurate and explainable ensemble learning method for carotid plaque prediction in an asymptomatic population

Comput Methods Programs Biomed. 2022 Jun:221:106842. doi: 10.1016/j.cmpb.2022.106842. Epub 2022 Apr 28.

Abstract

Background and objective: The identification of carotid plaque, one of the most crucial tasks in stroke screening, is of great significance in assessing subclinical atherosclerosis and preventing the onset of stroke. However, traditional ultrasound examination is neither widely used nor cost-effective for asymptomatic people, particularly low-income individuals in rural areas. Thus, it is necessary to develop an accurate and explainable model for early identification of plaque risk that can help in the primary prevention of stroke.

Methods: We developed an ensemble learning method to predict the occurrence of carotid plaques. Model performance was evaluated using ten-fold cross-validation on a dataset of 1440 subjects (50% with plaques and 50% without). Four machine learning methods (extreme gradient boosting (XGBoost), gradient boosting decision tree, random forest, and support vector machine) were evaluated. Subsequently, the interpretability of the XGBoost model, which provided the best performance, was analyzed from three aspects: feature importance, the effect of each feature on the model's predictions, and the effect of each feature on the prediction for a specific subject.
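
The evaluation protocol described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the folds were stratified by plaque status, which the abstract does not state, and the function name `stratified_k_fold` and the synthetic labels are hypothetical. Only the dataset size (1440), the 50/50 class balance, and the fold count (10) come from the abstract.

```python
import random


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that keep the class balance in every fold.

    Assumption: folds are stratified by label. The abstract only states
    that ten-fold cross-validation was used.
    """
    rng = random.Random(seed)
    # Group subject indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Shuffle within each class, then deal indices round-robin into k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Synthetic labels matching the stated design: 1440 subjects, half with plaque.
labels = [1] * 720 + [0] * 720
splits = list(stratified_k_fold(labels, k=10))
```

Under these assumptions each test fold holds 144 subjects (72 per class) and each training set holds the remaining 1296. Each candidate classifier would be fit on `train_idx`, scored on `test_idx`, and its metrics averaged over the ten folds.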

Results: The XGBoost algorithm provided the best performance in carotid plaque prediction (sensitivity: 0.8678, specificity: 0.8592, accuracy: 0.8632, F1 score: 0.8621, area under the curve: 0.8635) and retained excellent performance when data were missing. Further, interpretability analysis showed that the decisions of the XGBoost model were highly congruent with clinical knowledge.
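
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, accuracy, and F1 score follow from confusion-matrix counts using their standard definitions. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper. The area under the curve is omitted because it requires predicted scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes are present
    and at least one positive prediction was made).
    """
    sensitivity = tp / (tp + fn)  # share of plaque subjects correctly flagged
    specificity = tn / (tn + fp)  # share of plaque-free subjects correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and sensitivity (recall).
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for one balanced test fold of 144 subjects (not from the paper).
metrics = classification_metrics(tp=63, fp=11, tn=61, fn=9)
```

In a cross-validated study, metrics like these would typically be computed per fold and then averaged, which is one way the reported values could have been obtained.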

Conclusion: The model's results are superior to those of state-of-the-art methods. Thus, the model is a promising carotid plaque prediction tool that could be used in the primary prevention of stroke.

Keywords: Carotid plaque; Explainable model; Machine learning; Prediction; Primary prevention of stroke.

MeSH terms

  • Carotid Arteries / diagnostic imaging
  • Humans
  • Machine Learning
  • Plaque, Atherosclerotic* / diagnostic imaging
  • Stroke* / diagnostic imaging
  • Support Vector Machine