Predicting the conversion from clinically isolated syndrome to multiple sclerosis: An explainable machine learning approach

Mult Scler Relat Disord. 2024 Jun:86:105614. doi: 10.1016/j.msard.2024.105614. Epub 2024 Apr 9.

Abstract

Introduction: Predicting the conversion of clinically isolated syndrome (CIS) to clinically definite multiple sclerosis (CDMS) is critical for personalizing treatment planning and for patient benefit. The aim of this study is to develop an explainable machine learning (ML) model for predicting this conversion based on demographic, clinical, and imaging data.

Method: An Extreme Gradient Boosting (XGBoost) model was applied to a public dataset of 273 Mexican mestizo CIS patients with 10-year follow-up. The data were divided into a training set for cross-validation and feature selection, and a holdout test set for final testing. Feature importance was determined using the SHapley Additive exPlanations (SHAP) library. Then, two experiments were conducted to optimize the model's performance: one selectively adding variables, and one selecting the most contributive variables for the final model.
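
The data partitioning described in the Method can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the abstract does not state the split ratio, the number of folds, or whether the split was stratified, so the 80/20 ratio, the 5 folds, the stratification, and the class counts in the synthetic labels are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test while preserving the class ratio.

    test_fraction=0.2 is an assumption; the abstract gives no ratio.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)


def k_fold(indices, k=5, seed=0):
    """Yield (fit, validate) index lists for k-fold cross-validation.

    k=5 is an assumption; the abstract does not report the fold count.
    """
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, validate


# Synthetic labels: 273 patients as in the abstract, but the split between
# converters (1) and non-converters (0) is made up for illustration.
labels = [1] * 120 + [0] * 153
train_idx, test_idx = stratified_holdout(labels)
folds = list(k_fold(train_idx))
```

The holdout indices are set aside before any cross-validation or feature selection, so that the final test estimate is not influenced by choices made on the training set.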

Results: Nine variables were significant: age, gender, schooling, motor symptoms, infratentorial and periventricular lesions at imaging, oligoclonal bands in cerebrospinal fluid, lesion type, and symptom type. The model achieved an accuracy of 83.6%, AUC of 91.8%, sensitivity of 83.9%, and specificity of 83.4% in cross-validation. In the final testing, the model achieved an accuracy of 78.3%, AUC of 85.8%, sensitivity of 75%, and specificity of 81.1%. Finally, a web-based demo of the model was created for testing purposes.
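
For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and AUC are conventionally computed from true labels and predicted risk scores. It is an illustration only: the labels and scores are toy values, not study data, and the 0.5 decision threshold is an assumption, since the abstract does not state the threshold used.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels, with 1 = converted to CDMS."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only; not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

result = classification_metrics(y_true, y_pred)
result["auc"] = auc(y_true, scores)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why it can differ noticeably from the other three.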

Conclusion: The model, focusing on feature selection and interpretability, effectively stratifies risk for treatment decisions and disability prevention in MS patients. It provides a numerical risk estimate for CDMS conversion, enhancing transparency in clinical decision-making and aiding in patient care.

Keywords: Clinically isolated syndrome; Explainability; Machine learning; Model; Multiple sclerosis; Prediction; XGBoost.

MeSH terms

  • Adult
  • Demyelinating Diseases* / diagnosis
  • Demyelinating Diseases* / diagnostic imaging
  • Disease Progression*
  • Female
  • Follow-Up Studies
  • Humans
  • Machine Learning*
  • Magnetic Resonance Imaging
  • Male
  • Mexico
  • Middle Aged
  • Multiple Sclerosis* / diagnosis