Comparing traditional modeling approaches versus predictive analytics methods for predicting multiple sclerosis relapse

K Walsh; R Shah; J K Armstrong; E S Moore; B J Oliver

doi:10.1016/j.msard.2021.103330

Mult Scler Relat Disord. 2022 Jan:57:103330. doi: 10.1016/j.msard.2021.103330. Epub 2021 Oct 17.

Authors

K Walsh¹, R Shah², J K Armstrong², E S Moore³, B J Oliver⁴

Affiliations

¹ Jefferson College of Population Health, Philadelphia, PA, United States. Electronic address: karen.walsh@jefferson.edu.
² Jefferson College of Population Health, Philadelphia, PA, United States.
³ Department of Interprofessional Health & Aging Studies, University of Indianapolis, IN, United States.
⁴ Department of Community & Family Medicine, Geisel School of Medicine at Dartmouth and Dartmouth-Hitchcock Health, Hanover, NH, United States; The Dartmouth Institute for Health Policy & Clinical Practice, Geisel School of Medicine at Dartmouth, Hanover, NH, United States; Department of Psychiatry, Geisel School of Medicine at Dartmouth, Hanover, NH, United States.

PMID: 35158444
DOI: 10.1016/j.msard.2021.103330

Abstract

Objective: This study compared traditional statistical methods to different predictive analytics methods on the endpoint of multiple sclerosis (MS) relapse.

Study setting: This is a secondary analysis of data from four MS centers, restricted to the third year of the parent study (July 2019 to June 2020).

Study design: The parent study was a two-part, 3-year prospective clinical quality improvement study that started in July 2017 and concluded in June 2020, and used a stepped-wedge randomized design. Binary logistic regression was compared with other machine learning models, specifically ridge, least absolute shrinkage and selection operator (LASSO), and random forest.
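A comparison of this kind can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses a synthetic dataset in place of the electronic health record data, and scores the models by ROC AUC, which is an assumption (the abstract does not define its performance measures). The function name `compare_models` and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(random_state=0):
    """Fit four classifiers on synthetic data and return held-out ROC AUC per model."""
    # Synthetic stand-in for a binary relapse outcome (assumption: not study data).
    X, y = make_classification(
        n_samples=600, n_features=12, n_informative=5,
        weights=[0.8, 0.2], random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state
    )

    models = {
        # Very weak L2 penalty (large C) to approximate unpenalized logistic regression.
        "logistic": make_pipeline(
            StandardScaler(), LogisticRegression(C=1e6, max_iter=1000)
        ),
        # Ridge-type model: logistic regression with an L2 penalty.
        "ridge": make_pipeline(
            StandardScaler(), LogisticRegression(C=1.0, max_iter=1000)
        ),
        # LASSO-type model: logistic regression with an L1 penalty.
        "lasso": make_pipeline(
            StandardScaler(),
            LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=200, random_state=random_state
        ),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Probability of the positive class, scored on the held-out split.
        scores = model.predict_proba(X_test)[:, 1]
        results[name] = roc_auc_score(y_test, scores)
    return results


if __name__ == "__main__":
    for name, auc in compare_models().items():
        print(f"{name}: AUC = {auc:.3f}")
```

Standardizing the predictors before the penalized models matters because ridge and LASSO penalties depend on the scale of each coefficient; the random forest is left unscaled because tree splits are scale-invariant.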

Data collection: This study used electronic health record data extracted at the individual level and 'rolled up' to the system and population level. Participants were included if they were aged 18 years or older, had MS, presented to any of the four centers, and entered the study in any quarter. Cases with missing or incorrectly entered data and individuals who declined to participate were excluded.

Principal findings: When comparing relapse indices across models, random forest significantly outperformed logistic regression and the other machine learning algorithms (Δperf_A = 27.1%, Δperf_M = 27.5%). However, for Δperf_F, logistic regression and random forest performed about the same. Ridge and LASSO outperformed logistic regression (Δperf_M1 = 0.9%, Δperf_M2 = 9.4%, Δperf_F2 = 25.8%, respectively).

Conclusion: Multiple sclerosis is a complex, costly, chronic ("3C") condition that currently has no cure. In a condition like MS, which has an unpredictable course, predictive analytics could help health systems learn better and faster, improve more effectively, and predict rather than react to the emerging health needs of people with MS. Comparing how well various models predict relapse within a predictive analytics framework can potentially change how we manage MS care.

Keywords: Model comparison; Multiple sclerosis; Predictive analytics; Quality improvement; Relapse; Statistical modeling.

Publication types

Randomized Controlled Trial

MeSH terms

Adolescent
Humans
Logistic Models
Machine Learning
Multiple Sclerosis* / diagnosis
Multiple Sclerosis* / epidemiology
Multiple Sclerosis* / therapy
Prospective Studies
Recurrence