Machine learning models with time-series clinical features to predict radiographic progression in patients with ankylosing spondylitis

J Rheum Dis. 2024 Apr 1;31(2):97-107. doi: 10.4078/jrd.2023.0056. Epub 2023 Dec 20.

Abstract

Objective: Ankylosing spondylitis (AS) is a chronic inflammatory arthritis in which repeated, sustained inflammation over a long period causes structural damage and radiographic progression in the spine. This study applies machine learning models to predict radiographic progression in patients with AS using time-series data from electronic medical records (EMRs).

Methods: EMR data, including baseline characteristics, laboratory findings, drug administration, and modified Stoke AS Spinal Score (mSASSS), were collected at a single center from 1,123 patients with AS between January 2001 and December 2018, at the time of the first (T1), second (T2), and third (T3) visits. Radiographic progression at the (n+1)th visit, defined as P_{n+1} = (mSASSS_{n+1} − mSASSS_n) / (T_{n+1} − T_n) ≥ 1 unit per year, was predicted using the follow-up visit datasets from T1 to Tn. We used three machine learning methods (logistic regression with the least absolute shrinkage and selection operator, random forest, and extreme gradient boosting) with three-fold cross-validation.
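The progression definition above can be illustrated with a minimal sketch. It assumes visit times are calendar dates and that a year is 365.25 days; the abstract states neither, and the function names and the example patient are hypothetical.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumption: the abstract does not say how years were measured


def progression_rate(score_prev, score_next, date_prev, date_next):
    """Annualized mSASSS change between two consecutive visits (units per year)."""
    years = (date_next - date_prev).days / DAYS_PER_YEAR
    if years <= 0:
        raise ValueError("visits must be in chronological order")
    return (score_next - score_prev) / years


def is_progressor(score_prev, score_next, date_prev, date_next, threshold=1.0):
    """True when the annualized change meets the >= 1 unit per year criterion."""
    return progression_rate(score_prev, score_next, date_prev, date_next) >= threshold


# Hypothetical patient: (visit date, total mSASSS) at T1, T2, T3.
visits = [
    (date(2010, 1, 1), 10.0),
    (date(2012, 1, 1), 13.0),
    (date(2014, 1, 1), 14.0),
]

# One label per consecutive pair of visits: P2 (T1 to T2) and P3 (T2 to T3).
labels = [
    is_progressor(s0, s1, d0, d1)
    for (d0, s0), (d1, s1) in zip(visits, visits[1:])
]
print(labels)
```

In this made-up example the patient gains 3 units over about two years between T1 and T2 (a progressor for P2) and 1 unit over about two years between T2 and T3 (not a progressor for P3).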

Results: Among the machine learning models tested, the random forest model trained on the T1 EMR dataset best predicted radiographic progression at the second visit (P2), with a mean accuracy of 73.73% and a mean area under the curve of 0.79. Among the T1 variables, the most important predictors of radiographic progression were, in descending order, total mSASSS, age, and alkaline phosphatase.
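As a rough illustration of how a mean accuracy and a mean area under the curve might be obtained from three-fold cross-validation, the sketch below averages both metrics over held-out folds. It assumes the reported means are per-fold averages and that predicted probabilities are turned into classes at a 0.5 cutoff; neither detail is given in the abstract, and the fold data are invented.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in the fold")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_over_folds(folds, cutoff=0.5):
    """Average accuracy and AUC across held-out folds.

    Each fold is (true labels, predicted probabilities of progression).
    The 0.5 cutoff is an assumption for this sketch.
    """
    accs, aucs = [], []
    for y_true, scores in folds:
        y_pred = [1 if s >= cutoff else 0 for s in scores]
        accs.append(accuracy(y_true, y_pred))
        aucs.append(roc_auc(y_true, scores))
    return sum(accs) / len(accs), sum(aucs) / len(aucs)


# Invented held-out predictions for three folds (1 = progressor).
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    ([1, 1, 0, 0], [0.8, 0.4, 0.5, 0.1]),
    ([0, 1, 0, 1], [0.3, 0.7, 0.6, 0.55]),
]

mean_acc, mean_auc = mean_over_folds(folds)
print(f"mean accuracy = {mean_acc:.2%}, mean AUC = {mean_auc:.2f}")
```

Accuracy depends on the chosen cutoff, whereas the area under the curve depends only on how the model ranks patients, which is one reason the two are commonly reported together.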

Conclusion: Prognostic models built on time-series data predicted radiographic progression with reasonable performance using clinical features from the first-visit dataset.

Keywords: Ankylosing spondylitis; Disease progression; Machine learning.