Machine Learning Improves Risk Stratification in Myelodysplastic Neoplasms: An Analysis of the Spanish Group of Myelodysplastic Syndromes

Hemasphere. 2023 Oct 11;7(10):e961. doi: 10.1097/HS9.0000000000000961. eCollection 2023 Oct.

Abstract

Myelodysplastic neoplasms (MDS) are a heterogeneous group of hematological stem cell disorders characterized by dysplasia, cytopenias, and an increased risk of acute leukemia. Because prognosis differs widely between patients, and treatment options range from observation to allogeneic stem cell transplantation, accurate and precise disease risk prognostication is critical for decision making. With this aim, we retrieved registry data on MDS patients from 90 Spanish institutions. A total of 7202 patients were included, who were divided into a training (80%) and a test (20%) set. A machine learning technique (random survival forests) was used to model overall survival (OS) and leukemia-free survival (LFS). The optimal model was based on 8 variables (age, gender, hemoglobin, leukocyte count, platelet count, neutrophil percentage, bone marrow blasts, and cytogenetic risk group). This model achieved high accuracy in predicting OS (c-indexes: 0.759 and 0.776) and LFS (c-indexes: 0.812 and 0.845). Importantly, the model was superior to the revised International Prognostic Scoring System (IPSS-R) and the age-adjusted IPSS-R. This difference persisted across age ranges and in all evaluated disease subgroups. Finally, we validated our results in an external cohort, confirming the superiority of the Artificial Intelligence Prognostic Scoring System for MDS (AIPSS-MDS) over the IPSS-R and showing performance similar to that of the molecular IPSS. In conclusion, the AIPSS-MDS score is a new prognostic model based exclusively on traditional clinical, hematological, and cytogenetic variables. AIPSS-MDS has high prognostic accuracy in predicting survival in MDS patients, outperforming other well-established risk-scoring systems.
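
The c-index values quoted in the abstract measure how well predicted risk ranks patients by observed survival. As a rough illustration only, and not the authors' code, the following minimal Python sketch computes Harrell's concordance index for right-censored data. The function name `concordance_index`, the toy follow-up times, event flags, and risk scores are all hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification of the usual definition.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's c-index for right-censored survival data.

    times       -- follow-up time for each patient
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the follow-up times differ and the
        # patient with the shorter follow-up actually had the event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk, shorter survival: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not data from the study).
toy_times = [5, 10, 12, 20, 30]        # months of follow-up
toy_events = [1, 1, 0, 1, 0]           # 1 = event observed, 0 = censored
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]   # predicted risk scores

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(f"c-index: {c_index:.3f}")  # prints "c-index: 0.875"
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. In the toy cohort, 7 of the 8 comparable pairs are ordered correctly, giving 0.875.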