Development of a Machine Learning-Based Predictive Model for Lung Metastasis in Patients With Ewing Sarcoma

Front Med (Lausanne). 2022 Apr 1:9:807382. doi: 10.3389/fmed.2022.807382. eCollection 2022.

Abstract

Background: This study aimed to develop and validate machine learning (ML)-based prediction models for lung metastasis (LM) in patients with Ewing sarcoma (ES), and to deploy the best model as an open-access web tool.

Methods: We retrospectively analyzed data from the Surveillance, Epidemiology, and End Results (SEER) database from 2010 to 2016 and from four medical institutions to develop and validate predictive models for LM in patients with ES. Patient data from the SEER database were used as the training group (n = 929). Using demographic and clinicopathologic variables, six ML-based models for predicting LM were developed and internally validated using 10-fold cross-validation. All ML-based models were subsequently externally validated using multicenter data from four medical institutions (the validation group, n = 51). The predictive power of the models was evaluated by the area under the receiver operating characteristic curve (AUC). The best-performing model was used to produce an online tool that clinicians can use to identify ES patients at risk of lung metastasis, to improve decision making, and to optimize individual treatment.
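
As an illustration of the evaluation described above, the following is a minimal sketch of how an AUC can be computed and how a cohort can be partitioned into 10 folds. It is not the authors' code: the function names (auc_score, k_fold_indices), the random seed, and the toy labels and scores are all assumptions made for this example, and the sketch uses only the Python standard library rather than any particular ML package.

```python
import random


def auc_score(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5). Hypothetical helper, for illustration only."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds
    of nearly equal size. Each fold serves once as the held-out set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Toy data (invented): 1 = lung metastasis, 0 = no lung metastasis.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_score(labels, scores))  # 0.75

# Partition a training group of 929 patients into 10 folds.
folds = k_fold_indices(929, k=10)
print([len(f) for f in folds])
```

In a full cross-validation loop, a model would be fitted on nine folds and scored with auc_score on the remaining fold, and the ten resulting AUC values would then be averaged.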

Results: The study cohort consisted of 929 patients from the SEER database and 51 patients from multiple medical centers, a total of 980 ES patients. Of these, 175 (18.8%) had lung metastasis. Multivariate logistic regression analysis identified survival time, T stage, N stage, surgery, and bone metastasis as independent predictive factors of LM. The AUC values of the six predictive models ranged from 0.585 to 0.705. The Random Forest (RF) model (AUC = 0.705) using 4 variables was identified as the best predictive model of LM in ES patients and was employed to construct an online tool to assist clinicians in optimizing patient treatment (https://share.streamlit.io/liuwencai123/es_lm/main/es_lm.py).

Conclusions: Machine learning was found to have utility for predicting LM in patients with Ewing sarcoma, and the RF model gave the best performance. The accessibility of the predictive model as a web-based tool offers clear opportunities for improving the personalized treatment of patients with ES.

Keywords: Ewing sarcoma; lung metastasis; machine learning algorithms; multicenter; web calculator.