An interpretable AI model for recurrence prediction after surgery in gastrointestinal stromal tumour: an observational cohort study

Dimitris Bertsimas; Georgios Antonios Margonis; Seehanah Tang; Angelos Koulouras; Cristina R Antonescu; Murray F Brennan; Javier Martin-Broto; Piotr Rutkowski; Georgios Stasinos; Jane Wang; Emmanouil Pikoulis; Elzbieta Bylina; Pawel Sobczuk; Antonio Gutierrez; Bhumika Jadeja; William D Tap; Ping Chi; Samuel Singer

doi:10.1016/j.eclinm.2023.102200

EClinicalMedicine. 2023 Sep 9:64:102200. doi: 10.1016/j.eclinm.2023.102200. eCollection 2023 Oct.

Authors

Dimitris Bertsimas¹, Georgios Antonios Margonis², Seehanah Tang¹, Angelos Koulouras¹, Cristina R Antonescu³, Murray F Brennan², Javier Martin-Broto⁴ ⁵ ⁶, Piotr Rutkowski⁷, Georgios Stasinos⁸, Jane Wang⁹, Emmanouil Pikoulis¹⁰, Elzbieta Bylina⁷, Pawel Sobczuk⁷, Antonio Gutierrez⁴ ⁵ ⁶, Bhumika Jadeja², William D Tap¹¹, Ping Chi¹¹ ¹² ¹³, Samuel Singer²

Affiliations

¹ Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, USA.
² Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
³ Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁴ Medical Oncology Department, Fundación Jimenez Diaz University Hospital, Madrid, Spain.
⁵ Hospital General de Villalba, Madrid, Spain.
⁶ Instituto de Investigacion Sanitaria Fundacion Jimenez Diaz (IIS/FJD; UAM), Madrid, Spain.
⁷ Maria Sklodowska-Curie National Research Institute of Oncology, Warsaw, Poland.
⁸ Technical Chamber of Greece, Athens, Greece.
⁹ Department of Surgery, University of California San Francisco, San Francisco, CA, USA.
¹⁰ Third Department of Surgery, Attikon University Hospital, Athens, Greece.
¹¹ Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
¹² Human Oncology and Pathogenesis Program (HOPP), Memorial Sloan Kettering Cancer Center, New York, NY, USA.
¹³ Department of Medicine, Weill Cornell Medical College, New York, NY, USA.

Abstract

Background: Several models predict the risk of recurrence following resection of localised, primary gastrointestinal stromal tumour (GIST). However, assessment of calibration is not always feasible and, when performed, calibration of current GIST models appears to be suboptimal. We aimed to develop a prognostic model to predict the recurrence of GIST after surgery with both good discrimination and calibration by uncovering and harnessing the non-linear relationships among variables that predict recurrence.

Methods: In this observational cohort study, the data of 395 adult patients who underwent complete resection (R0 or R1) of a localised, primary GIST in the pre-imatinib era at Memorial Sloan Kettering Cancer Center (NY, USA) (recruited 1982-2001) and a European consortium (Spanish Group for Research in Sarcomas, 80 sites) (recruited 1987-2011) were used to train an interpretable Artificial Intelligence (AI)-based model called Optimal Classification Trees (OCT). The OCT predicted the probability of recurrence after surgery by capturing non-linear relationships among predictors of recurrence. The data of an additional 596 patients from another European consortium (Polish Clinical GIST Registry, 7 sites) (recruited 1981-2013) who were also treated in the pre-imatinib era were used to externally validate the OCT predictions with regard to discrimination (Harrell's C-index and Brier score) and calibration (calibration curve, Brier score, and Hosmer-Lemeshow test). The calibration of the Memorial Sloan Kettering (MSK) GIST nomogram was used as a comparative gold standard. We also evaluated the clinical utility of the OCT and the MSK nomogram by performing a Decision Curve Analysis (DCA).
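As an illustration of two of the validation metrics named above, the following minimal Python sketch computes Harrell's C-index and a Brier score on a toy cohort. It is not the authors' code: the five hypothetical patients, the variable names, and the simplification of treating the event indicator as a binary outcome for the Brier score (ignoring censoring) are all assumptions made for the example.

```python
from itertools import combinations


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def harrell_c_index(times, events, risks):
    """Share of comparable pairs in which the patient with the earlier
    observed recurrence also has the higher predicted risk.
    Ties in predicted risk count as 0.5."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if times differ and the shorter time is an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    return concordant / comparable


# Hypothetical toy data (not from the study): follow-up in months,
# recurrence indicator (1 = recurred, 0 = censored), predicted risk.
times = [12, 30, 45, 60, 80]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.6, 0.7, 0.2, 0.1]

c_index = harrell_c_index(times, events, risks)
brier = brier_score(risks, events)
print(f"C-index = {c_index:.3f}, Brier score = {brier:.3f}")
```

A C-index of 1 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking, whereas a lower Brier score indicates predicted probabilities closer to the observed outcomes.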

Findings: The internal cohort included 395 patients (median [IQR] age, 63 [54-71] years; 214 men [54.2%]) and the external cohort included 556 patients (median [IQR] age, 60 [52-68] years; 308 men [55.4%]). Harrell's C-index of the OCT in the external validation cohort was greater than that of the MSK nomogram (0.805 [95% CI: 0.803-0.808] vs 0.788 [95% CI: 0.786-0.791], respectively). In the external validation cohort, the slope and intercept of the calibration curve of the main OCT were 1.041 and 0.038, respectively. In comparison, the slope and intercept of the calibration curve for the MSK nomogram were 0.681 and 0.032, respectively. The MSK nomogram overestimated the recurrence risk throughout the entire calibration curve. Of note, the Brier score was lower for the OCT than for the MSK nomogram (0.147 vs 0.564, respectively), and the Hosmer-Lemeshow test was not significant (P = 0.087) for the OCT model but significant (P < 0.001) for the MSK nomogram. Both results confirmed the superior discrimination and calibration of the OCT over the MSK nomogram. A decision curve analysis showed that the AI-based OCT model allowed for superior decision making compared to the MSK nomogram both for patients with 25-50% recurrence risk and for those with >50% risk of recurrence.
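For readers unfamiliar with decision curve analysis, the sketch below shows the standard net-benefit calculation that underlies a decision curve, evaluated at a single threshold probability. It is a generic illustration under stated assumptions, not the study's analysis: the toy predictions and outcomes are hypothetical, and the function names are chosen for this example.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk is at or above
    `threshold`: true positives per patient minus false positives per
    patient, weighted by the odds of the threshold probability."""
    n = len(probs)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy in which every patient is treated."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data (not from the study).
probs = [0.9, 0.6, 0.7, 0.2, 0.1]
outcomes = [1, 1, 0, 1, 0]

for pt in (0.25, 0.5):
    model_nb = net_benefit(probs, outcomes, pt)
    all_nb = net_benefit_treat_all(outcomes, pt)
    print(f"threshold {pt:.2f}: model {model_nb:.3f}, treat all {all_nb:.3f}")
```

A decision curve plots this quantity across a range of threshold probabilities; a model is preferred over a range of thresholds when its net benefit exceeds that of the competing model and of the treat-all and treat-none (net benefit 0) strategies.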

Interpretation: We present the first prognostic models of recurrence risk in GIST that demonstrate excellent discrimination, calibration, and clinical utility on external validation. Additional studies for further validation are warranted. With further validation, these tools could potentially improve patient counseling and selection for adjuvant therapy.

Funding: The NCI SPORE in Soft Tissue Sarcoma and NCI Cancer Center Support Grants.

Keywords: Artificial intelligence; GIST; Prognosis; Recurrence.

Grants and funding

P30 CA008748/CA/NCI NIH HHS/United States