Interpretable machine learning for predicting pathologic complete response in patients treated with chemoradiation therapy for rectal adenocarcinoma

Du Wang; Sang Ho Lee; Huaizhi Geng; Haoyu Zhong; John Plastaras; Andrzej Wojcieszynski; Richard Caruana; Ying Xiao

doi:10.3389/frai.2022.1059033

Front Artif Intell. 2022 Dec 7;5:1059033. doi: 10.3389/frai.2022.1059033. eCollection 2022.

Authors

Du Wang¹, Sang Ho Lee¹, Huaizhi Geng¹, Haoyu Zhong¹, John Plastaras¹, Andrzej Wojcieszynski¹, Richard Caruana², Ying Xiao¹

Affiliations

¹ Department of Radiation Oncology, University of Pennsylvania, Philadelphia, PA, United States.
² Microsoft Research, Redmond, WA, United States.

Abstract

Purpose: Pathologic complete response (pCR) is a critical factor in determining whether patients with rectal cancer (RC) should have surgery after neoadjuvant chemoradiotherapy (nCRT). Currently, a reliable assessment of pCR requires a pathologist's histological analysis of surgical specimens. Machine learning (ML) algorithms could offer a non-invasive way to identify appropriate candidates for non-operative therapy. However, the interpretability of these ML models remains challenging. We propose using an explainable boosting machine (EBM) to predict pCR in RC patients following nCRT.

Methods: A total of 296 features were extracted, including clinical parameters (CPs), dose-volume histogram (DVH) parameters from the gross tumor volume (GTV) and organs-at-risk, and radiomics (R) and dosiomics (D) features from the GTV. R and D features were subcategorized into shape (S), first-order (L1), second-order (L2), and higher-order (L3) local texture features. Multi-view analysis was employed to determine the best set of input feature categories. Boruta was used to select all-relevant features for each input dataset. ML models were trained on 180 cases from our institution, with 37 cases from the RTOG 0822 clinical trial serving as an independent dataset for model validation. The performance of EBM in predicting pCR on the test dataset was evaluated using ROC AUC and compared with that of three state-of-the-art black-box models: extreme gradient boosting (XGB), random forest (RF), and support vector machine (SVM). The predictions of all black-box models were interpreted using Shapley additive explanations.
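As an illustration of the evaluation metric named above, the following is a minimal sketch of ROC AUC computed through its pairwise (Mann-Whitney) interpretation: the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder. The function name `roc_auc` and the toy labels and scores are assumptions made for this example; they are not the study's data or code.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative cases.

    labels: iterable of 0/1 outcomes (1 = pCR in this illustration).
    scores: predicted scores, higher meaning "more likely pCR".
    Tied positive/negative pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up values for illustration only (not from the study).
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

The quadratic pairwise loop is chosen for readability; on cohorts of a few hundred cases it is fast enough, and it makes the meaning of the metric explicit.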

Results: The best input feature categories were CP+DVH+S+R_L1+R_L2 for all models, from which Boruta-selected features enabled the EBM, XGB, RF, and SVM models to attain AUCs of 0.820, 0.828, 0.828, and 0.774, respectively. Although EBM did not achieve the best performance, it provided the best capability for identifying critical turning points in response scores at distinct feature values, revealing that a bladder maximum dose >50 Gy and a tumor with maximum2DDiameterColumn >80 mm, elongation <0.55, leastAxisLength >50 mm, and lower variance of CT intensities were associated with unfavorable outcomes.
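To make the idea of a "turning point" concrete, the sketch below assumes that an additive model of the EBM type represents each feature's contribution as a piecewise-constant function over bins, and reports the bin edges at which that contribution changes sign. The bin edges and scores are hypothetical; only the 50 Gy edge mirrors the bladder-dose threshold reported above. The sign convention (negative meaning unfavorable) and the function names are likewise assumptions for illustration.

```python
from bisect import bisect_left


def shape_value(edges, scores, x):
    """Contribution of one feature value under a piecewise-constant shape.

    edges:  sorted bin boundaries; bin i covers (edges[i-1], edges[i]].
    scores: one contribution per bin, so len(scores) == len(edges) + 1.
    """
    return scores[bisect_left(edges, x)]


def turning_points(edges, scores):
    """Bin edges where the contribution flips sign between adjacent bins."""
    return [
        edge
        for edge, left, right in zip(edges, scores, scores[1:])
        if left * right < 0
    ]


# Hypothetical shape function for bladder maximum dose (Gy).
# These numbers are invented; negative is taken to mean "unfavorable".
dose_edges = [30.0, 40.0, 50.0, 60.0]
dose_scores = [0.30, 0.20, 0.10, -0.25, -0.40]

flips = turning_points(dose_edges, dose_scores)
below = shape_value(dose_edges, dose_scores, 50.0)  # at the threshold
above = shape_value(dose_edges, dose_scores, 55.0)  # beyond the threshold
print(flips, below, above)
```

Because each feature's contribution can be read off in isolation like this, a clinician can inspect the threshold directly instead of relying on a post-hoc explanation of a black-box model.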

Conclusions: EBM has the potential to enhance the physician's ability to evaluate an ML-based prediction of pCR and has implications for selecting patients for a "watchful waiting" strategy in RC therapy.

Keywords: clinical image processing; dosiomics; interpretable machine learning; multi-view input data analysis; pathologic complete response; radiomics; rectal cancer.

Grants and funding

S10 OD023495/OD/NIH HHS/United States