Feature Importance Analysis of a Deep Learning Model for Predicting Late Bladder Toxicity Occurrence in Uterine Cervical Cancer Patients

Wonjoong Cheon; Mira Han; Seonghoon Jeong; Eun Sang Oh; Sung Uk Lee; Se Byeong Lee; Dongho Shin; Young Kyung Lim; Jong Hwi Jeong; Haksoo Kim; Joo Young Kim

doi:10.3390/cancers15133463

Cancers (Basel). 2023 Jul 2;15(13):3463. doi: 10.3390/cancers15133463.

Affiliations

¹ Proton Therapy Center, National Cancer Center, Goyang-si 10408, Republic of Korea.
² Biostatistics Collaboration Team, National Cancer Center, Goyang-si 10408, Republic of Korea.

Abstract

(1) Background: In this study, we developed a deep learning (DL) model to predict late bladder toxicity. (2) Methods: We collected data from 281 uterine cervical cancer patients who underwent definitive radiation therapy. The DL model was trained on 16 features covering patient, tumor, treatment, and dose parameters, and its performance was compared with that of a multivariable logistic regression model using the following metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUROC). In addition, permutation feature importance was calculated for each feature to interpret the DL model, and a lightweight DL model was designed to focus on the top five most important features. (3) Results: The DL model outperformed the multivariable logistic regression model on our dataset. It achieved an F1-score of 0.76 and an AUROC of 0.81, while the corresponding values for the multivariable logistic regression model were 0.14 and 0.43, respectively. The DL model identified the dose to the most exposed 2 cc volume of the bladder (BD_2cc) as the most important feature, followed by BD_5cc and the ICRU bladder point. The lightweight DL model achieved an F1-score of 0.90 and an AUROC of 0.91. (4) Conclusions: The DL models outperformed the statistical method in predicting late bladder toxicity. Interpreting the model through feature importance further underscores its potential for improving patient outcomes and minimizing treatment-related complications with a high level of reliability.
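The permutation feature importance mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the three features and the fixed linear scoring function `predict` are hypothetical stand-ins for the 16 clinical features and the trained DL model, and AUROC is assumed as the scoring metric. The idea is to shuffle one feature column at a time and record how much the score drops relative to the unshuffled baseline.

```python
import numpy as np

def auroc(y_true, scores):
    """Rank-based AUROC (Mann-Whitney U statistic); assumes no tied scores."""
    y_true = np.asarray(y_true)
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def permutation_importance(predict, X, y, metric, n_repeats=20, seed=0):
    """Mean drop in `metric` when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to the outcome
            # while keeping its marginal distribution unchanged.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - metric(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return baseline, importances

# Synthetic data (assumption): 281 rows to mirror the cohort size, 3 made-up features.
rng = np.random.default_rng(42)
X = rng.normal(size=(281, 3))
y = (2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=281) > 0).astype(int)

def predict(X):
    # Hypothetical stand-in for a trained model: a fixed linear score
    # that uses features 0 and 1 and ignores feature 2.
    return 2.0 * X[:, 0] + 0.5 * X[:, 1]

baseline, importances = permutation_importance(predict, X, y, auroc)
ranking = np.argsort(importances)[::-1]  # feature indices, most important first
print("baseline AUROC:", round(float(baseline), 3))
print("importance per feature:", np.round(importances, 3))
print("ranking:", ranking)
```

In this toy setup, a feature the model ignores receives an importance of exactly zero, because shuffling it leaves every prediction unchanged. A reduced model in the spirit of the lightweight DL model would then be retrained on the top-ranked columns only.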

Keywords: deep learning; feature importance; interpretable artificial intelligence; toxicity prediction; uterine cervical cancer.

Grants and funding

2110610-3/National Cancer Center/Republic of Korea