Improving radiomic model reliability using robust features from perturbations for head-and-neck carcinoma

Front Oncol. 2022 Oct 14;12:974467. doi: 10.3389/fonc.2022.974467. eCollection 2022.

Abstract

Background: Using highly robust radiomic features in modeling is recommended, yet their impact on the radiomic model is unclear. This study evaluated a radiomic model's robustness and generalizability after screening out low-robustness features before radiomic modeling. The results were validated with four datasets and two clinically relevant tasks.

Materials and methods: A total of 1,419 head-and-neck cancer patients' computed tomography images, gross tumor volume segmentations, and clinically relevant outcomes (distant metastasis and local-regional recurrence) were collected from four publicly available datasets. A perturbation method was implemented to simulate images, and radiomic feature robustness was quantified using the intra-class correlation coefficient (ICC). Three radiomic models were built using all features (ICC > 0), good-robust features (ICC > 0.75), and excellent-robust features (ICC > 0.95), respectively. A filter-based feature selection method and a Ridge classifier were used to construct the radiomic models. Model performance was assessed with respect to both robustness and generalizability: model robustness was evaluated by the ICC, and model generalizability was quantified by the train-test difference in the Area Under the Receiver Operating Characteristic Curve (AUC).
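A minimal sketch of this workflow is shown below, assuming pingouin for the ICC computation (a two-rater ICC(3,1) between original and perturbed feature extractions), SelectKBest with an F-test as the filter-based selection, and k = 10 retained features; these choices and all variable names are illustrative assumptions, since the abstract does not specify the study's exact preprocessing or selection settings.

```python
# Sketch: screen radiomic features by ICC robustness, then fit a Ridge
# classifier with filter-based selection and report the train-test AUC gap.
# Assumed inputs: pandas DataFrames of features (rows = patients) from the
# original and perturbed images, plus binary outcome labels.
import numpy as np
import pandas as pd
import pingouin as pg
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def feature_icc(original: pd.DataFrame, perturbed: pd.DataFrame) -> pd.Series:
    """ICC(3,1) per feature between original and perturbed extractions."""
    iccs = {}
    for col in original.columns:
        long = pd.DataFrame({
            "patient": np.r_[original.index, perturbed.index],
            "rater": ["orig"] * len(original) + ["perturb"] * len(perturbed),
            "value": np.r_[original[col].values, perturbed[col].values],
        })
        res = pg.intraclass_corr(data=long, targets="patient",
                                 raters="rater", ratings="value")
        iccs[col] = res.loc[res["Type"] == "ICC3", "ICC"].iloc[0]
    return pd.Series(iccs)


def fit_and_evaluate(X_train, y_train, X_test, y_test, icc, threshold):
    """Keep features with ICC above the threshold, fit the model,
    and return training AUC, testing AUC, and their difference."""
    keep = icc.index[icc > threshold]  # e.g. threshold = 0.75 or 0.95
    model = make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=min(10, len(keep))),  # filter-based selection
        RidgeClassifier(),
    )
    model.fit(X_train[keep], y_train)
    auc_train = roc_auc_score(y_train, model.decision_function(X_train[keep]))
    auc_test = roc_auc_score(y_test, model.decision_function(X_test[keep]))
    return auc_train, auc_test, auc_train - auc_test  # train-test AUC gap
```

In this sketch, raising the ICC threshold from 0 to 0.75 to 0.95 reproduces the three feature pools compared in the study, while the returned train-test AUC difference corresponds to the generalizability metric described above.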

Results: The average model robustness ICC improved significantly from 0.65 to 0.78 (P < 0.0001) using good-robust features and to 0.91 (P < 0.0001) using excellent-robust features. Model generalizability also improved substantially: the gap between training and testing AUC narrowed, with the mean train-test AUC difference reduced from 0.21 to 0.18 (P < 0.001) with good-robust features and to 0.12 (P < 0.0001) with excellent-robust features. Furthermore, good-robust features yielded the best average AUC of 0.58 (P < 0.001) on the unseen datasets, across the four datasets and both clinical outcomes.

Conclusions: Including only robust features in radiomic modeling significantly improves model robustness and generalizability on unseen datasets. However, the robustness of the radiomic model still has to be verified even when it is built with robust radiomic features, and an overly strict feature-robustness threshold may prevent optimal model performance on unseen datasets, as it may lower the discriminative power of the model.

Keywords: feature reliability; head and neck squamous cell carcinoma; model reliability; model robustness; radiomics.