Interpretability of radiomics models is improved when using feature group selection strategies for predicting molecular and clinical targets in clear-cell renal cell carcinoma: insights from the TRACERx Renal study

Cancer Imaging. 2023 Aug 14;23(1):76. doi: 10.1186/s40644-023-00594-3.

Abstract

Background: The aim of this work is to evaluate the performance of radiomics predictions for a range of molecular, genomic and clinical targets in patients with clear cell renal cell carcinoma (ccRCC) and demonstrate the impact of novel feature selection strategies and sub-segmentations on model interpretability.

Methods: Contrast-enhanced CT scans from the first 101 patients recruited to the TRACERx Renal Cancer study (NCT03226886) were used to derive radiomics classification models to predict 20 molecular, histopathology and clinical target variables. Manual 3D segmentation was used in conjunction with automatic sub-segmentation to generate radiomics features from the core, rim, high and low enhancing sub-regions, and the whole tumour. Two classification model pipelines were compared: a Conventional pipeline reflecting common radiomics practice, and a Proposed pipeline including two novel feature selection steps designed to improve model interpretability. For both pipelines, nested cross-validation was used to estimate prediction performance and tune model hyper-parameters, and permutation testing was used to evaluate the statistical significance of the estimated performance measures. Model robustness was further assessed by evaluating model variability across the cross-validation folds.
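
The permutation testing mentioned in the Methods can be illustrated with a minimal sketch. It is not the study's code: the abstract does not say how permutations were generated, so this example assumes the simplest variant, in which class labels are shuffled against a fixed set of prediction scores (a fuller version would re-run the whole nested cross-validation for each permutation). The function names `auroc` and `permutation_p_value`, the number of permutations, and the toy data are all hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Equivalent to the Mann-Whitney U statistic; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_permutations=1000, seed=0):
    """One-sided p-value for H0: AUROC = 0.5, obtained by shuffling labels.

    Assumption: scores stay fixed and only the labels are permuted.
    """
    rng = random.Random(seed)
    observed = auroc(labels, scores)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if auroc(shuffled, scores) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Toy data invented for illustration; not taken from the study.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.5, 0.7, 0.8, 0.9, 0.95]
observed, p = permutation_p_value(labels, scores)
print(f"AUROC = {observed:.2f}, permutation p = {p:.4f}")
```

A target would be called significant under this scheme when the returned p-value falls below the chosen threshold (0.05 in the abstract).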

Results: Classification performance was significant (p < 0.05, H0: AUROC = 0.5) for 11 of 20 targets using either pipeline. For these targets, the AUROCs of the two pipelines were within ±0.05 of each other, except for one target where the Proposed pipeline increased performance by > 0.1. Five of these targets (necrosis on histology, presence of renal vein invasion, overall histological stage, linear evolutionary subtype and loss of 9p21.3 somatic alteration marker) had AUROC > 0.8. Models derived using the Proposed pipeline contained fewer feature groups than those from the Conventional pipeline, leading to more straightforward model interpretations without loss of performance. Sub-segmentations led to improved performance and/or improved interpretability when predicting the presence of sarcomatoid differentiation and tumour stage.

Conclusions: Use of the Proposed pipeline, which includes the novel feature selection methods, leads to more interpretable models without compromising prediction performance.

Trial registration: NCT03226886 (TRACERx Renal).

Keywords: Feature selection; Group selection; Histology; Interpretable; Machine learning; Molecular subtyping; Nested validation; Radiogenomics; Radiomics; Renal cancer.

Publication types

  • Clinical Trial

MeSH terms

  • Carcinoma, Renal Cell* / diagnostic imaging
  • Carcinoma, Renal Cell* / genetics
  • Carcinoma, Renal Cell* / pathology
  • Diagnosis, Differential
  • Humans
  • Kidney Neoplasms* / pathology
  • Radionuclide Imaging
  • Retrospective Studies
  • Tomography, X-Ray Computed / methods

Associated data

  • ClinicalTrials.gov/NCT03226886