Machine learning for the prediction of toxicities from head and neck cancer treatment: A systematic review with meta-analysis

Oral Oncol. 2023 May:140:106386. doi: 10.1016/j.oraloncology.2023.106386. Epub 2023 Apr 4.

Abstract

Introduction: The aim of the present systematic review (SR) is to summarize the Machine Learning (ML) models currently used to predict head and neck cancer (HNC) treatment-related toxicities, and to understand the impact of image biomarkers (IBMs) in prediction models (PMs). The present SR was conducted following the PRISMA 2022 guidelines and registered in the PROSPERO database (CRD42020219304).

Methods: The acronym PICOS was used to develop the focused review question (Can PMs accurately predict HNC treatment toxicities?) and the eligibility criteria. Prediction Model Studies (PMSs) were eligible for inclusion if their patient cohorts were treated for HNC and developed toxicities. The electronic database search encompassed PubMed, EMBASE, Scopus, Cochrane Library, Web of Science, LILACS, and gray literature (Google Scholar and ProQuest). Risk of Bias (RoB) was assessed with PROBAST, and the results were synthesized according to data format (with and without IBMs) to allow comparison.

Results: A total of 28 studies and 4,713 patients were included. Xerostomia was the most frequently investigated toxicity (17; 60.71 %). Sixteen (57.14 %) studies reported using radiomics features in combination with clinical or dosimetric/dosiomic features for modelling. High RoB was identified in 23 studies. Meta-analysis (MA) showed an area under the receiver operating characteristic curve (AUROC) of 0.82 for models with IBMs and 0.81 for models without IBMs (p value < 0.001), demonstrating no difference between IBM- and non-IBM-based models.
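The AUROC values pooled above can be read as the probability that a model assigns a higher risk score to a patient who developed the toxicity than to one who did not. The following is a minimal sketch of that pairwise definition only, not the method of any included study. The function name `auroc` and the toy labels and scores are hypothetical and purely illustrative.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; tied pairs count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = toxicity developed, 0 = no toxicity.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auroc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.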

Discussion: Developing a PM from sample-specific features represents patient selection bias and may affect the model's performance. Heterogeneity among the studies and non-standardized metrics prevent proper comparison between them, and the absence of independent/external testing does not allow the model's generalization ability to be evaluated.

Conclusion: IBM-featured PMs are not superior to PMs based on non-IBM predictors. The evidence was appraised as of low certainty.

Keywords: Convolutional neural network; Dysphagia; Hearing loss; Hypothyroidism; Machine learning; Mucositis; Osteoradionecrosis; Prediction model studies; Xerostomia.

Publication types

  • Meta-Analysis
  • Systematic Review
  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers
  • Head and Neck Neoplasms* / drug therapy
  • Humans
  • Machine Learning
  • Xerostomia*

Substances

  • Biomarkers