Early detection of squamous cell carcinoma of the oral tongue using multidimensional plasma protein analysis and interpretable machine learning

J Oral Pathol Med. 2023 Aug;52(7):637-643. doi: 10.1111/jop.13461. Epub 2023 Jul 10.

Abstract

Background: Interpretable machine learning (ML) for early detection of cancer has the potential to improve risk assessment and early intervention.

Methods: We analyzed data on 261 proteins related to inflammation and/or tumor processes in 123 blood samples collected from persons who were healthy at the time of sampling, a subgroup of whom later developed squamous cell carcinoma of the oral tongue (SCCOT). Samples from people who developed SCCOT in less than 5 years were classified as tumor-to-be, and all other samples as tumor-free. The optimal ML algorithm for feature selection was identified, and feature importance was computed by the SHapley Additive exPlanations (SHAP) method. Five popular ML algorithms (AdaBoost, Artificial Neural Networks [ANNs], Decision Tree [DT], eXtreme Gradient Boosting [XGBoost], and Support Vector Machine [SVM]) were applied to establish prediction models, and the decisions of the optimal models were interpreted by SHAP.
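
The labeling rule described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the use of None for persons who were never diagnosed, and the example time intervals are all assumptions made for the example. Only the two class names and the "less than 5 years" threshold come from the abstract.

```python
from typing import List, Optional

TUMOR_TO_BE = "tumor-to-be"
TUMOR_FREE = "tumor-free"


def label_sample(years_to_diagnosis: Optional[float],
                 window_years: float = 5.0) -> str:
    """Label one blood sample by the time from sampling to SCCOT diagnosis.

    years_to_diagnosis is None when the person was never diagnosed
    (an assumed encoding, not taken from the paper).
    """
    if years_to_diagnosis is not None and 0 <= years_to_diagnosis < window_years:
        return TUMOR_TO_BE
    # Diagnosed at 5 years or later, or never diagnosed.
    return TUMOR_FREE


# Hypothetical intervals in years; not data from the study.
example_intervals: List[Optional[float]] = [1.2, 4.9, 5.0, 7.5, None]
example_labels = [label_sample(y) for y in example_intervals]
print(example_labels)
```

Because the threshold is strict, a sample taken exactly 5 years before diagnosis falls in the tumor-free class under this reading of the abstract.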

Results: Using the 22 selected features, the SVM prediction model showed the best performance (sensitivity = 0.867, specificity = 0.859, balanced accuracy = 0.863, area under the receiver operating characteristic curve [ROC-AUC] = 0.924). SHAP analysis revealed that the 22 features had varying, person-specific impacts on the model's decisions, and that the top three contributors to prediction were Interleukin 10 (IL10), TNF Receptor Associated Factor 2 (TRAF2), and Kallikrein Related Peptidase 12 (KLK12).
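
As a quick consistency check on the reported metrics, balanced accuracy is the mean of sensitivity and specificity. The sketch below uses the standard definitions; the function names and the confusion-matrix counts are hypothetical and chosen only for illustration, since the abstract does not report counts.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of tumor-to-be samples correctly flagged."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of tumor-free samples correctly cleared."""
    return tn / (tn + fp)


def balanced_accuracy(sens: float, spec: float) -> float:
    """Mean of sensitivity and specificity; robust to class imbalance."""
    return (sens + spec) / 2


# Values reported in the abstract for the SVM model.
reported_ba = balanced_accuracy(0.867, 0.859)
print(round(reported_ba, 3))  # 0.863

# Hypothetical confusion-matrix counts, not taken from the study.
toy_sens = sensitivity(tp=8, fn=2)
toy_spec = specificity(tn=18, fp=2)
toy_ba = balanced_accuracy(toy_sens, toy_spec)
```

The reported sensitivity (0.867) and specificity (0.859) average to 0.863, which matches the balanced accuracy given in the Results.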

Conclusion: Using multidimensional plasma protein analysis and interpretable ML, we outline a systematic approach for early detection of SCCOT before the appearance of clinical signs.

Keywords: SCCOT; SHAP; interpretable model; machine learning; plasma protein.

MeSH terms

  • Blood Proteins
  • Carcinoma, Squamous Cell* / diagnosis
  • Humans
  • Machine Learning
  • Tongue
  • Tongue Neoplasms* / diagnosis
  • Ubiquitin-Protein Ligases

Substances

  • Blood Proteins
  • Ubiquitin-Protein Ligases