Prediction of HER2 status via random forest in 3257 Chinese patients with gastric cancer

Clin Exp Med. 2023 Dec;23(8):5015-5024. doi: 10.1007/s10238-023-01111-3. Epub 2023 Jun 15.

Abstract

The accurate evaluation of human epidermal growth factor receptor 2 (HER2) status is crucial for successful trastuzumab-based therapy in patients with gastric cancer (GC). This study, involving a retrospective cohort (N = 2865) from Wuhan Union Hospital and a prospective cohort (N = 392) from Renmin Hospital of Wuhan University, evaluated how well random forest and logistic regression models built on clinical features predict HER2 status in patients with GC. Patients from the Union cohort were randomly assigned to either a training (N = 2005) or an internal validation (N = 860) group. Data processing, feature selection, and construction of the random forest and logistic regression models for predicting HER2 overexpression were carried out in Python. The Renmin cohort (N = 392) served as the external validation group. Ten features were closely correlated with HER2 overexpression: age, albumin/globulin ratio, globulin, activated partial thromboplastin time, tumor stage, node stage, tumor node metastasis stage, tumor size, tumor differentiation, and neuron-specific enolase (NSE). Random forest and logistic regression had areas under the curve (AUC) of 0.9995 and 0.6653, respectively, in the training group and 0.923 and 0.667 in the internal validation group. When the two predictive models were validated using data from the Renmin cohort, random forest and logistic regression had AUCs of 0.9994 and 0.627, respectively. This is the first multicenter study to predict HER2 overexpression in individuals with GC based on clinical variables. The random forest model significantly outperformed the logistic regression model.
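
The two mechanical steps named in the abstract, the random assignment of the Union cohort and the AUC used to compare the models, can be illustrated with a minimal standard-library Python sketch. It is not the authors' code: the helper names `split_cohort` and `roc_auc`, the seed, the placeholder patient IDs, and the toy labels and scores are all assumptions made for illustration. Only the cohort sizes (2865, 2005, 860) come from the abstract. The AUC is computed here by pairwise comparison, as the probability that a HER2-positive case receives a higher score than a HER2-negative one.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly assign patients to a training and an internal validation group."""
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumed)
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Cohort sizes are from the abstract; the integer IDs are placeholders.
train_ids, valid_ids = split_cohort(range(2865), n_train=2005)
print(len(train_ids), len(valid_ids))  # 2005 860

# Toy labels (1 = HER2 overexpression) and model scores, not study data.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

An AUC of 0.5 corresponds to scores that carry no ranking information, and 1.0 to perfect separation of HER2-positive from HER2-negative cases, which is the scale on which the reported values for the two models are compared.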

Keywords: Gastric cancer; Human epidermal growth factor receptor 2; Logistic regression; Predictive performance; Random forest.

Publication types

  • Multicenter Study

MeSH terms

  • China
  • Humans
  • Prospective Studies
  • Random Forest
  • Retrospective Studies
  • Stomach Neoplasms* / genetics
  • Stomach Neoplasms* / pathology
  • Trastuzumab

Substances

  • Trastuzumab