Feature selection for outcome prediction in oesophageal cancer using genetic algorithm and random forest classifier

Comput Med Imaging Graph. 2017 Sep:60:42-49. doi: 10.1016/j.compmedimag.2016.12.002. Epub 2016 Dec 28.

Abstract

Predicting patient outcome can greatly help to personalize cancer treatment. A large number of quantitative features (clinical exams, imaging, …) are potentially useful for assessing patient outcome. The challenge is to choose the most predictive subset of features. In this paper, we propose a new feature selection strategy called GARF (genetic algorithm based on random forest), applied to features extracted from positron emission tomography (PET) images and clinical data. The most relevant features, either predictive of the therapeutic response or prognostic of patient survival 3 years after the end of treatment, were selected using GARF on a cohort of 65 patients with locally advanced oesophageal cancer eligible for chemo-radiation therapy. The most relevant predictive results were obtained with a subset of 9 features, leading to a random forest misclassification rate of 18±4% and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.823±0.032. The most relevant prognostic results were obtained with 8 features, leading to an error rate of 20±7% and an AUC of 0.750±0.108. Both predictive and prognostic results show better performance using GARF than using the 4 other methods studied.
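
The abstract does not specify GARF's genetic operators or its fitness function, so the following is only a minimal, generic sketch of genetic-algorithm feature selection, not the authors' implementation. Each candidate subset is a binary mask over the features, and the search maximizes a caller-supplied `fitness`. In a GARF-like setting one might derive that fitness from a random forest's misclassification rate on the masked features; here a toy stand-in is used so the example runs with the standard library alone. The names `ga_select`, `toy_fitness`, and `INFORMATIVE`, and all parameter values, are assumptions made for illustration.

```python
import random


def ga_select(n_features, fitness, pop_size=30, generations=40,
              crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Generic GA over binary feature masks; returns (best_mask, best_score).

    Illustrative sketch only: the operators (tournament selection, uniform
    crossover, bit-flip mutation, elitism) are assumptions, not taken from
    the paper.
    """
    rng = random.Random(seed)
    # Start from random masks: 1 keeps a feature, 0 drops it.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best_mask, best_score = None, float("-inf")

    for _ in range(generations):
        scores = [fitness(m) for m in pop]
        # Remember the best mask seen so far.
        for m, s in zip(pop, scores):
            if s > best_score:
                best_mask, best_score = m[:], s

        def tournament():
            # Pick two distinct individuals and keep the fitter one.
            a, b = rng.sample(range(pop_size), 2)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        next_pop = [best_mask[:]]  # elitism: carry the best mask forward
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                # Uniform crossover: each gene comes from either parent.
                child = [x if rng.random() < 0.5 else y
                         for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop

    # Score the final generation as well.
    for m in pop:
        s = fitness(m)
        if s > best_score:
            best_mask, best_score = m[:], s
    return best_mask, best_score


# Hypothetical toy problem: pretend features 0, 3 and 7 carry the signal.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    # Reward informative features, penalize subset size.
    # Stand-in for something like (1 - random-forest error rate).
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask, score = ga_select(12, toy_fitness)
selected = [i for i, bit in enumerate(mask) if bit]
print("selected features:", selected, "fitness:", round(score, 2))
```

To move from this toy to a wrapper method in the spirit of GARF, `toy_fitness` would be replaced by a function that trains a classifier on the selected columns and returns a score such as one minus its estimated error rate; the size penalty is one possible way to favour small subsets like the 8 and 9 features reported above.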

Keywords: Feature selection; Genetic algorithm; Oesophageal cancer; Radiomics; Random forest.

MeSH terms

  • Algorithms*
  • Area Under Curve
  • Esophageal Neoplasms / diagnosis*
  • Esophageal Neoplasms / diagnostic imaging
  • Esophageal Neoplasms / genetics*
  • Humans
  • Positron-Emission Tomography
  • Prognosis
  • ROC Curve