Robust feature selection to predict tumor treatment outcome

Artif Intell Med. 2015 Jul;64(3):195-204. doi: 10.1016/j.artmed.2015.07.002. Epub 2015 Aug 14.

Abstract

Objective: Recurrence of cancer after treatment increases the risk of death. Predicting the treatment outcome can inform treatment planning and thus benefit the patient. We aim to select predictive features from clinical and positron emission tomography (PET) based features, in order to give doctors informative factors for anticipating a patient's treatment outcome.

Methods: To overcome the small sample size problem common to medical datasets, we propose a novel wrapper feature selection algorithm, named HFS (hierarchical forward selection), which searches forward in a hierarchical feature subset space. Feature subsets are iteratively evaluated by their prediction performance using an SVM (support vector machine). All feature subsets performing better than those at the preceding iteration are retained. Moreover, as SUV (standardized uptake value) based features have been recognized as significant predictive factors for patient outcome, we propose to incorporate this prior knowledge into the selection procedure to improve its robustness and reduce its computational cost.
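The search described in Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hierarchical_forward_selection`, the `score` callback, the `prior` argument, the feature names, and the toy scorer are all hypothetical. In the paper each subset is scored by SVM prediction performance; here a hand-written scorer stands in so that the example runs with the standard library only. The retention rule is read as "a child subset is kept if it scores higher than the parent it was grown from", which is one possible interpretation of the abstract.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional, Tuple

Subset = FrozenSet[str]


def hierarchical_forward_selection(
    features: Iterable[str],
    score: Callable[[Subset], float],
    prior: Optional[Iterable[str]] = None,
) -> Tuple[Subset, float, int]:
    """Return (best subset, its score, number of subsets evaluated).

    Assumption: a child subset is retained when it beats its parent.
    """
    features = list(features)
    cache: Dict[Subset, float] = {}

    def evaluate(subset: Subset) -> float:
        # Score each distinct subset only once.
        if subset not in cache:
            cache[subset] = score(subset)
        return cache[subset]

    # Level 0: the prior-seeded subset if given, else every single feature.
    seed = frozenset(prior or ())
    roots = [seed] if seed else [frozenset([f]) for f in features]
    current = {s: evaluate(s) for s in roots}
    best, best_score = max(current.items(), key=lambda kv: kv[1])

    while current:
        retained: Dict[Subset, float] = {}
        for parent, parent_score in current.items():
            for f in features:
                if f in parent:
                    continue
                child = parent | {f}
                child_score = evaluate(child)
                # Keep every child that improves on its parent.
                if child_score > parent_score:
                    retained[child] = child_score
                    if child_score > best_score:
                        best, best_score = child, child_score
        current = retained  # descend one level in the hierarchy

    return best, best_score, len(cache)


# Toy scorer (hypothetical): rewards two "informative" features and
# penalizes the rest. A real wrapper would return cross-validated accuracy.
INFORMATIVE = frozenset({"suv_max", "texture_a"})


def toy_score(subset: Subset) -> float:
    return len(subset & INFORMATIVE) - 0.1 * len(subset - INFORMATIVE)


FEATURES = ["suv_max", "texture_a", "texture_b", "age"]
hfs_best, hfs_score, hfs_evals = hierarchical_forward_selection(FEATURES, toy_score)
phfs_best, phfs_score, phfs_evals = hierarchical_forward_selection(
    FEATURES, toy_score, prior=["suv_max"]
)
```

In this toy setting both runs end on the same subset, while the prior-seeded run evaluates fewer subsets because it starts from a single root instead of one root per feature. That is the intuition behind the reduced computational cost; it says nothing about the magnitude of the saving on real data.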

Results: Two real-world datasets from cancer patients are included in the evaluation. We extract dozens of clinical and PET-based features to characterize the patient's state, including SUV parameters and texture features. We use leave-one-out cross-validation to evaluate the prediction performance, in terms of prediction accuracy and robustness. Using SVM as the classifier, our HFS method produces accuracy values of 100% and 94% on the two datasets, respectively, and robustness values of 89% and 96%. Without accuracy loss, the prior-based version (pHFS) improves the robustness to 100% and 98% on the two datasets, respectively.
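The evaluation protocol can be illustrated with a small sketch. Several parts are assumptions: the abstract does not say how robustness is computed, so the mean pairwise Jaccard similarity of the subsets selected in each fold is used here as one common stability proxy, and a nearest-centroid rule stands in for the SVM so that the example needs only the standard library. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Set


def loocv_accuracy(
    X: List[List[float]],
    y: List[int],
    fit_predict: Callable[[List[List[float]], List[int], List[float]], int],
) -> float:
    """Leave-one-out: hold out each sample once and train on the rest."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if fit_predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)


def nearest_centroid(
    train_X: List[List[float]], train_y: List[int], x: List[float]
) -> int:
    """Stand-in classifier (NOT an SVM): predict the closest class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda l: sum((a - b) ** 2 for a, b in zip(x, centroids[l])),
    )


def mean_jaccard(subsets: Sequence[Set[str]]) -> float:
    """Assumed robustness proxy: average pairwise overlap of the feature
    subsets selected in different folds (expects non-empty subsets)."""
    pairs = list(combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): two well-separated classes, two features.
X = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3], [3.0, 0.2], [3.1, 0.4], [2.9, 0.1]]
y = [0, 0, 0, 1, 1, 1]
accuracy = loocv_accuracy(X, y, nearest_centroid)

# Subsets that a selector might return in three different folds.
stability = mean_jaccard([{"a", "b"}, {"a", "b"}, {"a", "c"}])
```

With only a handful of patients, leave-one-out keeps nearly all samples for training in every fold, which is why it suits small datasets. A stability score of 1.0 under this proxy would mean that every fold selected the same features.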

Conclusions: Compared with other feature selection methods, the proposed HFS and pHFS provide the most promising results. For our HFS method, we have empirically shown that the addition of prior knowledge improves the robustness and accelerates the convergence.

Keywords: Hierarchical forward feature selection; Positron emission tomography; Prediction; Prior knowledge; Small sample; Support vector machine.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Databases, Factual
  • Decision Support Systems, Clinical*
  • Decision Support Techniques*
  • Disease Progression
  • Esophageal Neoplasms / diagnostic imaging
  • Esophageal Neoplasms / mortality
  • Esophageal Neoplasms / pathology
  • Esophageal Neoplasms / therapy*
  • Humans
  • Kaplan-Meier Estimate
  • Lung Neoplasms / diagnostic imaging
  • Lung Neoplasms / mortality
  • Lung Neoplasms / pathology
  • Lung Neoplasms / therapy*
  • Neoplasm Metastasis
  • Neoplasm Recurrence, Local
  • Positron-Emission Tomography
  • Predictive Value of Tests
  • Risk Assessment
  • Risk Factors
  • Support Vector Machine
  • Time Factors
  • Treatment Outcome