Investigating the prediction ability of survival models based on both clinical and omics data: two case studies

Stat Med. 2014 Dec 30;33(30):5310-29. doi: 10.1002/sim.6246. Epub 2014 Jul 9.

Abstract

In the biomedical literature, numerous prediction models for clinical outcomes have been developed based either on clinical data or, more recently, on high-throughput molecular data (omics data). Prediction models based on both types of data are less common, although some recent studies suggest that a suitable combination of clinical and molecular information may lead to models with better predictive ability. Their scarcity is probably because it is not straightforward to combine data with different characteristics and dimensions (poorly characterized, high-dimensional omics data versus well-investigated, low-dimensional clinical data). In this paper, we analyze two publicly available datasets, one on breast cancer and one on neuroblastoma, to show some possible ways to combine clinical and omics data into a prediction model for a time-to-event outcome. Different strategies and statistical methods are applied. The results are compared and discussed according to several criteria, including the discriminative ability of the models, computed on a validation dataset.
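As a rough illustration of what "discriminative ability computed on a validation dataset" can mean for a time-to-event outcome, the sketch below computes a concordance index in the style of Harrell's C. The abstract does not say which discrimination measure the authors used, so the choice of measure is an assumption, as are the function name `concordance_index`, the handling of ties, and the toy numbers, none of which come from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable subject pairs that the risk score orders correctly.

    times       -- observed follow-up times
    events      -- 1 (or True) if the event was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): pairs with tied times are skipped, and a
        # pair is only comparable if the earlier subject had an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs in the validation data")
    return concordant / comparable


if __name__ == "__main__":
    # Toy validation set, invented purely for illustration.
    times = [5, 8, 12, 20]
    events = [1, 1, 0, 1]
    clinical_only = [0.6, 0.7, 0.4, 0.1]   # hypothetical scores from a clinical model
    combined = [0.9, 0.7, 0.4, 0.1]        # hypothetical scores from a combined model
    print(concordance_index(times, events, clinical_only))
    print(concordance_index(times, events, combined))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. In practice the risk scores would come from models fitted on a training set and evaluated on held-out subjects, which is the sense of "validation dataset" in the abstract.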

Keywords: clinical information; combining clinical and omics data; high-dimensional data; prediction models; survival analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Biometry / methods*
  • Breast Neoplasms / genetics
  • Child, Preschool
  • Data Interpretation, Statistical*
  • Databases, Genetic
  • Female
  • Humans
  • Infant
  • Male
  • Middle Aged
  • Neuroblastoma / genetics
  • Proportional Hazards Models*
  • Regression Analysis
  • Survival Analysis
  • Time Factors