Discovering combinatorial interactions in survival data

Bioinformatics. 2013 Dec 1;29(23):3053-9. doi: 10.1093/bioinformatics/btt532. Epub 2013 Sep 13.

Abstract

Motivation: Although several methods exist to relate high-dimensional gene expression data to various clinical phenotypes, finding combinations of features in such input remains a challenge, particularly when fitting complex statistical models such as those used for survival studies.

Results: Our proposed method builds on existing 'regularization path-following' techniques to produce regression models that can extract arbitrarily complex patterns of input features (such as gene combinations) from large-scale data and relate them to a known clinical outcome. By exploiting the structure of the data and using itemset mining techniques, we avoid the combinatorial complexity issues typically encountered with such methods, and our algorithm's running time is of the same order of magnitude as that of single-variable versions. Applied to data from several clinical studies of cancer patient survival time, our method produced a number of promising gene-interaction candidates whose tumour-related roles appear to be supported by the literature.
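
The core idea can be illustrated with a minimal Python sketch; it is not the authors' implementation. It assumes expression values have already been binarized (1 = gene "on"), treats a gene combination as the logical AND of its genes, enumerates combinations level by level with a minimum-support cut-off (an Apriori-style pruning chosen here for simplicity), and ranks each combination by the absolute gradient of the Cox log partial likelihood at zero coefficients. In an L1-regularized path, the feature with the largest such gradient is the first to enter the model. All function names, the toy data and the parameters `max_size` and `min_support` are hypothetical.

```python
def cox_score_at_zero(x, time, event):
    """Gradient of the Cox log partial likelihood at beta = 0 for one feature.

    For every observed event i, add x_i minus the mean of x over the
    risk set {j : time_j >= time_i}.
    """
    n = len(x)
    score = 0.0
    for i in range(n):
        if event[i]:
            risk = [x[j] for j in range(n) if time[j] >= time[i]]
            score += x[i] - sum(risk) / len(risk)
    return score


def frequent_itemsets(X, max_size, min_support):
    """Yield (items, column) for gene combinations present in enough samples.

    Support can only shrink when a gene is added, so a combination is
    extended only if it already meets min_support.
    """
    p = len(X[0])
    columns = [[row[k] for row in X] for k in range(p)]
    level = [((k,), columns[k]) for k in range(p)
             if sum(columns[k]) >= min_support]
    for _ in range(max_size):
        next_level = []
        for items, col in level:
            yield items, col
            for k in range(items[-1] + 1, p):
                joint = [a & b for a, b in zip(col, columns[k])]
                if sum(joint) >= min_support:
                    next_level.append((items + (k,), joint))
        level = next_level


def best_itemset(X, time, event, max_size=2, min_support=2):
    """Return (score, items) for the combination with the largest |score|."""
    best = (0.0, ())
    for items, col in frequent_itemsets(X, max_size, min_support):
        s = cox_score_at_zero(col, time, event)
        if abs(s) > abs(best[0]):
            best = (s, items)
    return best


# Toy data (made up for illustration): 6 patients, 3 binarized genes.
X = [[1, 1, 0], [1, 1, 1], [1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
time = [1, 2, 5, 6, 7, 8]       # follow-up times
event = [1, 1, 1, 1, 0, 1]      # 1 = death observed, 0 = censored

score, items = best_itemset(X, time, event)
print(items, round(score, 4))
```

In this toy data set the two patients who carry both gene 0 and gene 1 die earliest, so the pair scores higher than either gene alone. A full path-following method would go on to update coefficients and repeat the search as the regularization weight decreases, which this sketch omits.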

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Breast Neoplasms / genetics
  • Breast Neoplasms / mortality*
  • Computational Biology / methods*
  • Female
  • Gene Expression Profiling
  • Gene Regulatory Networks*
  • Humans
  • Likelihood Functions
  • Logistic Models
  • Models, Biological
  • Neoplasm Proteins / genetics*
  • Neuroblastoma / genetics
  • Neuroblastoma / mortality*
  • Proportional Hazards Models
  • Risk Factors
  • Survival Rate

Substances

  • Neoplasm Proteins