compound.Cox: Univariate feature selection and compound covariate for predicting survival

Comput Methods Programs Biomed. 2019 Jan:168:21-37. doi: 10.1016/j.cmpb.2018.10.020. Epub 2018 Oct 27.

Abstract

Background and objective: Univariate feature selection is one of the simplest and most commonly used techniques for developing a multigene predictor of survival. At present, no software is tailored to performing univariate feature selection together with predictor construction.

Methods: We develop the compound.Cox R package, which implements univariate significance tests (Wald tests or score tests) for feature selection. We provide a cross-validation algorithm to measure the predictive capability of the selected genes and a permutation algorithm to assess the false discovery rate. We also provide three algorithms for constructing a multigene predictor (compound covariate, compound shrinkage, and copula-based methods), which are tailored to the subset of genes obtained from univariate feature selection. We demonstrate the package using survival data on lung cancer patients. We examine the predictive capability of the developed algorithms using the lung cancer data and simulated data.
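
To make the selection and prediction steps concrete, the following is a minimal sketch in Python with NumPy, not the compound.Cox R code. It runs a univariate Cox score test per gene, keeps the genes below a significance level, and forms a compound covariate as the sum of the selected expressions weighted by their univariate Cox coefficient estimates. All function names, the simulated data, the level alpha = 0.05, and the clipped Newton-Raphson step are assumptions made for illustration; ties are handled in the Breslow fashion.

```python
import math
import numpy as np

def cox_score_info(t, d, x, beta=0.0):
    """Score U(beta) and information I(beta) of a one-covariate Cox partial likelihood."""
    t, d, x = (np.asarray(a, dtype=float) for a in (t, d, x))
    w = np.exp(beta * x)
    # at_risk[i, k] = 1 when subject k is still under observation at time t[i]
    at_risk = (t[None, :] >= t[:, None]).astype(float)
    s0 = at_risk @ w
    s1 = at_risk @ (w * x)
    s2 = at_risk @ (w * x * x)
    mean = s1 / s0
    score = np.sum(d * (x - mean))
    info = np.sum(d * (s2 / s0 - mean ** 2))
    return score, info

def score_test(t, d, x):
    """Two-sided score test of H0: beta = 0; returns (z, p-value)."""
    score, info = cox_score_info(t, d, x, 0.0)
    z = score / math.sqrt(info)
    return z, math.erfc(abs(z) / math.sqrt(2.0))

def univariate_beta(t, d, x, n_iter=50, tol=1e-8):
    """Newton-Raphson estimate of the univariate coefficient (steps clipped to +/-1)."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = cox_score_info(t, d, x, beta)
        step = max(-1.0, min(1.0, score / info))
        beta += step
        if abs(step) < tol:
            break
    return beta

# Simulated data (assumed): 80 patients, 30 genes, genes 0-2 affect the hazard.
rng = np.random.default_rng(0)
n, p = 80, 30
X = rng.standard_normal((n, p))
hazard = np.exp(0.7 * X[:, :3].sum(axis=1))
event_time = rng.exponential(1.0 / hazard)
censor_time = rng.exponential(2.0, size=n)
t = np.minimum(event_time, censor_time)          # observed time
d = (event_time <= censor_time).astype(float)    # 1 = death observed, 0 = censored

alpha = 0.05  # significance level for selection (assumed)
p_values = np.array([score_test(t, d, X[:, j])[1] for j in range(p)])
selected = np.flatnonzero(p_values < alpha)

# Compound covariate: selected genes weighted by their univariate estimates.
beta_hat = np.array([univariate_beta(t, d, X[:, j]) for j in selected])
compound_covariate = X[:, selected] @ beta_hat
print("selected genes:", selected)
```

A higher compound covariate value indicates a higher predicted risk, so patients could be split into good and poor prognosis groups at, for example, its median. The compound shrinkage and copula-based predictors named above are not covered by this sketch.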

Results: The developed R package, compound.Cox, is available on the CRAN repository. The statistical tools in compound.Cox allow researchers to determine an optimal significance level for the tests, thus providing them with an optimal subset of genes for prediction. The package also allows researchers to compute the false discovery rate and to apply various prediction algorithms.
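
As an illustration of how a false discovery rate can be assessed by permutation, here is a hedged sketch in Python with NumPy. It uses a generic permutation estimate and is not necessarily the algorithm implemented in compound.Cox: the (time, status) pairs are shuffled across patients to break any association with gene expression, the genes passing the significance level are counted in each permuted data set, and the average count is divided by the number of genes selected in the observed data. The function names, simulated data, alpha, and number of permutations are all assumptions.

```python
import math
import numpy as np

def score_p_values(t, d, X):
    """Two-sided univariate Cox score-test p-values (H0: beta = 0), one per column of X."""
    t = np.asarray(t, dtype=float)
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    # at_risk[i, k] = 1 when subject k is still under observation at time t[i]
    at_risk = (t[None, :] >= t[:, None]).astype(float)
    s0 = at_risk.sum(axis=1)[:, None]            # risk-set sizes
    mean = (at_risk @ X) / s0                    # risk-set means per gene
    var = (at_risk @ X ** 2) / s0 - mean ** 2    # risk-set variances per gene
    score = (d[:, None] * (X - mean)).sum(axis=0)
    info = (d[:, None] * var).sum(axis=0)
    z = score / np.sqrt(info)
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])

def permutation_fdr(t, d, X, alpha=0.05, n_perm=100, seed=0):
    """Estimated FDR = mean number of null selections / number of observed selections."""
    rng = np.random.default_rng(seed)
    t = np.asarray(t, dtype=float)
    d = np.asarray(d, dtype=float)
    n_selected = int(np.sum(score_p_values(t, d, X) < alpha))
    if n_selected == 0:
        return 0.0, 0  # no discoveries, so no false discoveries
    null_counts = []
    for _ in range(n_perm):
        perm = rng.permutation(len(t))  # shuffle (time, status) pairs jointly
        null_counts.append(np.sum(score_p_values(t[perm], d[perm], X) < alpha))
    expected_false = float(np.mean(null_counts))
    return min(1.0, expected_false / n_selected), n_selected

# Simulated data (assumed): 80 patients, 30 genes, genes 0-2 affect the hazard.
rng = np.random.default_rng(1)
n, p = 80, 30
X = rng.standard_normal((n, p))
event_time = rng.exponential(1.0 / np.exp(0.7 * X[:, :3].sum(axis=1)))
censor_time = rng.exponential(2.0, size=n)
t = np.minimum(event_time, censor_time)
d = (event_time <= censor_time).astype(float)

fdr_hat, n_selected = permutation_fdr(t, d, X, alpha=0.05, n_perm=100)
print("genes selected:", n_selected, "estimated FDR:", fdr_hat)
```

Repeating the call over a grid of alpha values shows the trade-off between the number of selected genes and the estimated false discovery rate.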

Keywords: Cancer prognosis; Copula; Cox regression; Cross-validation; Dependent censoring; False discovery rate; Gene expression; High-dimensional data; Multiple testing.

MeSH terms

  • Algorithms
  • Computer Simulation
  • False Positive Reactions
  • Gene Expression Profiling*
  • Humans
  • Kaplan-Meier Estimate
  • Lung Neoplasms / diagnosis
  • Lung Neoplasms / epidemiology
  • Lung Neoplasms / mortality*
  • Multivariate Analysis
  • Predictive Value of Tests
  • Proportional Hazards Models
  • Software*