Two-step feature selection for predicting survival time of patients with metastatic castrate resistant prostate cancer

F1000Res. 2016 Nov 16:5:2678. doi: 10.12688/f1000research.8201.1. eCollection 2016.

Abstract

Metastatic castrate resistant prostate cancer (mCRPC) is the major cause of death in prostate cancer patients. Even though some options for treatment of mCRPC have been developed, the most effective therapies remain unclear. Thus finding key patient clinical variables related with mCRPC is an important issue for understanding the disease progression mechanism of mCRPC and clinical decision making for these patients. The Prostate Cancer DREAM Challenge is a crowd-based competition to tackle this essential challenge using new large clinical datasets. This paper proposes an effective procedure for predicting global risks and survival times of these patients, aimed at sub-challenge 1a and 1b of the Prostate Cancer DREAM challenge. The procedure implements a two-step feature selection procedure, which first implements sparse feature selection for numerical clinical variables and statistical hypothesis testing of differences between survival curves caused by categorical clinical variables, and then implements a forward feature selection to narrow the list of informative features. Using Cox's proportional hazards model with these selected features, this method predicted global risk and survival time of patients using a linear model whose input is a median time computed from the hazard model. The challenge results demonstrated that the proposed procedure outperforms the state of the art model by correctly selecting more informative features on both the global risk prediction and the survival time prediction.

Keywords: Cox-proportional hazards model; Survival analysis; feature selection.

Grants and funding

This work is partially supported by JSPS KAKENHI 25870322.