Identification of a gene signature in cell cycle pathway for breast cancer prognosis using gene expression profiling data

BMC Med Genomics. 2008 Sep 11;1:39. doi: 10.1186/1755-8794-1-39.

Abstract

Background: Numerous studies have used microarrays to identify gene signatures that predict the clinical outcomes of cancer patients and their responses to chemotherapy. However, the potential of gene expression profiling for cancer diagnosis, prognosis and the development of personalized treatment may not be fully realized, owing to the lack of consensus gene signatures and a poor understanding of the underlying molecular mechanisms.

Methods: We developed a novel approach to derive gene signatures for breast cancer prognosis in the context of known biological pathways. Using unsupervised methods, cancer patients were separated into distinct groups based on gene expression patterns in one of the following pathways: apoptosis, cell cycle, angiogenesis, metastasis, p53, DNA repair, and several receptor-mediated signaling pathways including chemokines, EGF, FGF, HIF, MAP kinase, JAK and NF-kappaB. The survival probabilities were then compared between the patient groups to determine if differential gene expression in a specific pathway is correlated with differential survival.
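
The two steps described above (an unsupervised split of patients on the genes of one pathway, followed by a survival comparison between the resulting groups) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the clustering or survival method, so 2-means clustering and a two-group log-rank test are assumed here, and the expression values, follow-up times, and the function names two_means and logrank are all hypothetical.

```python
import math

def two_means(rows, iters=50):
    """Split samples into 2 clusters with plain k-means (assumed method)."""
    # Deterministic start: the samples with the lowest and highest mean expression.
    means = [sum(r) / len(r) for r in rows]
    centers = [list(rows[means.index(min(means))]),
               list(rows[means.index(max(means))])]
    labels = None
    for _ in range(iters):
        new_labels = []
        for r in rows:
            dist = [sum((a - b) ** 2 for a, b in zip(r, c)) for c in centers]
            new_labels.append(0 if dist[0] <= dist[1] else 1)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def logrank(times, events, groups):
    """Two-group log-rank test; returns (chi2, p) with 1 degree of freedom."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for x in times if x >= t)  # patients still at risk at t
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square (1 df) upper tail
    return chi2, p

# Hypothetical toy data: 8 patients x 3 genes from a single pathway.
expr = [
    [2.1, 1.9, 2.3], [2.0, 2.2, 1.8], [1.8, 2.4, 2.0], [2.2, 2.0, 2.1],
    [0.2, 0.1, 0.3], [0.1, 0.3, 0.2], [0.3, 0.2, 0.1], [0.2, 0.2, 0.2],
]
times = [12, 20, 15, 30, 60, 72, 55, 80]   # follow-up in months
events = [1, 1, 1, 1, 0, 1, 0, 0]          # 1 = event observed, 0 = censored

groups = two_means(expr)
chi2, p = logrank(times, events, groups)
print(groups, round(chi2, 2), round(p, 4))
```

Repeating this procedure once per pathway gene set would indicate which pathways separate patients into groups with different survival; a small p-value for a given pathway suggests that its expression pattern is associated with outcome.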

Results: Our results revealed that expression of cell cycle genes is strongly predictive of breast cancer outcomes. We further confirmed this observation by building a cell cycle gene signature model using supervised methods. Validated in multiple independent datasets, the cell cycle gene signature is a more accurate predictor of breast cancer clinical outcome than the previously identified Amsterdam 70-gene signature, which has been developed into an FDA-approved clinical test, MammaPrint.
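
The abstract does not state how predictive accuracy was measured. As one illustrative possibility, two prognostic scores can be compared on the same patients with Harrell's concordance index, sketched below. The function name concordance_index, the risk scores, and the follow-up data are hypothetical and are not taken from the study.

```python
def concordance_index(times, events, scores):
    """Harrell's C: among comparable pairs, the fraction in which the
    patient with the earlier observed event has the higher risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # ties in score count as half
    return concordant / comparable if comparable else float("nan")

# Hypothetical validation set: 4 patients, two competing risk scores.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]              # 1 = event observed, 0 = censored
score_a = [0.9, 0.7, 0.4, 0.2]     # hypothetical signature A
score_b = [0.5, 0.8, 0.3, 0.6]     # hypothetical signature B

c_a = concordance_index(times, events, score_a)
c_b = concordance_index(times, events, score_b)
print(c_a, c_b)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the signature with the higher index on an independent dataset ranks patients by risk more accurately under this measure.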

Conclusion: Taken together, the gene expression signature model we developed from well-defined pathways is not only a consistently powerful prognosticator but is also mechanistically linked to cancer biology. Our approach provides an alternative to the current methodology of identifying gene expression markers for cancer prognosis and drug response from whole-genome gene expression data.