Development and validation of a five-lncRNA signature with prognostic value in colon cancer

J Cell Biochem. 2020 Aug;121(8-9):3780-3793. doi: 10.1002/jcb.29518. Epub 2019 Nov 3.

Abstract

Dysregulation of long noncoding RNAs (lncRNAs) has been found in a large number of human cancers, including colon cancer. Identifying lncRNA biomarkers with prognostic value is therefore essential. The GSE39582 data set was downloaded from the Gene Expression Omnibus (GEO) database. lncRNA expression profiles were re-annotated using NetAffx annotation files. Univariate and multivariate Cox proportional hazards analyses were used to select prognostic lncRNAs. The random survival forest-variable hunting (RSF-VH) algorithm, together with stepwise multivariate Cox proportional hazards analysis, was used to establish the lncRNA signature. Kaplan-Meier curves of patients' overall survival (OS) were compared with the log-rank test. Receiver operating characteristic (ROC) analysis was used to assess the sensitivity and specificity of survival prediction based on the lncRNA risk score, and the area under the curve (AUC) was calculated. Single-sample gene set enrichment analysis (ssGSEA) was used to describe the biological functions associated with the signature. Finally, to determine the robustness of the model, we used two validation sets: GSE17536 and The Cancer Genome Atlas data set. After re-annotation of the lncRNAs, a total of 14 lncRNA probes were obtained by univariate and multivariate Cox proportional hazards analysis. The RSF-VH algorithm and stepwise multivariate Cox analysis were then used to build a five-lncRNA prognostic signature for colon cancer. Patients in the high-risk group had markedly shorter survival than patients in the low-risk group, and the signature achieved an AUC of 0.75. In addition, the five-lncRNA signature can independently predict the survival of patients with colon cancer. The ssGSEA analysis revealed that pathways such as extracellular matrix-receptor interaction were activated as the risk score increased. These findings demonstrate the strong prognostic value of this five-lncRNA signature for colon cancer.
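
The risk-score workflow summarized above can be illustrated with a minimal sketch. It assumes the risk score is a linear combination of lncRNA expression values weighted by Cox regression coefficients, and that patients are split into high- and low-risk groups at the median score. The abstract states neither detail, so both are assumptions, although they are common in signature studies of this kind. The lncRNA names and coefficients are hypothetical placeholders, not the values reported in the paper. The AUC here treats death status as a simple binary label and ignores censoring and follow-up time, which is a simplification of a survival ROC analysis.

```python
from statistics import median

# Hypothetical Cox coefficients for five placeholder lncRNAs.
# These are NOT the coefficients or probe names from the paper.
COEFFICIENTS = {
    "lncRNA_A": 0.42,
    "lncRNA_B": -0.31,
    "lncRNA_C": 0.27,
    "lncRNA_D": 0.15,
    "lncRNA_E": -0.22,
}


def risk_score(expression):
    """Weighted sum of expression values (assumed form of the risk score)."""
    return sum(coef * expression[name] for name, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk using the median score as cutoff."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


def auc(scores, events):
    """AUC as the probability that a patient with an event (1) has a higher
    score than a patient without one (0); ties count as one half.
    Censoring is ignored in this simplified version."""
    positives = [s for s, e in zip(scores, events) if e == 1]
    negatives = [s for s, e in zip(scores, events) if e == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient in each outcome class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy expression profiles and outcomes, invented for illustration only.
    patients = [
        {"lncRNA_A": 1.0, "lncRNA_B": 2.0, "lncRNA_C": 0.5, "lncRNA_D": 1.0, "lncRNA_E": 2.0},
        {"lncRNA_A": 2.5, "lncRNA_B": 0.5, "lncRNA_C": 2.0, "lncRNA_D": 1.5, "lncRNA_E": 0.5},
        {"lncRNA_A": 1.5, "lncRNA_B": 1.0, "lncRNA_C": 1.0, "lncRNA_D": 2.0, "lncRNA_E": 1.0},
        {"lncRNA_A": 0.5, "lncRNA_B": 2.5, "lncRNA_C": 0.5, "lncRNA_D": 0.5, "lncRNA_E": 2.5},
    ]
    died = [0, 1, 1, 0]
    scores = [risk_score(p) for p in patients]
    print(split_by_median(scores))
    print(auc(scores, died))
```

In a real analysis the coefficients would come from the fitted multivariate Cox model, and a time-dependent ROC method that accounts for censoring would be more appropriate than this binary version.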

Keywords: GEO; colon cancer; long noncoding RNAs; prognosis; random survival forest-variable hunting algorithm.