A novel strategy of integrated microarray analysis identifies CENPA, CDK1 and CDC20 as a cluster of diagnostic biomarkers in lung adenocarcinoma

Cancer Lett. 2018 Jul 1:425:43-53. doi: 10.1016/j.canlet.2018.03.043. Epub 2018 Mar 31.

Abstract

Lung adenocarcinoma (LAC) is the most lethal cancer and the leading cause of cancer-related death worldwide. The identification of meaningful clusters of co-expressed genes or representative biomarkers may help improve the accuracy of LAC diagnoses. Public databases, such as the Gene Expression Omnibus (GEO), provide rich resources of valuable information for clinical use; however, integrating multiple microarray datasets from various platforms and institutes remains a challenge. To determine potential indicators of LAC, we performed genome-wide relative significance (GWRS), genome-wide global significance (GWGS) and support vector machine (SVM) analyses in sequence to identify robust gene biomarker signatures from 5 different microarray datasets comprising 330 samples. The top 200 genes with robust signatures were selected for integrative analysis using "guilt-by-association" methods, including protein-protein interaction (PPI) analysis and gene co-expression analysis. Of these 200 genes, only 10 showed both dense PPI network connectivity and high gene co-expression correlation (r > 0.8). IPA of this regulatory network suggested that the cell cycle process is a crucial determinant of LAC. CENPA, as well as two linked hub genes, CDK1 and CDC20, were determined to be potential indicators of LAC. Immunohistochemical staining showed that CENPA, CDK1 and CDC20 were highly expressed in LAC tissue with co-expression patterns. A Cox regression model indicated that LAC patients with CENPA+/CDK1+ and CENPA+/CDC20+ were high-risk groups in terms of overall survival. In conclusion, our integrated microarray analysis demonstrated that CENPA, CDK1 and CDC20 might serve as a novel cluster of prognostic biomarkers for LAC, and the cooperative unit of three genes provides a technically simple approach for identification of LAC patients.
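
The co-expression filter mentioned in the abstract (r > 0.8) can be illustrated with a minimal sketch. It assumes Pearson correlation on a genes-by-samples expression matrix; the abstract does not state which correlation measure or software was used. The function name `coexpressed_pairs`, the gene labels and the synthetic data are all hypothetical and are not taken from the study.

```python
import numpy as np

def coexpressed_pairs(expr, genes, threshold=0.8):
    """Return (gene_a, gene_b, r) for every gene pair whose Pearson r exceeds threshold.

    expr: 2-D array with one row per gene and one column per sample.
    """
    corr = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    pairs = []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):  # upper triangle only, skip self-pairs
            if corr[i, j] > threshold:
                pairs.append((genes[i], genes[j], float(corr[i, j])))
    return pairs

# Synthetic expression values for illustration only (not study data).
rng = np.random.default_rng(0)
shared = rng.normal(size=50)                    # signal shared by two genes
expr = np.vstack([
    shared + 0.1 * rng.normal(size=50),         # GENE_A follows the shared signal
    shared + 0.1 * rng.normal(size=50),         # GENE_B follows the shared signal
    rng.normal(size=50),                        # GENE_C is independent noise
])
genes = ["GENE_A", "GENE_B", "GENE_C"]

pairs = coexpressed_pairs(expr, genes)
for a, b, r in pairs:
    print(f"{a} - {b}: r = {r:.2f}")
```

In this toy setup only the two genes that share a signal pass the threshold. A real analysis would also need cross-platform normalization and the PPI evidence described above, neither of which is modeled here.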

Keywords: Biomarker; CENPA/CDK1/CDC20; GEO; Lung adenocarcinoma.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenocarcinoma of Lung / genetics
  • Adenocarcinoma of Lung / metabolism*
  • Adenocarcinoma of Lung / pathology
  • Biomarkers, Tumor / genetics
  • Biomarkers, Tumor / metabolism
  • CDC2 Protein Kinase / genetics
  • CDC2 Protein Kinase / metabolism*
  • Cdc20 Proteins / genetics
  • Cdc20 Proteins / metabolism*
  • Centromere Protein A / genetics
  • Centromere Protein A / metabolism*
  • Computational Biology / methods*
  • Early Detection of Cancer
  • Female
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Lung Neoplasms / genetics
  • Lung Neoplasms / metabolism*
  • Lung Neoplasms / pathology
  • Male
  • Neoplasm Staging
  • Prognosis
  • Protein Interaction Maps
  • Support Vector Machine
  • Survival Analysis
  • Tissue Array Analysis
  • Up-Regulation

Substances

  • Biomarkers, Tumor
  • CENPA protein, human
  • Cdc20 Proteins
  • Centromere Protein A
  • CDC20 protein, human
  • CDC2 Protein Kinase
  • CDK1 protein, human