A network-based method to evaluate quality of reproducibility of differential expression in cancer genomics studies

Oncotarget. 2015 Dec 29;6(42):44714-27. doi: 10.18632/oncotarget.5987.

Abstract

Background: Personalized cancer treatments depend on the determination of a patient's genetic status according to known genetic profiles for which targeted treatments exist. Such genetic profiles must be scientifically validated before they are applied to the general patient population. Reproducibility of the findings that support such genetic profiles is a fundamental challenge in validation studies. The percentage of overlapping genes (POG) criterion and derivative methods produce unstable and misleading results. Furthermore, in a complex disease, comparisons between different tumor subtypes can produce high POG scores that do not capture consistency in gene function.
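For orientation, the quantity-based POG criterion can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes POG is the share of genes in one differentially expressed list that also appear in a second list, and the function name `pog` and the gene lists are hypothetical.

```python
def pog(list_a, list_b):
    """Fraction of genes in list_a that also appear in list_b.

    Assumption: POG is normalized by the size of the first list, so the
    score is not symmetric when the two lists differ in length.
    """
    set_a, set_b = set(list_a), set(list_b)
    if not set_a:
        raise ValueError("list_a must not be empty")
    return len(set_a & set_b) / len(set_a)


# Hypothetical gene lists from two studies of the same disease.
study_1 = ["TP53", "BRCA1", "EGFR", "MYC"]
study_2 = ["TP53", "EGFR", "KRAS"]
score = pog(study_1, study_2)
```

Because the score only counts shared genes, it treats every gene as equally informative, which is the limitation the abstract attributes to quantity-based measures.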

Results: We focused on the quality rather than the quantity of the overlapping genes. We defined the rank value of each gene according to its importance or quality, computed by PageRank on the basis of a particular topological structure. Then, we used the p-value of the rank-sum of the overlapping genes (PRSOG) to evaluate the quality of reproducibility. Although the POG scores were low in different studies of the same disease, the PRSOG was statistically significant, which suggests that sets of differentially expressed genes might be highly reproducible.
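The idea of ranking genes by PageRank and then testing the rank-sum of the overlapping genes can be sketched as follows. This is an illustrative sketch under stated assumptions, not the published procedure: the abstract does not specify the network, the damping factor, or how the p-value is obtained, so the undirected toy network, the damping value of 0.85, the permutation test, and the names `pagerank` and `rank_sum_p_value` are all assumptions made for this example.

```python
import random


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on an undirected network given as gene pairs."""
    nodes = sorted({gene for edge in edges for gene in edge})
    neighbors = {gene: set() for gene in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    score = {gene: 1.0 / len(nodes) for gene in nodes}
    for _ in range(iterations):
        updated = {}
        for gene in nodes:
            # Each neighbor shares its score equally among its own neighbors.
            incoming = sum(score[m] / len(neighbors[m]) for m in neighbors[gene])
            updated[gene] = (1 - damping) / len(nodes) + damping * incoming
        score = updated
    return score


def rank_sum_p_value(score, overlap, n_perm=10000, seed=0):
    """Rank-sum of the overlapping genes and a permutation p-value.

    Rank 1 is the highest PageRank score, so a small rank-sum means the
    overlapping genes are topologically important. The p-value is the
    share of random gene sets of the same size with a rank-sum at least
    as small (assumed null model).
    """
    ordered = sorted(score, key=lambda g: (-score[g], g))
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}
    genes = [g for g in overlap if g in rank]
    observed = sum(rank[g] for g in genes)
    rng = random.Random(seed)
    all_ranks = list(rank.values())
    hits = sum(
        sum(rng.sample(all_ranks, len(genes))) <= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical star-shaped network: gene "A" is a hub linked to four others.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E")]
toy_scores = pagerank(toy_edges)
observed_sum, p_value = rank_sum_p_value(toy_scores, {"A"})
```

In this toy case a single overlapping gene yields a POG that looks poor by count alone, yet the gene is the network hub and receives the best possible rank. That contrast is the kind of situation in which a quality-based score and a quantity-based score can disagree.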

Conclusions: Evaluations of eight datasets from breast cancer, lung cancer and four other disorders indicate that the quality-based PRSOG method performs better than a quantity-based method. Our analysis of the components of the sets of overlapping genes supports the utility of the PRSOG method.

Keywords: cancer genomics; gene expression; overlapping genes; pagerank; reproducibility.

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Breast Neoplasms / genetics*
  • Breast Neoplasms / pathology
  • Computational Biology
  • Databases, Genetic
  • Female
  • Gene Expression Profiling / methods*
  • Gene Expression Profiling / standards
  • Gene Expression Regulation, Neoplastic*
  • Gene Regulatory Networks*
  • Genetic Predisposition to Disease
  • Genomics / methods*
  • Genomics / standards
  • Humans
  • Lung Neoplasms / genetics*
  • Lung Neoplasms / pathology
  • Male
  • Quality Control
  • Reproducibility of Results

Substances

  • Biomarkers, Tumor