Identification of an Efficient Gene Expression Panel for Glioblastoma Classification

PLoS One. 2016 Nov 17;11(11):e0164649. doi: 10.1371/journal.pone.0164649. eCollection 2016.

Abstract

We present a novel genetic algorithm-based random forest (GARF) modeling technique that reduces large, complex gene disease signatures to highly accurate, greatly simplified gene panels. When applied to 803 glioblastoma multiforme samples, this method reduced the 840-gene Verhaak et al. gene panel (the standard in the field) to a 48-gene classifier while retaining 90.91% classification accuracy and outperforming the best available alternative methods. Using the same approach, we also produced a 32-gene panel that gives better consistency between RNA-seq and microarray-based classifications, improving cross-platform classification retention from 69.67% to 86.07%. A webpage producing these classifications is available at http://simplegbm.semel.ucla.edu.
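The general idea of searching for a small gene panel with a genetic algorithm can be illustrated with a minimal sketch. This is not the GARF implementation: the abstract does not give the algorithm's details, so everything below is an assumption made for illustration. In particular, a nearest-centroid classifier stands in for the random forest to keep the example dependency-free, the data are synthetic, and the function names, population size, mutation rate, and panel-size penalty are all hypothetical.

```python
import random

def make_toy_data(n_samples=120, n_genes=30, n_informative=4, seed=0):
    """Synthetic expression matrix (assumption): only the first
    n_informative genes separate the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so any contiguous split is balanced
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 2.0 if label else -2.0
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_test, y_test, genes):
    """Hold-out accuracy of a nearest-centroid classifier restricted to `genes`.
    Stand-in for a random forest (assumption, chosen to avoid dependencies)."""
    if not genes:
        return 0.0
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    correct = 0
    for x, t in zip(X_test, y_test):
        pred = min(
            centroids,
            key=lambda c: sum((x[g] - m) ** 2 for g, m in zip(genes, centroids[c])),
        )
        correct += pred == t
    return correct / len(y_test)

def ga_select(X, y, pop_size=20, generations=15, size_penalty=0.005, seed=1):
    """Genetic search over boolean gene masks; fitness rewards accuracy
    and penalizes panel size (all hyperparameters are illustrative)."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    split = len(X) // 2
    X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]

    def fitness(mask):
        genes = [g for g, keep in enumerate(mask) if keep]
        return centroid_accuracy(X_tr, y_tr, X_te, y_te, genes) - size_penalty * len(genes)

    population = [[rng.random() < 0.5 for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]          # truncation selection
        children = [ranked[0]]                     # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)        # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not bit) if rng.random() < 0.02 else bit for bit in child]
            children.append(child)
        population = children

    best = max(population, key=fitness)
    genes = [g for g, keep in enumerate(best) if keep]
    return genes, centroid_accuracy(X_tr, y_tr, X_te, y_te, genes)

X, y = make_toy_data()
panel, accuracy = ga_select(X, y)
print(f"selected {len(panel)} of {len(X[0])} genes, hold-out accuracy {accuracy:.2%}")
```

Because the same hold-out split drives both the search and the reported accuracy, the printed figure is optimistic. An honest estimate would need a further validation set that the search never sees.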

MeSH terms

  • Algorithms
  • Brain Neoplasms / genetics*
  • Brain Neoplasms / mortality
  • Computational Biology / methods*
  • Datasets as Topic
  • Gene Expression Profiling* / methods
  • Genomics / methods
  • Glioblastoma / genetics*
  • Glioblastoma / mortality
  • Humans
  • Kaplan-Meier Estimate
  • Molecular Sequence Annotation
  • Prognosis
  • Reproducibility of Results
  • Transcriptome*
  • Web Browser