Wx: a neural network-based feature selection algorithm for transcriptomic data

Sci Rep. 2019 Jul 19;9(1):10500. doi: 10.1038/s41598-019-47016-8.

Abstract

Next-generation sequencing (NGS), which allows the simultaneous sequencing of billions of DNA fragments, has revolutionized how we study genomics and molecular biology by generating genome-wide maps of molecules of interest. However, the amount of information produced by NGS has made it difficult for researchers to choose the optimal set of genes. We have sought to resolve this issue by developing a neural network-based feature (gene) selection algorithm called Wx. The Wx algorithm ranks genes by a discriminative index (DI) score that represents their classification power for distinguishing given groups. With a gene list ranked by DI score, researchers can intuitively select the optimal set of genes from the highest-ranking ones. We applied the Wx algorithm to a TCGA pan-cancer gene-expression cohort to identify an optimal set of gene-expression biomarker candidates that can distinguish cancer samples from normal samples for 12 different types of cancer. The 14 gene-expression biomarker candidates identified by Wx were comparable to or outperformed previously reported universal gene-expression biomarkers, highlighting the usefulness of the Wx algorithm for next-generation sequencing data. Thus, we anticipate that the Wx algorithm can complement current state-of-the-art analytical applications as an alternative method for identifying biomarker candidates. The stand-alone and web versions of the Wx algorithm are available at https://github.com/deargen/DearWXpub and https://wx.deargendev.me/, respectively.
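
The abstract does not state how the DI score is computed, so the following is only a minimal sketch of the general idea of ranking genes with a trained neural network. It assumes a single-layer softmax classifier and a hypothetical score that combines each gene's learned class weights with its per-class mean expression. The function names (`train_softmax`, `rank_genes`), the score formula, and the synthetic data are all illustrative assumptions, not the authors' implementation; the repository linked above is the authoritative source.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, y, n_classes, lr=0.1, epochs=200):
    """Fit a single-layer softmax classifier with plain gradient descent."""
    n_samples, n_genes = X.shape
    W = np.zeros((n_genes, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]              # one-hot labels
    for _ in range(epochs):
        P = softmax(X @ W + b)
        grad = (P - Y) / n_samples        # gradient of cross-entropy w.r.t. logits
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

def rank_genes(X, y, W):
    """Rank genes by an ASSUMED discriminative score (not taken from the abstract):
    |w_j,1 * mean_1(x_j) - w_j,0 * mean_0(x_j)| for a two-group problem."""
    mean0 = X[y == 0].mean(axis=0)
    mean1 = X[y == 1].mean(axis=0)
    score = np.abs(W[:, 1] * mean1 - W[:, 0] * mean0)
    order = np.argsort(score)[::-1]       # highest score first
    return order, score

# Synthetic, non-negative "expression" data: 20 genes, 50 samples per group.
rng = np.random.default_rng(0)
n_per_class, n_genes = 50, 20
X = rng.uniform(0.0, 1.0, size=(2 * n_per_class, n_genes))
y = np.array([0] * n_per_class + [1] * n_per_class)
X[y == 1, 3] += 1.0                       # make gene 3 informative by construction

W, b = train_softmax(X, y, n_classes=2)
order, score = rank_genes(X, y, W)
print("Top-ranked gene indices:", order[:5])
```

In this toy setting, the gene that was made informative by construction should appear at the top of the ranking, and a researcher would then pick the first few entries of `order` as candidates. Real transcriptomic data would additionally need normalization and a held-out evaluation of the selected genes.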

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Biomarkers, Tumor
  • Datasets as Topic
  • Gene Expression Profiling
  • Genes, Neoplasm*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Neoplasms / genetics
  • Neural Networks, Computer*
  • RNA, Neoplasm / genetics
  • Transcriptome*

Substances

  • Biomarkers, Tumor
  • RNA, Neoplasm