Neuroevolution as a tool for microarray gene expression pattern identification in cancer research

J Biomed Inform. 2019 Jan:89:122-133. doi: 10.1016/j.jbi.2018.11.013. Epub 2018 Dec 3.

Abstract

Microarrays remain one of the major techniques employed to study cancer biology. However, identifying expression patterns in microarray datasets is still a significant challenge. In this work, a new approach using Neuroevolution, a machine learning field that combines neural networks and evolutionary computation, addresses this challenge by simultaneously classifying microarray data and selecting the subset of most relevant genes. The main algorithm, FS-NEAT, was adapted by adding new structural operators designed for this high-dimensional data. In addition, a rigorous filtering and preprocessing protocol was employed to select quality microarray datasets for the proposed method, yielding 13 datasets from three different cancer types. The results show that Neuroevolution classified microarray samples successfully in comparison with other methods in the literature, while also finding subsets of genes that generalize to other algorithms and carry relevant biological information. This approach detected 177 genes, of which 82 were validated as already associated with their respective cancer types and 44 were associated with other types of cancer, making them potential targets to be explored as cancer biomarkers. Five long non-coding RNAs were also detected, four of which have no described function yet. The expression patterns found are intrinsically related to the extracellular matrix, exosomes and cell proliferation. The results obtained in this work could help unravel the molecular mechanisms underlying the tumoral process and point to new potential targets to be explored in future work.
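
The idea of classifying samples and selecting genes at the same time can be illustrated with a minimal sketch in the spirit of FS-NEAT: each network starts with a single randomly chosen input connected to the output, structural mutations wire in further inputs, and the genes that end up connected form the selected subset. Everything in the block is an assumption made for illustration and is not taken from the paper: the names `Genome`, `initial_genome`, `mutate`, `accuracy` and `evolve`, the single thresholded output without hidden nodes, the truncation selection, and all parameter values. The new structural operators mentioned in the abstract are not described there and are not reproduced here.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Genome:
    """Hypothetical single-output network with direct input-to-output links only."""
    n_inputs: int
    weights: dict = field(default_factory=dict)  # input index -> connection weight

    def selected_features(self):
        # A feature (gene) counts as selected only if it is wired into the network.
        return sorted(self.weights)

    def predict(self, sample):
        # Weighted sum over the connected inputs, thresholded at zero.
        total = sum(w * sample[i] for i, w in self.weights.items())
        return 1 if total > 0 else 0


def initial_genome(n_inputs, rng):
    # FS-NEAT-style start: one randomly chosen input connected to the output.
    genome = Genome(n_inputs)
    genome.weights[rng.randrange(n_inputs)] = rng.uniform(-1, 1)
    return genome


def mutate(genome, rng, p_add=0.3):
    # Work on a copy so the parent is left unchanged.
    child = Genome(genome.n_inputs, dict(genome.weights))
    unused = [i for i in range(child.n_inputs) if i not in child.weights]
    if unused and rng.random() < p_add:
        # Structural mutation: connect a previously unused input.
        child.weights[rng.choice(unused)] = rng.uniform(-1, 1)
    else:
        # Weight mutation: perturb one existing connection.
        i = rng.choice(list(child.weights))
        child.weights[i] += rng.gauss(0, 0.5)
    return child


def accuracy(genome, samples, labels):
    hits = sum(genome.predict(x) == y for x, y in zip(samples, labels))
    return hits / len(labels)


def evolve(samples, labels, generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    n_inputs = len(samples[0])
    population = [initial_genome(n_inputs, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection (an assumption): keep the better half as parents.
        population.sort(key=lambda g: accuracy(g, samples, labels), reverse=True)
        parents = population[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=lambda g: accuracy(g, samples, labels))
```

Calling `evolve(samples, labels)` on a list of expression vectors with binary labels returns the best genome found, and `selected_features()` on that genome lists the indices of the inputs it uses. The sketch scores genomes on the training samples only; a real evaluation would need held-out data.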

Keywords: Cancer; FS-NEAT; Feature selection; Machine learning; Microarray; Neuroevolution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / genetics
  • Humans
  • Machine Learning*
  • Neoplasms / genetics*
  • Neural Networks, Computer*
  • Oligonucleotide Array Sequence Analysis*

Substances

  • Biomarkers, Tumor