Construction of diagnosis system and gene regulatory networks based on microarray analysis

J Biomed Inform. 2018 May:81:61-73. doi: 10.1016/j.jbi.2018.03.008. Epub 2018 Mar 14.

Abstract

A microarray experiment typically yields expression data for thousands of genes, most of which are irrelevant to the disease of interest, which complicates the analysis of disease-specific genes. Filtering out a small set of essential genes and their regulatory networks is therefore critical, since a disease could then be diagnosed from the expression profiles of only a few critical genes. In this study, we developed a target gene screening (TGS) system, a microarray-based information system that integrates F-statistics, pattern recognition matching, a two-layer K-means classifier, a Parameter Detection Genetic Algorithm (PDGA), a genetic-based gene selector (GBG selector), and association rules, to screen out a small subset of genes that can discriminate the malignancy stages of cancers. In the first stage, the F-statistic, pattern recognition matching, and the two-layer K-means classifier were applied to filter out the 20 critical genes most relevant to ovarian cancer from 9600 genes, and the PDGA was used to determine the best-fitting parameter values for these critical genes. Of the 20 critical genes, 15 are associated with cancer progression. In the second stage, we employed the GBG selector and association rules to screen out seven target gene sets, each containing only four to six genes and each able to precisely identify the malignancy stage of ovarian cancer from its expression profile. We then deduced the gene regulatory networks of the 20 critical genes by using the Pearson correlation coefficient to evaluate the correlation between the expression of each gene within the same stage and across different stages. Correlations between gene pairs were calculated, and three regulatory networks were deduced. These correlations were further confirmed by Ingenuity Pathway Analysis.
The prognostic significance of the genes identified via the regulatory networks was examined using online tools, and most represent candidate biomarkers. In summary, our proposed system provides a new strategy to identify critical genes or biomarkers, as well as their regulatory networks, from microarray data.
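
As a rough illustration of two of the steps named in the abstract, the following sketch ranks genes by a one-way ANOVA F-statistic across stages and then lists gene pairs whose Pearson correlation exceeds a cutoff. It is a minimal sketch under stated assumptions, not the TGS system itself: the function names (`f_statistic`, `top_genes`, `correlation_edges`), the matrix layout (genes in rows, samples in columns), and the correlation threshold are all hypothetical, and the pattern recognition matching, two-layer K-means classifier, PDGA, GBG selector, and association rule steps are not reproduced here.

```python
import numpy as np


def f_statistic(expr, labels):
    """One-way ANOVA F-statistic for each gene.

    expr:   (n_genes, n_samples) expression matrix (assumed layout).
    labels: (n_samples,) stage label of each sample.
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    n_samples = expr.shape[1]
    n_groups = len(groups)

    grand_mean = expr.mean(axis=1)
    ss_between = np.zeros(expr.shape[0])
    ss_within = np.zeros(expr.shape[0])
    for g in groups:
        block = expr[:, labels == g]
        group_mean = block.mean(axis=1)
        # Variation of the stage mean around the overall mean.
        ss_between += block.shape[1] * (group_mean - grand_mean) ** 2
        # Variation of samples around their own stage mean.
        ss_within += ((block - group_mean[:, None]) ** 2).sum(axis=1)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_samples - n_groups)
    return ms_between / ms_within


def top_genes(expr, labels, n_keep):
    """Indices of the n_keep genes with the largest F-statistic."""
    scores = f_statistic(expr, labels)
    return np.argsort(scores)[::-1][:n_keep]


def correlation_edges(expr, threshold):
    """Gene pairs (i, j, r) whose absolute Pearson r is >= threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    r = np.corrcoef(np.asarray(expr, dtype=float))
    rows, cols = np.triu_indices_from(r, k=1)  # each unordered pair once
    keep = np.abs(r[rows, cols]) >= threshold
    return [(int(i), int(j), float(r[i, j]))
            for i, j in zip(rows[keep], cols[keep])]
```

A call such as `correlation_edges(expr[top_genes(expr, labels, 20)], 0.8)` would return candidate edges among the 20 highest-ranked genes. The sign of `r` separates positively from negatively co-expressed pairs, but correlation alone does not establish the direction of regulation.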

Keywords: Feature gene selection; Microarray analysis; Regulatory networks; Target gene; Two-layer classifier.

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / genetics*
  • Databases, Genetic
  • Female
  • Gene Expression Profiling*
  • Gene Regulatory Networks*
  • Humans
  • Kaplan-Meier Estimate
  • Neoplasm Staging / methods*
  • Oligonucleotide Array Sequence Analysis*
  • Ovarian Neoplasms / diagnosis
  • Ovarian Neoplasms / genetics*
  • Pattern Recognition, Automated
  • Prognosis

Substances

  • Biomarkers, Tumor