Integrative Deep Learning for Identifying Differentially Expressed (DE) Biomarkers

Comput Math Methods Med. 2019 Nov 2:2019:8418760. doi: 10.1155/2019/8418760. eCollection 2019.

Abstract

As large amounts of genetic data accumulate, effective analytical methods and meaningful interpretation are required. Various machine learning methods have recently emerged for processing genetic data, and machine learning analysis tools based on statistical models have also been proposed. In this study, we propose adding an integrated layer to the deep learning structure to enable effective analysis of genetic data and the discovery of significant disease biomarkers. We conducted a simulation study to compare the proposed method with the metalogistic regression and meta-SVM methods. An objective function with a lasso penalty is used for parameter estimation, and the Youden J index is used for model comparison. The simulation results indicate that the proposed method is more robust to the variance of the data than the metalogistic regression and meta-SVM methods. We also analyzed real data (TCGA breast cancer data). Gene set enrichment analysis showed that the TCGA multiple omics data involve significantly enriched pathways that contain information related to breast cancer. The proposed method is therefore expected to be helpful for discovering biomarkers.
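
The two quantities named in the abstract, a lasso-penalized objective and the Youden J index, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `lasso_objective` and `youden_j` are hypothetical. The use of binary cross-entropy as the base loss is an assumption, since the abstract states only that a lasso penalty is added to the objective. The Youden J index follows its standard definition, J = sensitivity + specificity - 1.

```python
import math


def lasso_objective(y_true, y_prob, weights, lam):
    """Mean binary cross-entropy plus an L1 (lasso) penalty on the weights.

    Assumption: cross-entropy is the base loss. The abstract only says that
    a lasso penalty is part of the objective; `lam` is the penalty strength.
    """
    eps = 1e-12  # clip probabilities to avoid log(0)
    loss = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    loss /= len(y_true)
    # The L1 term shrinks weights toward zero, which favors sparse models.
    return loss + lam * sum(abs(w) for w in weights)


def youden_j(y_true, y_pred):
    """Youden's J = sensitivity + specificity - 1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present in y_true.")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0
```

For example, with true labels `[1, 1, 1, 0, 0, 0, 0, 1]` and predictions `[1, 1, 0, 0, 0, 1, 0, 1]`, sensitivity and specificity are both 0.75, so `youden_j` returns 0.5. A value of 1 indicates perfect classification and a value of 0 indicates performance no better than chance, which is why the index is convenient for comparing models.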

MeSH terms

  • Algorithms
  • Biomarkers / analysis*
  • Breast Neoplasms / diagnostic imaging*
  • Breast Neoplasms / pathology*
  • Computer Simulation
  • Deep Learning*
  • Female
  • Gene Expression Profiling
  • Genomics
  • Humans
  • Neural Networks, Computer*
  • Regression Analysis*
  • Software
  • Support Vector Machine

Substances

  • Biomarkers