A deep learning-based multi-model ensemble method for cancer prediction

Comput Methods Programs Biomed. 2018 Jan:153:1-9. doi: 10.1016/j.cmpb.2017.09.005. Epub 2017 Sep 14.

Abstract

Background and objective: Cancer is a complex worldwide health problem associated with high mortality. With the rapid development of high-throughput sequencing technology and the machine learning methods that have emerged in recent years, cancer prediction based on gene expression has made increasing progress, providing insight into effective and accurate treatment decision-making. Developing machine learning methods that can successfully distinguish cancer patients from healthy persons is therefore of great current interest. However, among the classification methods applied to cancer prediction so far, no single method outperforms all the others.

Methods: In this paper, we present a new strategy that applies deep learning to an ensemble of multiple different machine learning models. We supply informative gene data, selected by differential gene expression analysis, to five different classification models. A deep learning method is then employed to ensemble the outputs of the five classifiers.

Results: The proposed deep learning-based multi-model ensemble method was tested on three public RNA-seq data sets covering three kinds of cancer: Lung Adenocarcinoma, Stomach Adenocarcinoma, and Breast Invasive Carcinoma. On all of the tested RNA-seq data sets, it achieved higher cancer prediction accuracy than using a single classifier or the majority voting algorithm.
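For reference, the majority voting baseline mentioned in the Results can be written in a few lines. The sketch below uses hypothetical predictions (1 = tumor, 0 = normal) from five classifiers; the values are invented for illustration and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among one sample's base-classifier predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Each row holds the five classifiers' predicted labels for one sample.
votes = [
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
]

labels = [majority_vote(row) for row in votes]
print(labels)
```

Majority voting gives every classifier equal weight regardless of how reliable it is, whereas a learned combiner can, in principle, weight the classifiers differently.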

Conclusions: By taking full advantage of different classifiers, the proposed deep learning-based multi-model ensemble method is shown to be accurate and effective for cancer prediction.

Keywords: Cancer prediction; Deep learning; Feature selection; Gene expression; Multi-model ensemble.

Publication types

  • Validation Study

MeSH terms

  • Gene Expression
  • Humans
  • Machine Learning*
  • Models, Theoretical*
  • Neoplasms / diagnosis*
  • Neoplasms / genetics
  • Neural Networks, Computer