MetaPheno: A critical evaluation of deep learning and machine learning in metagenome-based disease prediction

Methods. 2019 Aug 15:166:74-82. doi: 10.1016/j.ymeth.2019.03.003. Epub 2019 Mar 16.

Abstract

The human microbiome plays a number of critical roles, impacting almost every aspect of human health and well-being. Conditions in the microbiome have been linked to a number of significant diseases. Additionally, revolutions in sequencing technology have led to a rapid increase in publicly available sequencing data. Consequently, there have been growing efforts to predict disease status from metagenomic sequencing data, with a proliferation of new approaches in the last few years. Some of these efforts have explored deep learning, a powerful form of machine learning that has been applied successfully in several biological domains. Here, we review some of these methods and the algorithms that they are based on, with a particular focus on deep learning methods. We also perform a deeper analysis of Type 2 Diabetes and obesity datasets, on which improved results have remained elusive, using a variety of machine learning and feature extraction methods. We conclude by offering perspectives on study design considerations that may impact results and on future directions the field can take to improve results and offer more valuable conclusions. The scripts and extracted features for the analyses conducted in this paper are available via GitHub: https://github.com/nlapier2/metapheno.

Keywords: Deep learning; Machine learning; Metagenomics; Phenotype prediction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Deep Learning*
  • Diabetes Mellitus, Type 2 / genetics*
  • Diabetes Mellitus, Type 2 / microbiology
  • Humans
  • Machine Learning / statistics & numerical data
  • Metagenome / genetics*
  • Metagenomics / methods
  • Microbiota / genetics
  • Obesity / genetics*
  • Obesity / microbiology