Caution regarding the specificities of pan-cancer microbial structure

Microb Genom. 2023 Aug;9(8):mgen001088. doi: 10.1099/mgen.0.001088.

Abstract

Results published in an article by Poore et al. (Nature. 2020;579:567-574) suggested that machine learning models can almost perfectly distinguish between tumour types based on their microbial composition. Whilst we believe that there is the potential for microbial composition to be used in this manner, we have concerns with the paper that make us question the certainty of the conclusions drawn. We believe there are issues with the contribution of contamination, the handling of batch effects, false positive classifications and limitations in the machine learning approaches used. These issues make it difficult to determine whether the authors have identified true biological signal and how robust these models would be in use as clinical biomarkers. We commend Poore et al. on their approach to open data and reproducibility, which has enabled this analysis. We hope that this discourse assists the future development of machine learning models and hypothesis generation in microbiome research.

Keywords: bacteria; cancer; contamination; machine learning; microbiome; viruses.

Publication types

  • Letter
  • Research Support, Non-U.S. Gov't
  • Comment

MeSH terms

  • Humans
  • Machine Learning
  • Microbiota*
  • Neoplasms*
  • Reproducibility of Results