Cognitive analysis of metabolomics data for systems biology

Nat Protoc. 2021 Mar;16(3):1376-1418. doi: 10.1038/s41596-020-00455-4. Epub 2021 Jan 22.

Abstract

Cognitive computing is revolutionizing the way big data are processed and integrated, with artificial intelligence (AI)-based natural language processing (NLP) platforms helping researchers to efficiently search and digest the vast scientific literature. Most available platforms have been developed for biomedical researchers, but new NLP tools are emerging for biologists in other fields, metabolomics being an important example. NLP provides literature-based contextualization of metabolic features that decreases the time and expert-level subject knowledge required during the prioritization, identification and interpretation steps in the metabolomics data analysis pipeline. Here, we describe and demonstrate four workflows that combine metabolomics data with NLP-based literature searches of scientific databases to aid in the analysis of metabolomics data and their biological interpretation. The four procedures can be used in isolation or consecutively, depending on the research questions. The first, used for initial metabolite annotation and prioritization, creates a list of metabolites that would be interesting for follow-up. The second workflow finds literature evidence of the activity of metabolites and metabolic pathways in governing the biological condition on a systems biology level. The third is used to identify candidate biomarkers, and the fourth looks for metabolic conditions or drug-repurposing targets that two diseases have in common. The protocol can take 1-4 h or more to complete, depending on the processing time of the software used.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Artificial Intelligence
  • Big Data
  • Data Analysis
  • Databases, Factual
  • Humans
  • Mass Spectrometry
  • Metabolic Networks and Pathways
  • Metabolomics / methods*
  • Natural Language Processing*
  • Software
  • Systems Biology / methods*
  • Workflow