Identifying and Analyzing Topic Clusters in a Nutri-, Food-, and Diet-Proteomic Corpus Using Machine Reading

Nutrients. 2023 Jan 5;15(2):270. doi: 10.3390/nu15020270.

Abstract

Nutrition affects the early stages of disease development, but the mechanisms remain poorly understood. High-throughput proteomic methods are being used to generate data on the effects of nutrients, foods, and diets on health and disease processes. In this report, a novel machine reading pipeline was used to identify all articles and abstracts on proteomics, diet, food, and nutrition in humans. The resulting proteomic corpus was further analyzed to produce seven clusters of "thematic" content, each defined as a set of documents with similar word content. Example publications from several of these clusters were then described in the manner of a typical descriptive review.

Keywords: artificial intelligence; diet proteomics; food proteomics; machine reading; nutriproteomics; nutrition proteomics; transformer-based language model.

Publication types

  • Review

MeSH terms

  • Diet*
  • Food
  • Humans
  • Nutrients
  • Nutritional Status
  • Proteomics* / methods

Grants and funding

This research received no external funding.