Web-based and machine learning approaches for identification of patient-reported outcomes in inflammatory bowel disease

Dig Liver Dis. 2022 Apr;54(4):483-489. doi: 10.1016/j.dld.2021.09.005. Epub 2021 Sep 26.

Abstract

Background: Messages posted on an Internet forum are raw material that emerges in a natural setting (i.e., not induced by a research situation).

Aims: The FLARE-IBD project aimed to develop a patient-reported outcome measuring flare in inflammatory bowel disease that meets international requirements. It used an innovative approach: collecting messages posted by patients in an Internet forum and analyzing them in a machine-learning study (data analysis/language processing).

Methods: We used web-based and machine learning approaches in the following steps. 1) Web scraping was used to collect all available posts in an Internet forum (23 656 messages) and to extract metadata from the forum. 2) Twenty patients were randomly assigned 50 extracted messages; participants indicated whether or not each message corresponded to the flare phenomenon (labeling). If it did, participants were asked to identify excerpts from the text that they considered significant flare markers (annotation). 3) The set of annotated messages underwent a vocabulary analysis.
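
The labeling and vocabulary steps above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the record fields (`is_flare`, `excerpts`), the toy annotations, and the reading that each of the 20 patients received a separate batch of 50 messages are all assumptions made for illustration, and the vocabulary analysis is reduced here to a plain word-frequency count.

```python
import random
import re
from collections import Counter


def assign_messages(message_ids, n_annotators=20, per_annotator=50, seed=0):
    """Randomly give each annotator a disjoint batch of messages.

    Assumption: batches do not overlap; the abstract does not say so.
    """
    rng = random.Random(seed)
    sample = rng.sample(message_ids, n_annotators * per_annotator)
    return {
        annotator: sample[annotator * per_annotator:(annotator + 1) * per_annotator]
        for annotator in range(n_annotators)
    }


def excerpt_vocabulary(annotations):
    """Count word frequencies over excerpts taken from flare-labeled messages."""
    counts = Counter()
    for item in annotations:
        if not item["is_flare"]:
            continue  # messages not labeled as flare carry no annotated excerpts
        for excerpt in item["excerpts"]:
            counts.update(re.findall(r"\w+", excerpt.lower()))
    return counts


# 23 656 scraped messages, represented here only by hypothetical integer ids.
batches = assign_messages(list(range(23656)))

# Toy annotations standing in for participants' labels and excerpts.
toy_annotations = [
    {"is_flare": True, "excerpts": ["abdominal pain", "pain at night"]},
    {"is_flare": False, "excerpts": []},
]
vocab = excerpt_vocabulary(toy_annotations)
```

A fixed seed makes the random assignment reproducible. A real vocabulary analysis would likely add language-specific tokenization and stop-word handling, which this sketch leaves out.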

Results: The phenomenon of flare was circumscribed by identifying 20 surrogate flare markers, which were classified, with their frequency within the extracted labeled data, into five dimensions: impact on life, symptoms, extra-intestinal manifestations, drugs and environmental factors. Web-based and machine-learning approaches met international recommendations to inform the content and structure for the development of patient-reported outcomes.

Keywords: Inflammatory bowel disease; Machine learning; Patient-reported outcomes; World wide web-based approach.

Publication types

  • Randomized Controlled Trial

MeSH terms

  • Humans
  • Inflammatory Bowel Diseases* / diagnosis
  • Internet
  • Machine Learning*
  • Patient Reported Outcome Measures