Evaluating Conversational Agents for Mental Health: Scoping Review of Outcomes and Outcome Measurement Instruments

J Med Internet Res. 2023 Apr 19:25:e44548. doi: 10.2196/44548.

Abstract

Background: The rapid proliferation of mental health interventions delivered through conversational agents (CAs) calls for high-quality evidence to support their implementation and adoption. Selecting appropriate outcomes, outcome measurement instruments, and assessment methods is crucial for ensuring that these interventions are evaluated effectively and to a high standard.

Objective: We aimed to identify the types of outcomes, outcome measurement instruments, and assessment methods used to assess the clinical, user experience, and technical outcomes in studies that evaluated the effectiveness of CA interventions for mental health.

Methods: We undertook a scoping review of the literature to examine the types of outcomes, outcome measurement instruments, and assessment methods in studies that evaluated the effectiveness of CA interventions for mental health. We performed a comprehensive search of electronic databases, including PubMed, Cochrane Central Register of Controlled Trials, Embase (Ovid), PsycINFO, and Web of Science, as well as Google Scholar and Google. We included experimental studies evaluating CA mental health interventions. Screening and data extraction were performed independently and in parallel by 2 review authors. Descriptive and thematic analyses of the findings were performed.

Results: We included 32 studies that targeted the promotion of mental well-being (17/32, 53%) and the treatment and monitoring of mental health symptoms (21/32, 66%). The studies reported 203 outcome measurement instruments used to measure clinical outcomes (123/203, 60.6%), user experience outcomes (75/203, 36.9%), technical outcomes (2/203, 1.0%), and other outcomes (3/203, 1.5%). Most of the outcome measurement instruments were used in only 1 study (150/203, 73.9%) and were self-reported questionnaires (170/203, 83.7%); they were most often delivered electronically via survey platforms (61/203, 30.0%). No validity evidence was cited for more than half of the outcome measurement instruments (107/203, 52.7%), and the instruments lacking such evidence were largely created or adapted for the study in which they were used (95/107, 88.8%).
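The proportions in the Results can be rechecked from the reported counts. The following is a minimal Python sketch, not part of the original study: the counts are copied from the paragraph above, while the variable names and the `share` helper are illustrative assumptions.

```python
# Counts copied from the Results paragraph; names are illustrative shorthand.
TOTAL_INSTRUMENTS = 203

instrument_counts = {
    "clinical": 123,
    "user experience": 75,
    "technical": 2,
    "other": 3,
}


def share(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# The four outcome categories should account for every instrument.
assert sum(instrument_counts.values()) == TOTAL_INSTRUMENTS

shares = {name: share(n, TOTAL_INSTRUMENTS) for name, n in instrument_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")

# Validity evidence: 107 of 203 instruments cited none; 95 of those 107
# were created or adapted for the study that used them.
no_validity = 107
study_specific = 95
print(f"no validity evidence cited: {share(no_validity, TOTAL_INSTRUMENTS)}%")
print(f"of those, study-specific: {share(study_specific, no_validity)}%")
```

Note that the last percentage uses 107 as its denominator rather than 203, which is why it is reported as 95/107 in the text.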

Conclusions: The diversity of outcomes and the choice of outcome measurement instruments employed in studies on CAs for mental health point to the need for an established minimum core outcome set and greater use of validated instruments. Future studies should also capitalize on the affordances made available by CAs and smartphones to streamline the evaluation and reduce participants' input burden inherent to self-reporting.

Keywords: chatbot; conversational agent; core outcome set; mHealth; mental health; mobile health; outcomes; taxonomy.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Communication
  • Humans
  • Mental Health*
  • Outcome Assessment, Health Care*