Use of Automated Thematic Annotations for Small Data Sets in a Psychotherapeutic Context: Systematic Review of Machine Learning Algorithms

Alexandre Hudon; Mélissa Beaudoin; Kingsada Phraxayavong; Laura Dellazizzo; Stéphane Potvin; Alexandre Dumais

doi:10.2196/22651

JMIR Ment Health. 2021 Oct 22;8(10):e22651. doi: 10.2196/22651.

Authors

Alexandre Hudon¹ ², Mélissa Beaudoin¹ ², Kingsada Phraxayavong³, Laura Dellazizzo¹ ², Stéphane Potvin¹ ², Alexandre Dumais¹ ² ³ ⁴

Affiliations

¹ Centre de recherche de l'Institut Universitaire en Santé Mentale de Montréal, Montréal, QC, Canada.
² Department of Psychiatry and Addictology, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada.
³ Services et Recherches Psychiatriques AD, Montréal, QC, Canada.
⁴ Institut national de psychiatrie légale Philippe-Pinel, Montréal, QC, Canada.

PMID: 34677133
PMCID: PMC8571689
DOI: 10.2196/22651

Abstract

Background: A growing body of literature has detailed the use of qualitative analyses to measure the therapeutic processes and intrinsic effectiveness of psychotherapies, which yield small databases. Nonetheless, these approaches have several limitations, and machine learning algorithms are needed to address them.

Objective: The objective of this study is to conduct a systematic review of the use of machine learning for automated text classification for small data sets in the fields of psychiatry, psychology, and social sciences. This review will identify available algorithms and assess whether automated classification of textual entities is comparable to classification done by human evaluators.

Methods: A systematic search was performed in the electronic databases of Medline, Web of Science, PsycNet (PsycINFO), and Google Scholar from their inception dates to 2021. The fields of psychiatry, psychology, and social sciences were selected as they include a vast array of textual entities in the domain of mental health that can be reviewed. Additional records identified through cross-referencing were used to find other studies.

Results: This literature search identified 5442 articles that were eligible for our study after the removal of duplicates. Following abstract screening, 114 full articles were assessed in their entirety, of which 107 were excluded. The remaining 7 studies were analyzed. Classification algorithms such as naive Bayes, decision tree, and support vector machine classifiers were identified. The support vector machine was the most used and best-performing algorithm in the identified articles. Prediction classification scores for the identified algorithms ranged from 53% to 91% for the classification of textual entities into 4 to 7 categories. In addition, 3 of the 7 studies reported an interjudge agreement statistic; these were consistent with agreement statistics for text classification done by human evaluators.
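To illustrate how automated labels can be compared with a human evaluator's labels, the following minimal Python sketch computes a raw classification score (accuracy) and a chance-corrected agreement statistic. The abstract does not say which interjudge agreement statistic the reviewed studies used; Cohen's kappa is assumed here only because it is a common choice. The category names and label lists are hypothetical and are not data from the review.

```python
from collections import Counter


def accuracy(human, machine):
    """Fraction of items where the automated label matches the human label."""
    if len(human) != len(machine) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if both raters labeled independently,
    # each following their own category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for 8 transcript segments (illustration only).
human = ["anger", "fear", "anger", "sadness", "fear", "anger", "sadness", "fear"]
machine = ["anger", "fear", "fear", "sadness", "fear", "anger", "anger", "fear"]

print(accuracy(human, machine))
print(round(cohen_kappa(human, machine), 2))
```

On these made-up labels the accuracy is 0.75 and kappa is about 0.61. Kappa is lower because it discounts the agreement that two raters would reach by chance given how often each uses every category.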

Conclusions: A systematic review of available machine learning algorithms for automated text classification for small data sets in several fields (psychiatry, psychology, and social sciences) was conducted. We compared automated classification with classification done by human evaluators. Our results show that it is possible to automatically classify textual entities of a transcript based solely on small databases. Future studies are nevertheless needed to assess whether such algorithms can be implemented in the context of psychotherapies.

Keywords: artificial intelligence; automated text classification; machine learning; psychotherapy; systematic review.

©Alexandre Hudon, Mélissa Beaudoin, Kingsada Phraxayavong, Laura Dellazizzo, Stéphane Potvin, Alexandre Dumais. Originally published in JMIR Mental Health (https://mental.jmir.org), 22.10.2021.

Publication types

Review