A technology prototype system for rating therapist empathy from audio recordings in addiction counseling

PeerJ Comput Sci. 2016 Apr:2:e59. doi: 10.7717/peerj-cs.59. Epub 2016 Apr 20.

Abstract

Scaling up psychotherapy services such as for addiction counseling is a critical societal need. One challenge is ensuring quality of therapy, due to the heavy cost of manual observational assessment. This work proposes a speech technology-based system to automate the assessment of therapist empathy-a key therapy quality index-from audio recordings of the psychotherapy interactions. We designed a speech processing system that includes voice activity detection and diarization modules, and an automatic speech recognizer plus a speaker role matching module to extract the therapist's language cues. We employed Maximum Entropy models, Maximum Likelihood language models, and a Lattice Rescoring method to characterize high vs. low empathic language. We estimated therapy-session level empathy codes using utterance level evidence obtained from these models. Our experiments showed that the fully automated system achieved a correlation of 0.643 between expert annotated empathy codes and machine-derived estimations, and an accuracy of 81% in classifying high vs. low empathy, in comparison to a 0.721 correlation and 86% accuracy in the oracle setting using manual transcripts. The results show that the system provides useful information that can contribute to automatic quality insurance and therapist training.

Keywords: Addiction counseling; Automatic speech recognition; Empathy modeling; Language modeling; Prototype system; Speech processing.