Automating annotation of information-giving for analysis of clinical conversation

Elijah Mayfield; M Barton Laws; Ira B Wilson; Carolyn Penstein Rosé

doi:10.1136/amiajnl-2013-001898

J Am Med Inform Assoc. 2014 Feb;21(e1):e122-8. doi: 10.1136/amiajnl-2013-001898. Epub 2013 Sep 12.

Authors

Elijah Mayfield¹, M Barton Laws, Ira B Wilson, Carolyn Penstein Rosé

Affiliation

¹ Language Technologies Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.

Abstract

Objective: Coding of clinical communication for fine-grained features such as speech acts has produced a substantial literature. However, annotation by humans is laborious and expensive, limiting application of these methods. We aimed to show that through machine learning, computers could code certain categories of speech acts with sufficient reliability to make useful distinctions among clinical encounters.

Materials and methods: The data were transcripts of 415 routine outpatient visits by patients with HIV. All transcripts had previously been coded for speech acts using the Generalized Medical Interaction Analysis System (GMIAS); 50 had also been coded for larger scale features using the Comprehensive Analysis of the Structure of Encounters System (CASES). We aggregated selected speech acts into information-giving and information-requesting categories, then trained a logistic regression classifier to annotate them automatically. We evaluated reliability by per-speech-act accuracy. We used multiple regression to predict patient reports of communication quality from post-visit surveys, using as predictors the patient and provider information-giving to information-requesting ratio (briefly, information-giving ratio) and patient gender.
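As an illustration of the information-giving ratio described above, the following minimal sketch counts information-giving and information-requesting acts for one speaker and divides the first count by the second. The label names (`give`, `request`, `other`), the `(speaker, label)` input format, and the choice to return `None` when a speaker makes no requests are all assumptions made for this example; the abstract does not specify how the ratio was computed or how undefined cases were handled.

```python
from collections import Counter

# Hypothetical label names; the abstract does not give the exact coding scheme.
GIVE = "give"
REQUEST = "request"


def information_giving_ratio(acts, speaker):
    """Ratio of information-giving to information-requesting acts for one speaker.

    `acts` is an iterable of (speaker, label) pairs. Returns None when the
    speaker made no requests, since the ratio is undefined in that case.
    """
    counts = Counter(label for who, label in acts if who == speaker)
    if counts[REQUEST] == 0:
        return None
    return counts[GIVE] / counts[REQUEST]


# Toy visit with invented annotations, for illustration only.
visit = [
    ("provider", "request"),
    ("patient", "give"),
    ("patient", "give"),
    ("provider", "give"),
    ("patient", "request"),
    ("provider", "give"),
    ("provider", "request"),
    ("patient", "other"),
]

provider_ratio = information_giving_ratio(visit, "provider")
patient_ratio = information_giving_ratio(visit, "patient")
```

The same function could be applied to human labels and to machine labels for each visit, which is the comparison the reported correlation between the two ratios refers to.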

Results: Automated coding showed moderate agreement with human coding (accuracy 71.2%, κ=0.57), with high correlation between machine and human prediction of the information-giving ratio (r=0.96). The regression significantly predicted four of five patient-reported measures of communication quality (r=0.263–0.344).
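For readers unfamiliar with the two agreement statistics reported above, the sketch below computes per-item accuracy and Cohen's κ for two label sequences. It uses the standard definition κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each annotator's label frequencies. The toy labels are invented for illustration and are not the study data; whether the authors computed κ in exactly this way is an assumption.

```python
from collections import Counter


def accuracy(human, machine):
    """Fraction of items on which the two label sequences agree."""
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def cohens_kappa(human, machine):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(human)
    p_o = accuracy(human, machine)
    human_counts = Counter(human)
    machine_counts = Counter(machine)
    # Chance agreement: sum over labels of the product of marginal proportions.
    p_e = sum(human_counts[c] * machine_counts[c] for c in human_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Invented labels for six speech acts, for illustration only.
human = ["give", "give", "request", "other", "give", "request"]
machine = ["give", "request", "request", "other", "give", "give"]

acc = accuracy(human, machine)
kappa = cohens_kappa(human, machine)
```

Because κ discounts agreement that would occur by chance, it is lower than raw accuracy whenever the label distribution makes chance agreement likely, which is why the two figures are reported together.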

Discussion: The information-giving ratio is a useful and intuitive measure for predicting patient perception of provider-patient communication quality. These predictions can be made with automated annotation, which is a practical option for studying large collections of clinical encounters with objectivity, consistency, and low cost, providing greater opportunity for training and reflection for care providers.

Keywords: Automated Annotation; Computational Linguistics; Machine Learning; Patient-provider Communication.

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Artificial Intelligence*
Clinical Coding / methods*
Communication
Electronic Data Processing*
HIV Infections
Humans
Medical Records / classification
Speech*
