Detection of affective states from text and speech for real-time human-computer interaction

Ricardo A Calix; Leili Javadpour; Gerald M Knapp

doi:10.1177/0018720811425922

Hum Factors. 2012 Aug;54(4):530-45. doi: 10.1177/0018720811425922.

Authors

Ricardo A Calix¹, Leili Javadpour, Gerald M Knapp

Affiliation

¹ Louisiana State University, Baton Rouge, Louisiana 70803, USA.

PMID: 22908677
DOI: 10.1177/0018720811425922

Abstract

Objective: The goal of this work is to develop and test an automated methodology that can detect emotion from text and speech features.

Background: Affective human-computer interaction will be critical for the success of new systems that will be prevalent in the 21st century. Such systems will need to properly deduce human emotional state before they can determine how to best interact with people.

Method: Corpora and machine learning classification models are used to train and test a methodology for emotion detection. The methodology detects sentiment in sentences stepwise: it first filters out neutral sentences, then distinguishes among positive, negative, and five emotion classes in the sentences that remain.
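
The stepwise approach described in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifiers or features, so the toy keyword lexicon, the label strings, and the function names (`classify_stepwise`, `toy_is_emotional`, `toy_label`) are all hypothetical stand-ins for trained models. Only the two-stage control flow (filter neutral sentences first, then label the rest) reflects the text.

```python
from typing import Callable, List

NEUTRAL = "neutral"

# Toy lexicon standing in for trained classifiers (an assumption, not from the study).
TOY_LEXICON = {
    "happy": "positive",
    "love": "positive",
    "angry": "negative",
    "sad": "negative",
}


def _tokens(sentence: str) -> List[str]:
    """Lowercase the sentence and strip basic punctuation from each word."""
    return [w.strip(".,!?;:") for w in sentence.lower().split()]


def toy_is_emotional(sentence: str) -> bool:
    """Stage 1 stand-in: a sentence is emotional if any word is in the lexicon."""
    return any(w in TOY_LEXICON for w in _tokens(sentence))


def toy_label(sentence: str) -> str:
    """Stage 2 stand-in: return the label of the first lexicon word found."""
    for w in _tokens(sentence):
        if w in TOY_LEXICON:
            return TOY_LEXICON[w]
    return NEUTRAL


def classify_stepwise(
    sentence: str,
    is_emotional: Callable[[str], bool] = toy_is_emotional,
    label_emotional: Callable[[str], str] = toy_label,
) -> str:
    """Filter out neutral sentences first, then label only the remainder."""
    if not is_emotional(sentence):
        return NEUTRAL
    return label_emotional(sentence)


print(classify_stepwise("The meeting is at noon."))  # neutral
print(classify_stepwise("I am so angry!"))           # negative
```

In a real system, each stage would be a trained model over text and speech features, and the second stage could itself be split into polarity and emotion-class steps; passing the stages in as callables keeps that choice open.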

Results: Classification between emotion and neutral sentences achieved recall accuracies as high as 77% on the University of Illinois at Urbana-Champaign (UIUC) corpus and 61% on the Louisiana State University medical drama (LSU-MD) corpus for emotion samples. Once neutral sentences were filtered out, the methodology detected negative sentences with accuracy scores as high as 92.3%.
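
As a reading aid for the Results, the sketch below shows how a per-class recall figure is conventionally computed, assuming "recall accuracies ... for emotion samples" refers to recall on the emotion class: the fraction of truly emotional sentences that the classifier labels as emotional. The label lists are made-up toy data, not values from either corpus, and the function name `recall` is illustrative.

```python
from typing import Sequence


def recall(y_true: Sequence[str], y_pred: Sequence[str], positive_label: str) -> float:
    """Recall = true positives / (true positives + false negatives)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive_label and p == positive_label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive_label and p != positive_label)
    if tp + fn == 0:
        raise ValueError("no samples with the positive label in y_true")
    return tp / (tp + fn)


# Toy labels for illustration only (not data from the study).
y_true = ["emotion", "emotion", "emotion", "neutral", "neutral"]
y_pred = ["emotion", "emotion", "neutral", "neutral", "emotion"]

print(f"{recall(y_true, y_pred, 'emotion'):.3f}")  # 0.667
```

Note that recall ignores neutral sentences wrongly flagged as emotional; those affect precision, which the abstract does not report.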

Conclusion: Results of the feature analysis indicate that speech spectral features are better than speech prosodic features for emotion detection. Accumulated sentiment composition text features appear to be very important as well. This work contributes to the study of human communication by providing a better understanding of how language factors help to best convey human emotion and how to best automate this process.

Application: Results of this study can be used to develop better automated assistive systems that interpret human language and respond to emotions through 3-D computer graphics.

MeSH terms

Emotions
Female
Humans
Language
Male
Models, Theoretical
Tape Recording
Text Messaging*
United States
User-Computer Interface*
Verbal Behavior*