Objective Quality and Intelligibility Prediction for Users of Assistive Listening Devices

Tiago H Falk; Vijay Parsa; João F Santos; Kathryn Arehart; Oldooz Hazrati; Rainer Huber; James M Kates; Susan Scollie

doi:10.1109/MSP.2014.2358871

IEEE Signal Process Mag. 2015 Mar;32(2):114-124. doi: 10.1109/MSP.2014.2358871.

Authors

Tiago H Falk¹, Vijay Parsa², João F Santos¹, Kathryn Arehart³, Oldooz Hazrati⁴, Rainer Huber⁵, James M Kates³, Susan Scollie²

Affiliations

¹ INRS-EMT, University of Quebec, Montreal, QC, Canada.
² University of Western Ontario, National Centre for Audiology, London, ON, Canada.
³ Dept. Speech Language and Hearing Sciences, University of Colorado, Boulder, CO, USA.
⁴ Dept. Electrical Engineering, The University of Texas at Dallas, Richardson, TX, USA.
⁵ Center of Competence HörTech and Cluster of Excellence Hearing4All, Oldenburg, Germany.

Abstract

This article presents an overview of twelve existing objective speech quality and intelligibility prediction tools. Two classes of algorithms are presented: intrusive algorithms, which require a reference signal, and non-intrusive algorithms, which do not. The investigated metrics include those developed for normal-hearing listeners as well as those tailored to hearing-impaired (HI) listeners who use assistive listening devices, i.e., hearing aids (HAs) and cochlear implants (CIs). Representative examples of metrics optimized for HI listeners include the speech-to-reverberation modulation energy ratio, tailored to hearing aids (SRMR-HA) and to cochlear implants (SRMR-CI); the modulation spectrum area (ModA); the hearing aid speech quality index (HASQI) and hearing aid speech perception index (HASPI); and the PErception MOdel - hearing impairment quality (PEMO-Q-HI). The objective metrics are tested on three subjectively rated speech datasets covering reverberation-alone, noise-alone, and reverberation-plus-noise degradation conditions, as well as degradations resulting from nonlinear frequency compression and different speech enhancement strategies. The advantages and limitations of each measure are highlighted, and recommendations are given for using the different tools under specific environmental and processing conditions.

Grants and funding