Speech intelligibility estimation using multi-resolution spectral features for speakers undergoing cancer treatment

J Acoust Soc Am. 2014 Oct;136(4):EL315-21. doi: 10.1121/1.4896410.

Abstract

Head and neck cancer can significantly hamper speech production, which often reduces speech intelligibility. A method of extracting spectral features is presented that uses a multi-resolution sinusoidal transform scheme, which enables better representation of spectral and harmonic characteristics. Regression methods were used to predict interval-scaled intelligibility scores of utterances in the NKI-CCRT speech corpus. Including these features lowered the mean squared estimation error from 0.43 to 0.39 on a scale from 1 to 7 (p < 0.001). For binary intelligibility classification, their inclusion yielded an improvement of 5.0 percentage points when tested on a disjoint set.
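
The evaluation described above, comparing mean squared error with and without an added feature set, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not specify the regression model, the feature dimensions, or the train/test split, so ordinary least squares, the array sizes, the synthetic data, and all names below (`fit_least_squares`, `predict`, `mse`, `baseline`, `spectral`) are assumptions made purely for illustration. The numbers it produces say nothing about the NKI-CCRT corpus.

```python
import numpy as np


def fit_least_squares(X, y):
    """Fit ordinary least squares with an intercept; returns the weight vector."""
    A = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def predict(X, w):
    """Predict scores and clip them to the 1-7 rating scale (an assumption)."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.clip(A @ w, 1.0, 7.0)


def mse(y_true, y_pred):
    """Mean squared estimation error."""
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic stand-ins only: random "features" and scores, not real speech data.
rng = np.random.default_rng(0)
n = 400
baseline = rng.normal(size=(n, 10))  # hypothetical baseline acoustic features
spectral = rng.normal(size=(n, 6))   # hypothetical added spectral features
scores = np.clip(
    4.0 + 0.8 * baseline[:, 0] + 0.6 * spectral[:, 0]
    + rng.normal(scale=0.5, size=n),
    1.0, 7.0,
)

# Disjoint train and test portions.
train, test = slice(0, 300), slice(300, n)
combined = np.hstack([baseline, spectral])

# Model 1: baseline features only.
w_base = fit_least_squares(baseline[train], scores[train])
mse_base = mse(scores[test], predict(baseline[test], w_base))

# Model 2: baseline plus the added feature set.
w_comb = fit_least_squares(combined[train], scores[train])
mse_comb = mse(scores[test], predict(combined[test], w_comb))

print(f"baseline-only MSE: {mse_base:.3f}")
print(f"combined MSE:      {mse_comb:.3f}")
```

Because the synthetic scores are constructed to depend on one of the added features, the combined model should show the lower held-out error here. On real data the size of any gain, and its statistical significance, would have to be established as the abstract reports.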

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Acoustics
  • Chemoradiotherapy* / adverse effects
  • Head and Neck Neoplasms / complications
  • Head and Neck Neoplasms / therapy*
  • Humans
  • Least-Squares Analysis
  • Principal Component Analysis
  • Regression Analysis
  • Signal Processing, Computer-Assisted
  • Sound Spectrography
  • Speech Acoustics*
  • Speech Intelligibility*
  • Speech Production Measurement / methods*
  • Voice Quality*