PVR-Vocoder: A Pathological Voice Repair Vocoder for Voice Disorders

IEEE J Biomed Health Inform. 2023 Dec 12:PP. doi: 10.1109/JBHI.2023.3340738. Online ahead of print.

Abstract

Vocoder-based speech synthesis has become a promising technique for meeting the demands of high-quality speech analysis, manipulation, and synthesis. However, most existing work focuses on synthesizing normal human voice with a high signal-to-noise ratio, neglecting individuals with pathological voice disorders in speech interaction. In this work, we propose a non-linear voice repair vocoder for pathological vowels and sentences, which takes pathological speech as input and generates high-quality repaired speech. Our approach is specifically designed to enhance speech quality and intelligibility for individuals with voice disorders. We employ amplitude-modulation frequency-modulation (AM-FM) and Teager energy operator techniques to enhance the quality of the pitch and the spectral envelope. To tackle the instability and fracture of the pitch contour, we present a spectral tracking algorithm, which not only avoids dramatic changes at the edges of voiced segments but also reduces half-pitch errors. Furthermore, we design a spectral reconstruction algorithm, which effectively rebuilds the spectral structure through an energy operation to accomplish spectral envelope repair. The proposed PVR-Vocoder shows strong performance in enhancing pathological voice intelligibility according to several quality measures, including objective indicators, subjective evaluation, and spectrum observations.
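The abstract names the Teager energy operator without defining it. As a minimal sketch, assuming the standard discrete Teager-Kaiser form psi[n] = x[n]^2 - x[n-1]*x[n+1] (the abstract does not state which variant the PVR-Vocoder uses), the operator can be written as below. The function name `teager_energy` and the test tone (200 Hz at a 16 kHz sampling rate, amplitude 0.5) are illustrative assumptions, not values from the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns one value per interior sample, so the output is two samples
    shorter than the input. Hypothetical helper, not the paper's code.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative pure tone (assumed values, not taken from the paper).
amplitude = 0.5
omega = 2 * math.pi * 200 / 16000  # digital frequency in radians per sample
signal = [amplitude * math.cos(omega * n) for n in range(64)]

psi = teager_energy(signal)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = (amplitude * math.sin(omega)) ** 2
```

Because the output depends on both amplitude and frequency, the operator responds to AM and FM changes together, which is the usual reason it is paired with AM-FM signal models. How the authors apply it to pitch and spectral envelope repair is described in the full paper, not in this abstract.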