PVR-AFM: A Pathological Voice Repair System based on Non-linear Structure

J Voice. 2023 Sep;37(5):648-662. doi: 10.1016/j.jvoice.2021.05.010. Epub 2021 Jul 5.

Abstract

Objective: Speech signal processing has become an important technique for ensuring that voice interaction systems communicate accurately with users by improving the clarity or intelligibility of speech signals. However, most existing work focuses on processing the voices of typical speakers and ignores the communication needs of individuals with voice disorders, including people in voice-related professions, older people, and smokers. To meet this need, it is essential to design a non-invasive repair system that processes pathological voices.

Methods: In this paper, we propose a repair system for multiple vowels produced by speakers with vocal polyps, such as /a/, /i/ and /u/. We use a non-linear model based on an amplitude-modulation (AM) and frequency-modulation (FM) structure to extract the pitch and formants of the pathological voice. To address breaks and instability in the pitch contour, we provide a pitch extraction algorithm that keeps the pitch stable and avoids pitch-doubling errors caused by instability of the low-frequency signal. Furthermore, we design a formant reconstruction mechanism that determines the frequency and bandwidth needed to accomplish formant repair.
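The abstract does not say how the AM-FM components are demodulated. As an illustration only, the sketch below uses one widely known approach, the Teager-Kaiser energy operator with the DESA-2 energy separation algorithm, to recover the instantaneous amplitude and frequency of a single AM-FM component. The function names `teager` and `desa2` and the test tone are assumptions made for this example; they are not taken from the paper, and the paper's own extraction method may differ.

```python
import math


def teager(x, n):
    """Teager-Kaiser energy operator at sample n: x[n]^2 - x[n-1]*x[n+1]."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]


def desa2(x, fs):
    """Estimate instantaneous amplitude and frequency (Hz) with DESA-2.

    Illustrative sketch, not the paper's algorithm. Returns two lists that
    cover samples 2 .. len(x) - 3. Entries are NaN where the energy
    operator is non-positive and the estimate is undefined.
    """
    # Symmetric difference y[n] = x[n+1] - x[n-1], zero-padded at both ends.
    y = [0.0] + [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)] + [0.0]
    amps, freqs = [], []
    for n in range(2, len(x) - 2):
        psi_x = teager(x, n)
        psi_y = teager(y, n)
        if psi_x <= 0.0 or psi_y <= 0.0:
            amps.append(float("nan"))
            freqs.append(float("nan"))
            continue
        # For a sinusoid, 1 - psi_y / (2 * psi_x) equals cos(2 * omega).
        c = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
        omega = 0.5 * math.acos(c)  # radians per sample, valid below fs / 4
        freqs.append(omega * fs / (2.0 * math.pi))
        amps.append(2.0 * psi_x / math.sqrt(psi_y))
    return amps, freqs


# Hypothetical test signal: a steady 200 Hz tone with amplitude 0.5.
fs = 8000
tone = [0.5 * math.cos(2.0 * math.pi * 200.0 * n / fs) for n in range(400)]
amps, freqs = desa2(tone, fs)
```

On a steady tone the estimates are constant; on real speech the operator is usually applied after band-pass filtering around one formant or harmonic, because it assumes a single AM-FM component and is sensitive to noise.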

Results: Spectral inspection and objective indicators show that the system improves the intelligibility of pathological speech.

Keywords: AM-FM Structure; Multiple vowels; Pathological Voice Disorder; Voice Repair.

MeSH terms

  • Aged
  • Algorithms
  • Cognition
  • Humans
  • Speech
  • Voice Disorders* / diagnosis
  • Voice*