Multiple Vowels Repair Based on Pitch Extraction and Line Spectrum Pair Feature for Voice Disorder

IEEE J Biomed Health Inform. 2020 Jul;24(7):1940-1951. doi: 10.1109/JBHI.2020.2978103. Epub 2020 Mar 3.

Abstract

Voice disorders are increasingly common among voice-related professionals, elderly people and smokers, which makes pathological voice repair important. Previous work on pathological voice repair addressed only the sustained vowel /a/; repairing multiple vowels remains challenging because pitch extraction is unstable and formant reconstruction is unsatisfactory. This paper proposes a multiple-vowel repair method for voice disorders based on pitch extraction and the Line Spectrum Pair (LSP) feature. It broadens the scope of voice repair from the single vowel /a/ to the vowels /a/, /i/ and /u/, and repairs all three successfully. A deep neural network is used as a classifier to distinguish normal from pathological voices. The Wavelet Transform and the Hilbert-Huang Transform are applied for pitch extraction, and the formant is reconstructed from the LSP feature. The final repaired voice is obtained by synthesizing the pitch and the formant. The proposed method is validated on the Saarbrücken Voice Database (SVD). Three metrics, Segmental Signal-to-Noise Ratio, LSP distance measure and Mel cepstral distance measure, improve by 45.87%, 50.37% and 15.56%, respectively. In addition, a spectrogram-based visual analysis shows a prominent repair effect.
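As an illustration of the first metric named in the abstract, a Segmental Signal-to-Noise Ratio can be sketched as the mean of per-frame SNR values in decibels. The abstract does not give the paper's exact definition, so the following is only a minimal sketch of one common formulation: the function name `segmental_snr`, the non-overlapping frame length of 256 samples, and the clamping range of -10 dB to 35 dB are all assumptions made here for illustration, not details taken from the paper.

```python
import math


def segmental_snr(reference, processed, frame_len=256,
                  floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR (dB) of `processed` measured against `reference`.

    Assumptions (not from the paper): non-overlapping frames of
    `frame_len` samples, and per-frame values clamped to
    [floor_db, ceil_db] so silent or identical frames do not dominate.
    """
    if len(reference) != len(processed):
        raise ValueError("signals must have the same length")

    eps = 1e-12  # guards against division by zero and log of zero
    frame_snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        out = processed[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        noise_energy = sum((r - o) ** 2 for r, o in zip(ref, out))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        frame_snrs.append(min(max(snr_db, floor_db), ceil_db))

    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


# Toy usage: a synthetic sine stands in for a reference vowel, and a
# slightly attenuated copy stands in for a repaired voice.
reference = [math.sin(2 * math.pi * 5 * n / 256) for n in range(1024)]
attenuated = [0.9 * x for x in reference]
score = segmental_snr(reference, attenuated)
```

A higher value means the processed signal is closer to the reference frame by frame. Averaging per frame rather than over the whole recording keeps loud segments from masking errors in quiet ones, which is the usual motivation for the segmental variant.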

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Humans
  • Neural Networks, Computer
  • Sound Spectrography / methods*
  • Voice / physiology*
  • Voice Disorders / diagnosis*
  • Wavelet Analysis*