Design and Validation of a New Diagnostic Tool for the Differentiation of Pathological Voices in Parkinsonian Patients

Adv Exp Med Biol. 2021:1339:77-83. doi: 10.1007/978-3-030-78787-5_11.

Abstract

Pathological speech, in its many forms, is a symptom of numerous serious diseases affecting millions of people worldwide, including more than 10 million patients with Parkinson's disease. Here, a method is proposed for detecting pathological speech using a two-dimensional (2D) convolutional neural network (CNN). Spectrograms are extracted from voice recordings of healthy subjects and of patients diagnosed with Parkinson's disease, and these spectrograms are fed into the CNN architecture. The voice samples are a subset of the benchmark mobile Parkinson Disease (mPower) study. The proposed model achieves 98% accuracy in Parkinson's disease detection (i.e., a two-class problem). Moreover, an average accuracy exceeding 94% is measured in binary tests (i.e., pathological versus healthy) conducted on the Saarbruecken Voice Database for six voice pathologies: dysphonia, functional dysphonia, hyperfunctional dysphonia, spasmodic dysphonia, vocal fold polyp, and dysody.
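
The spectrogram extraction step can be illustrated with a minimal sketch. The abstract does not state the spectrogram settings, so the frame length, hop size, Hann window, sampling rate, and the function name `log_spectrogram` below are all assumptions chosen for illustration, and a synthetic tone stands in for a voice recording. The sketch shows only how a one-dimensional signal becomes the 2D array that a 2D CNN could take as input; it is not the authors' pipeline.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    frame_len, hop, and the Hann window are illustrative assumptions,
    not settings reported in the abstract.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One-sided FFT per frame gives frame_len // 2 + 1 frequency bins.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; eps avoids log(0). Transpose to (freq, time).
    return 20.0 * np.log10(magnitude + eps).T


# Synthetic stand-in for a recording: 1 s of a 220 Hz tone at 8 kHz (assumed).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 220 * t)

spec = log_spectrogram(tone)
# Add batch and channel axes, the layout many 2D CNN libraries expect.
cnn_input = spec[np.newaxis, np.newaxis, :, :]
print(spec.shape, cnn_input.shape)
```

Each column of `spec` is the spectrum of one short frame, so the array can be treated as a single-channel image. A classifier such as the 2D CNN described above would then be trained on these arrays with healthy or pathological labels.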

Keywords: Audio classification; Convolutional neural network; Deep learning; Pathological speech; Saarbruecken voice database; Spectrogram; mPower study.

MeSH terms

  • Databases, Factual
  • Dysphonia*
  • Humans
  • Neural Networks, Computer
  • Parkinson Disease* / diagnosis
  • Voice*