Evaluation of Singer's Voice Quality by Means of Visual Pattern Recognition

J Voice. 2016 Jan;30(1):127.e21-30. doi: 10.1016/j.jvoice.2015.03.001. Epub 2015 Apr 30.

Abstract

The article describes an algorithm for assessing singing voice quality that uses selected methods from the field of digital image processing and recognition. It assumes that an audio recording of a vocal exercise can be converted into a visual representation and then processed as an image. The presented approach generates a sound spectrogram of each sample in the form of a rectangular matrix, objectively improves its visual quality through local changes in brightness and contrast, and scales it to a fixed size. It then proceeds in two steps: the construction of a representative database of reference samples, and the identification of test samples. The database is built using two-dimensional linear discriminant analysis. Recognition is then carried out in a reduced feature space obtained by two-dimensional Karhunen-Loève projection. Classification is performed by a variant of the Support Vector Machine approach. As shown, the results are very encouraging and are competitive with the most powerful state-of-the-art methods.
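The front end of the pipeline described above (audio signal to fixed-size spectrogram image) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `spectrogram_image`, the frame length, hop size, percentile limits, and the 64 x 64 output size are all hypothetical choices, and a simple global percentile contrast stretch stands in for the local brightness and contrast enhancement mentioned in the abstract, whose details are not given here.

```python
import numpy as np


def spectrogram_image(signal, frame_len=512, hop=128, out_shape=(64, 64)):
    """Turn a 1-D audio signal into a fixed-size grayscale image in [0, 1].

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    # Pad very short signals so that at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))

    # Short-time Fourier transform with a Hann window.
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Rows are frequency bins, columns are time frames.
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    log_mag = 20.0 * np.log10(magnitude + 1e-10)

    # Global percentile contrast stretch (a simplified stand-in for the
    # local brightness/contrast enhancement described in the abstract).
    lo, hi = np.percentile(log_mag, [5, 95])
    image = np.clip((log_mag - lo) / max(hi - lo, 1e-12), 0.0, 1.0)

    # Nearest-neighbour resampling to a fixed size.
    rows = np.linspace(0, image.shape[0] - 1, out_shape[0]).round().astype(int)
    cols = np.linspace(0, image.shape[1] - 1, out_shape[1]).round().astype(int)
    return image[np.ix_(rows, cols)]


# Usage: one second of a synthetic 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
tone_image = spectrogram_image(np.sin(2 * np.pi * 440.0 * t))
```

Each resulting matrix has the same shape regardless of recording length, which is what allows matrix-based methods such as two-dimensional linear discriminant analysis to operate on the samples directly, without flattening them into vectors first.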

Keywords: Image processing; Image recognition; Linear discriminant analysis; Short-time Fourier transform; Singing quality; Spectrogram; Support vector machine.

MeSH terms

  • Acoustics*
  • Algorithms*
  • Discriminant Analysis
  • Female
  • Fourier Analysis
  • Humans
  • Linear Models
  • Male
  • Pattern Recognition, Automated*
  • Signal Processing, Computer-Assisted*
  • Singing*
  • Sound Spectrography
  • Support Vector Machine
  • Voice Quality*