Evaluation of Singer's Voice Quality by Means of Visual Pattern Recognition

Paweł Forczmański

doi:10.1016/j.jvoice.2015.03.001

J Voice. 2016 Jan;30(1):127.e21-30. doi: 10.1016/j.jvoice.2015.03.001. Epub 2015 Apr 30.

Author

Paweł Forczmański¹

Affiliation

¹ Faculty of Computer Science and Information Technology, West Pomeranian University of Technology, Szczecin, 52 Zolnierska St., 71-210 Szczecin, Poland. Electronic address: pforczmanski@wi.zut.edu.pl.

PMID: 25935835

Abstract

This article describes an algorithm for assessing singing voice quality that uses selected methods from digital image processing and recognition. It assumes that an audio recording of a vocal exercise can be converted into a visual representation and then processed as an image. The presented approach generates a sound spectrogram of each sample as a rectangular matrix, objectively improves its visual quality through local changes in brightness and contrast, and scales it to a fixed size. It then proceeds in two steps: construction of a representative database of reference samples, and identification of test samples. The database is built using two-dimensional linear discriminant analysis. Recognition is then carried out in a reduced feature space obtained by two-dimensional Karhunen-Loève projection, and classification is performed by a variant of the Support Vector Machine approach. The results are very encouraging and are competitive with the most powerful state-of-the-art methods.
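The front end of the pipeline described above (spectrogram, local contrast correction, fixed-size scaling, and a two-dimensional Karhunen-Loève projection) can be illustrated with a minimal NumPy sketch. This is not the author's implementation: the function names, frame length, hop size, block size, target image size, and number of retained components are all assumptions chosen for illustration, the block-wise min-max stretch is only a simple stand-in for the unspecified brightness and contrast correction, and the two-dimensional linear discriminant analysis and Support Vector Machine stages are omitted.

```python
import numpy as np

def spectrogram(signal, frame_len=512, hop=256):
    """Log-magnitude short-time Fourier transform as a (frequency x time) matrix.
    frame_len and hop are illustrative values, not taken from the article."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log1p(magnitude)

def enhance_contrast(img, block=16, eps=1e-8):
    """Block-wise min-max stretch to [0, 1]; an assumed stand-in for the
    local brightness/contrast correction mentioned in the abstract."""
    out = np.empty_like(img)
    for r in range(0, img.shape[0], block):
        for c in range(0, img.shape[1], block):
            tile = img[r:r + block, c:c + block]
            lo, hi = tile.min(), tile.max()
            out[r:r + block, c:c + block] = (tile - lo) / (hi - lo + eps)
    return out

def resize_nearest(img, shape=(64, 64)):
    """Nearest-neighbour scaling to a fixed size (the size is an assumption)."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[np.ix_(rows, cols)]

def fit_2d_klt(images, k_rows=8, k_cols=8):
    """Two-dimensional Karhunen-Loeve transform: leading eigenvectors of the
    row and column covariance matrices computed over all training images."""
    X = np.asarray(images, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    row_cov = np.einsum('nij,nkj->ik', centered, centered) / len(X)
    col_cov = np.einsum('nij,nik->jk', centered, centered) / len(X)
    _, U = np.linalg.eigh(row_cov)   # eigenvalues returned in ascending order
    _, V = np.linalg.eigh(col_cov)
    # Reverse the columns so the largest-variance directions come first.
    return mean, U[:, ::-1][:, :k_rows], V[:, ::-1][:, :k_cols]

def project(img, mean, U, V):
    """Reduce one image to a small (k_rows x k_cols) feature matrix."""
    return U.T @ (img - mean) @ V
```

In such a sketch, each recording would pass through `spectrogram`, `enhance_contrast`, and `resize_nearest` in turn; `fit_2d_klt` would be run once on the reference samples, and `project` would yield the reduced features handed to a classifier.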

Keywords: Image processing; Image recognition; Linear discriminant analysis; Short-time Fourier transform; Singing quality; Spectrogram; Support vector machine.

MeSH terms

Acoustics*
Algorithms*
Discriminant Analysis
Female
Fourier Analysis
Humans
Linear Models
Male
Pattern Recognition, Automated*
Signal Processing, Computer-Assisted*
Singing*
Sound Spectrography
Support Vector Machine
Voice Quality*