Automatic Speaker Recognition System Based on Gaussian Mixture Models, Cepstral Analysis, and Genetic Selection of Distinctive Features

Kamil A Kamiński; Andrzej P Dobrowolski

doi:10.3390/s22239370

Sensors (Basel). 2022 Dec 1;22(23):9370. doi: 10.3390/s22239370.

Authors

Kamil A Kamiński¹ ², Andrzej P Dobrowolski³

Affiliations

¹ Institute of Optoelectronics, Military University of Technology, 2 Kaliski Street, 00-908 Warsaw, Poland.
² BITRES Sp. z o.o., 9/2 Chałubiński Street, 02-004 Warsaw, Poland.
³ Faculty of Electronics, Military University of Technology, 2 Kaliski Street, 00-908 Warsaw, Poland.

Abstract

This article presents the Automatic Speaker Recognition System (ASR System), which successfully resolves problems such as identification within an open set of speakers and the verification of speakers in difficult recording conditions similar to telephone transmission conditions. The article provides complete information on the architecture of the system's internal processing modules. The proposed speaker recognition system has been compared closely with competing systems and achieves improved speaker identification and verification results on a known, certified voice dataset. The ASR System owes this to the dual use of genetic algorithms, both in the feature selection process and in the optimization of the system's internal parameters. The proprietary feature generation and the corresponding classification process using Gaussian mixture models also contributed to these results. Together, these elements allowed the development of a system that makes an important contribution to the current state of the art in speaker recognition systems for telephone transmission applications with known speech coding standards.

Keywords: Gaussian mixture model; cepstral analysis; genetic algorithms; speaker recognition; system comparison; system identification; system verification.

MeSH terms

Recognition, Psychology
Selection, Genetic
Speech Perception*
Speech Recognition Software
Speech*

Grants and funding

This research received no external funding.