Automatic Speaker Recognition System Based on Gaussian Mixture Models, Cepstral Analysis, and Genetic Selection of Distinctive Features

Sensors (Basel). 2022 Dec 1;22(23):9370. doi: 10.3390/s22239370.

Abstract

This article presents the Automatic Speaker Recognition System (ASR System), which successfully handles problems such as identification within an open set of speakers and speaker verification under difficult recording conditions similar to those of telephone transmission. The article describes in detail the architecture of the ASR System's internal processing modules. The proposed system was compared closely with competing systems and achieved improved speaker identification and verification results on a known, certified voice dataset. The ASR System owes this to the dual use of genetic algorithms, both in the feature selection process and in the optimization of the system's internal parameters. The proprietary feature generation and the corresponding classification process using Gaussian mixture models also contributed to this result. Together, these allowed the development of a system that makes an important contribution to the current state of the art in speaker recognition for telephone transmission applications with known speech coding standards.
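To make the open-set identification idea concrete, the following is a minimal sketch, not the ASR System's actual implementation. It assumes each enrolled speaker is represented by a small diagonal-covariance Gaussian mixture model over feature vectors (standing in for cepstral features), scores an utterance by its average per-frame log-likelihood, and rejects the utterance as an unknown speaker when the best score falls below a threshold. The model parameters, the two-dimensional features, the names `SPEAKER_MODELS` and `identify_open_set`, and the threshold value are all hypothetical and chosen only for illustration.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM.

    The GMM is a list of (weight, mean, variance) triples. The sum over
    components is computed with the log-sum-exp trick for numerical stability.
    """
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def average_score(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one speaker model."""
    return sum(gmm_log_likelihood(f, gmm) for f in frames) / len(frames)


def identify_open_set(frames, speaker_models, threshold):
    """Return (speaker name or None, all scores).

    None means the best-scoring model still falls below the threshold, so the
    utterance is treated as coming from a speaker outside the enrolled set.
    """
    scores = {name: average_score(frames, gmm) for name, gmm in speaker_models.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores


# Hypothetical two-component models over 2-D features (illustration only).
SPEAKER_MODELS = {
    "speaker_a": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (1.0, 1.0), (1.0, 1.0))],
    "speaker_b": [(0.5, (5.0, 5.0), (1.0, 1.0)), (0.5, (6.0, 6.0), (1.0, 1.0))],
}
THRESHOLD = -5.0  # assumed rejection threshold on the average log-likelihood

if __name__ == "__main__":
    known, _ = identify_open_set([(0.2, 0.1), (0.8, 0.9)], SPEAKER_MODELS, THRESHOLD)
    unknown, _ = identify_open_set([(20.0, -20.0)], SPEAKER_MODELS, THRESHOLD)
    print(known, unknown)
```

In a real system the mixture parameters would be estimated from enrollment recordings and the threshold tuned on held-out data; the sketch only shows how a rejection threshold turns closed-set scoring into an open-set decision.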

Keywords: Gaussian mixture model; cepstral analysis; genetic algorithms; speaker recognition; system comparison; system identification; system verification.

MeSH terms

  • Recognition, Psychology
  • Selection, Genetic
  • Speech Perception*
  • Speech Recognition Software
  • Speech*

Grants and funding

This research received no external funding.