Genetic algorithm to estimate the input parameters of Klatt and HLSyn formant-based speech synthesizers

Biosystems. 2016 Dec:150:190-193. doi: 10.1016/j.biosystems.2016.10.002. Epub 2016 Oct 18.

Abstract

Voice imitation consists of estimating a synthesizer's input parameters so that its output mimics a target speech signal. This is a difficult inverse problem because the mapping is time-varying, non-linear, and many-to-one. Done manually, it typically requires a considerable amount of time. This work presents the evolution of a system based on a genetic algorithm (GA) that automatically estimates the input parameters of the Klatt and HLSyn formant synthesizers using an analysis-by-synthesis process. Results are presented for natural (human-generated) speech from three male speakers. The GA-based system outperforms the baseline, Winsnoori, on four objective figures of merit and in a subjective test. The GA with the Klatt synthesizer generated voices similar to the target, and the subjective tests indicate an improvement in the quality of the synthetic voices compared with those produced by the baseline.
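
The analysis-by-synthesis loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_synth` is a hypothetical two-peak spectrum generator standing in for a real formant synthesizer (it is not Klatt or HLSyn), the error measure is a plain mean squared spectral distance rather than the paper's figures of merit, and the GA operators (tournament selection, blend crossover, Gaussian mutation, elitism) are common textbook choices, not necessarily those used by the authors.

```python
import math
import random

# Analysis bins from 100 Hz to 4 kHz (an arbitrary grid chosen for this sketch).
FREQS = [100.0 * k for k in range(1, 41)]
LO, HI = 200.0, 3500.0  # assumed search range for each formant frequency, in Hz


def toy_synth(params):
    """Hypothetical stand-in for a formant synthesizer.

    Maps a list of formant centre frequencies to a magnitude envelope made of
    one resonance peak per formant. Swapping the formants gives the same
    output, so the mapping is many-to-one.
    """
    bandwidth = 100.0
    return [sum(1.0 / (1.0 + ((f - fc) / bandwidth) ** 2) for fc in params)
            for f in FREQS]


def spectral_error(params, target):
    """Mean squared distance between the synthesized and target envelopes."""
    synth = toy_synth(params)
    return sum((s - t) ** 2 for s, t in zip(synth, target)) / len(target)


def estimate_parameters(target, pop_size=40, generations=60, seed=0):
    """Search for synthesizer parameters whose output matches the target."""
    rng = random.Random(seed)
    cost = lambda p: spectral_error(p, target)
    pop = [[rng.uniform(LO, HI), rng.uniform(LO, HI)] for _ in range(pop_size)]
    history = []  # best error seen in each generation

    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        next_pop = [pop[0][:]]  # elitism: carry the best individual over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            a = min(rng.sample(pop, 3), key=cost)
            b = min(rng.sample(pop, 3), key=cost)
            # Blend crossover followed by Gaussian mutation, clipped to range.
            w = rng.random()
            child = [w * x + (1.0 - w) * y for x, y in zip(a, b)]
            child = [min(HI, max(LO, g + rng.gauss(0.0, 50.0))) for g in child]
            next_pop.append(child)
        pop = next_pop

    best = min(pop, key=cost)
    history.append(cost(best))
    return best, history


if __name__ == "__main__":
    # The "target" is itself synthesized here, so a perfect match exists.
    target = toy_synth([700.0, 1200.0])
    best, history = estimate_parameters(target)
    print("estimated formants (Hz):", sorted(round(f) for f in best))
    print("error: first generation %.4f, last %.4f" % (history[0], history[-1]))
```

Because the toy synthesizer returns the same envelope when the two formants are swapped, the search may converge to either ordering, which is a small-scale instance of the many-to-one property mentioned in the abstract. A real system would also have to estimate parameters frame by frame, since the mapping is time-varying.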

Keywords: Evolutionary computation; Genetic algorithm; Speech synthesis.

MeSH terms

  • Algorithms*
  • Communication Aids for Disabled* / trends
  • Female
  • Humans
  • Male
  • Models, Genetic*
  • Speech* / physiology