Impacts of multicollinearity on CAPT modalities: An heterogeneous machine learning framework for computer-assisted French phoneme pronunciation training

Yanjing Bi; Chao Li; Yannick Benezeth; Fan Yang

doi:10.1371/journal.pone.0257901

PLoS One. 2021 Oct 18;16(10):e0257901. doi: 10.1371/journal.pone.0257901. eCollection 2021.

Authors

Yanjing Bi¹, Chao Li²,³, Yannick Benezeth⁴, Fan Yang⁴

Affiliations

¹ School of Foreign Studies, Capital University of Economics and Business, Beijing, China.
² Institute of Acoustics, Chinese Academy of Sciences, Beijing, China.
³ University of Chinese Academy of Sciences, Beijing, China.
⁴ Laboratory ImViA, Université Bourgogne Franche-Comté, Dijon, Burgundy, France.

Abstract

Phoneme pronunciation is usually considered a basic skill in learning a foreign language. Practicing pronunciation in a computer-assisted way is helpful in self-directed or distance learning environments. Recent research indicates that machine learning is a promising method for building high-performance computer-assisted pronunciation training (CAPT) modalities. Many data-driven classification models, such as support vector machines, back-propagation networks, deep neural networks and convolutional neural networks, are increasingly used for this purpose. Yet the acoustic waveforms of phonemes are essentially modulated from the base vibrations of the vocal cords, which makes the predictors collinear and distorts the classification models. A commonly used solution is to suppress the collinearity of the predictors via the partial least squares regression algorithm, which yields high-quality predictor weights through analysis of the relationships among predictors. However, being linear regressors, classifiers of this type have very simple topologies, which limits their generality. To address this issue, this paper presents a heterogeneous phoneme recognition framework that further benefits phoneme pronunciation diagnostic tasks by combining partial least squares with support vector machines. A French phoneme data set containing 4830 samples is established for the evaluation experiments. The experiments demonstrate that the new method improves the accuracy of the phoneme classifiers by 0.21-8.47% compared with state-of-the-art methods at different training data densities.
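The idea of feeding a partial least squares (PLS) projection into a support vector machine (SVM) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not say how the two models are coupled, so using PLS scores as SVM inputs is an assumption. The toy data, all function names, the single PLS1 component, and the plain subgradient-descent linear SVM are likewise hypothetical choices made only for illustration.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def make_toy_data(n_per_class=20, seed=0):
    """Synthetic two-class data (assumption, not the paper's phoneme set).

    x1 and x2 are driven by the same base signal, so they are collinear;
    x3 is unrelated noise.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            base = label + rng.uniform(-0.3, 0.3)
            x2 = 2 * base + rng.uniform(-0.05, 0.05)
            X.append([base, x2, rng.uniform(-1, 1)])
            y.append(label)
    return X, y


def pls1_first_component(X, y):
    """First PLS1 weight vector (proportional to Xc^T yc) and scores Xc w."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    w = [sum(Xc[i][j] * (y[i] - y_mean) for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(wj * wj for wj in w))
    w = [wj / norm for wj in w]
    scores = [sum(row[j] * w[j] for j in range(p)) for row in Xc]
    return w, scores


def train_linear_svm_1d(t, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss (1-D input)."""
    a, c = 0.0, 0.0  # slope and bias of the decision function a*t + c
    n = len(t)
    for _ in range(epochs):
        grad_a, grad_c = lam * a, 0.0
        for ti, yi in zip(t, y):
            if yi * (a * ti + c) < 1:  # point violates the margin
                grad_a -= yi * ti / n
                grad_c -= yi / n
        a -= lr * grad_a
        c -= lr * grad_c
    return a, c


X, y = make_toy_data()
r12 = pearson([row[0] for row in X], [row[1] for row in X])
w, scores = pls1_first_component(X, y)
a, c = train_linear_svm_1d(scores, y)
pred = [1 if a * t + c >= 0 else -1 for t in scores]
accuracy = sum(p == yi for p, yi in zip(pred, y)) / len(y)

print("corr(x1, x2):", round(r12, 4))
print("PLS weights:", [round(wj, 3) for wj in w])
print("training accuracy:", accuracy)
```

In this sketch the two collinear predictors are folded into one latent score before the margin-based classifier sees them, so the SVM never has to weight near-duplicate inputs separately. The reported accuracy is on the training data only; a real evaluation would need held-out samples, several PLS components, and a multi-class, possibly kernel, SVM.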

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Deep Learning*
Humans
Language*
Learning*
Least-Squares Analysis
Speech Acoustics
Speech Perception
Support Vector Machine*
Vocal Cords

Grants and funding

This work is funded by the Chinese Academy of Sciences.