Voice Disorder Identification by using Hilbert-Huang Transform (HHT) and K Nearest Neighbor (KNN)

Lili Chen; Chaoyu Wang; Junjiang Chen; Zejun Xiang; Xue Hu

doi:10.1016/j.jvoice.2020.03.009

J Voice. 2021 Nov;35(6):932.e1-932.e11. doi: 10.1016/j.jvoice.2020.03.009. Epub 2020 May 10.

Authors

Lili Chen¹, Chaoyu Wang¹, Junjiang Chen¹, Zejun Xiang², Xue Hu³

Affiliations

¹ School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing, China.
² Chongqing Survey Institute, Chongqing, China.
³ The Department of Blood Transfusion, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China. Electronic address: huxue@hospital.cqmu.edu.cn.

PMID: 32402664
DOI: 10.1016/j.jvoice.2020.03.009

Abstract

Objectives: Clinical evaluation of dysphonic voices involves a multidimensional approach, including a variety of instrumental and noninstrumental measures. Acoustic analyses provide objective, noninvasive, and intelligent measures of voice quality. Based on sound recordings, this paper proposes a new method for classifying voice disorders using HHT and KNN.

Methods: In this research, 12 features of each sample are calculated by HHT. Based on the Linear Prediction Coefficient (LPCC) algorithm, each sample is characterized by a further 9 features. With each sample thus expressed by 21 features, a classifier is constructed based on KNN. The KNN-based classifier was further compared with random forest and extra trees classifiers with respect to their performance in classifying voice disorders.
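The classification step described above can be sketched as follows. This is only an illustration, not the authors' implementation: the feature values are synthetic placeholders standing in for the 12 HHT and 9 linear prediction features, and the choice of k = 5, the Euclidean distance, and the helper names `make_sample` and `knn_predict` are assumptions that the abstract does not specify.

```python
import math
import random
from collections import Counter

N_HHT, N_LPC = 12, 9  # feature counts stated in the abstract (12 + 9 = 21)

def make_sample(label, rng):
    """Synthetic 21-dimensional vector; a placeholder for real HHT + LPC features."""
    center = 0.0 if label == 0 else 3.0  # assumed class separation, for illustration only
    return [rng.gauss(center, 1.0) for _ in range(N_HHT + N_LPC)]

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training samples closest to x (Euclidean distance)."""
    dists = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

rng = random.Random(0)
train_y = [i % 2 for i in range(100)]          # 0 = healthy, 1 = disordered (assumed labels)
train_X = [make_sample(y, rng) for y in train_y]
test_y = [i % 2 for i in range(40)]
test_X = [make_sample(y, rng) for y in test_y]

pred = [knn_predict(train_X, train_y, x) for x in test_X]
accuracy = sum(p == t for p, t in zip(pred, test_y)) / len(test_y)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

With real recordings, `make_sample` would be replaced by actual feature extraction, and the features would normally be scaled before computing distances, since KNN is sensitive to differences in feature magnitude.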

Results: The experimental results reveal that the KNN-based classifier outperformed the other two classifiers, with an accuracy of 93.3%, precision of 93%, recall of 95%, F1-score of 94%, and an area under the receiver operating characteristic curve of 0.976.
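For readers unfamiliar with the reported measures, the sketch below shows how accuracy, precision, recall, F1-score, and the area under the ROC curve are conventionally computed for a binary classifier. The labels and scores are made-up toy values, not data from the study, and the function names `classification_metrics` and `roc_auc` are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy values for illustration only.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.1]  # assumed classifier scores for the positive class

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise formulation of AUC is quadratic in the number of samples, which is acceptable for small evaluation sets; larger sets would usually use a rank-based computation instead.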

Conclusions: The method put forward in this paper can be effectively used to classify voice disorders.

Keywords: Hilbert-Huang transform; K nearest Neighbor; Linear Prediction Coefficient; Voice disorders.

MeSH terms

Algorithms
Cluster Analysis
Humans
ROC Curve
Sound Recordings*
Voice Disorders* / diagnosis