Parkinson disease classification using one against all based data sampling with the acoustic features from the speech signals

Med Hypotheses. 2020 Mar 16:140:109678. doi: 10.1016/j.mehy.2020.109678. Online ahead of print.

Abstract

Parkinson's disease (PD) is a long-term degenerative disorder of the central nervous system that primarily affects the motor system. It is difficult to diagnose and is common in the general population. In this paper, we propose a novel data sampling method for the classification of Parkinson's disease based on acoustic features extracted from speech signals. In the proposed method, the one against all (OGA) approach is used to divide the dataset into five equal parts. Applying OGA to the PD dataset, which has two classes (healthy and Parkinson's disease), yields the minority and majority classes. Five equal partitions are composed first for the healthy class (first case) and then for the PD class (second case). To classify all of these data partitions, we used three classifiers: weighted k-nearest neighbor (k-NN), logistic regression (LR), and a support vector machine (SVM) with a medium Gaussian kernel function. To evaluate the performance of the proposed hybrid models (the combinations of classifiers and OGA based data sampling), we used the classification accuracy, the confusion matrix, and the area under the receiver operating characteristic (ROC) curve (AUC). On their own, the LR, SVM with Gaussian kernel, and weighted k-NN classifiers achieved classification accuracies of 77.50%, 83.80%, and 82.10%, respectively, in the classification of PD with the acoustic features. Combined with OGA based data sampling, the same classifiers obtained 79.04%, 87.36%, and 88.48% in the first case and 84.30%, 88.76%, and 89.46% in the second case.
The results show that the proposed one against all (OGA) based data sampling could be used in combination with classifier algorithms as a data pre-processing method in the classification of Parkinson's disease with acoustic features.
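The partitioning step described above can be illustrated with a minimal sketch. The abstract does not state how each partition is combined with the other class, so the code below assumes one possible reading: the samples of one class are split into five near-equal parts, and each part is paired with all samples of the remaining class. The function name `oga_partitions`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import random


def oga_partitions(samples, labels, split_class, n_parts=5, seed=0):
    """Split the samples of `split_class` into `n_parts` near-equal parts and
    pair each part with ALL samples of the remaining class.

    Assumption: this pairing is one interpretation of the abstract, not the
    authors' confirmed procedure.
    """
    rng = random.Random(seed)
    split_idx = [i for i, y in enumerate(labels) if y == split_class]
    rest_idx = [i for i, y in enumerate(labels) if y != split_class]
    rng.shuffle(split_idx)

    # Strided slicing gives parts whose sizes differ by at most one sample.
    parts = [split_idx[k::n_parts] for k in range(n_parts)]

    subsets = []
    for part in parts:
        idx = sorted(part + rest_idx)
        subsets.append(([samples[i] for i in idx], [labels[i] for i in idx]))
    return subsets


# Toy data (hypothetical): 10 "healthy" (label 0) and 20 "PD" (label 1) samples,
# each with a single made-up feature value.
samples = [[float(i)] for i in range(30)]
labels = [0] * 10 + [1] * 20

first_case = oga_partitions(samples, labels, split_class=0)   # healthy class split
second_case = oga_partitions(samples, labels, split_class=1)  # PD class split
```

Under this reading, each of the five subsets in a case would be passed to a classifier (for example weighted k-NN, LR, or a Gaussian-kernel SVM) and evaluated separately. How the per-subset results are aggregated into the reported accuracies is not specified in the abstract.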

Keywords: Acoustic features; Hybrid models; One against all (OGA) based data sampling; Parkinson's disease (PD) classification.