Automatic and Early Detection of Parkinson's Disease by Analyzing Acoustic Signals Using Classification Algorithms Based on Recursive Feature Elimination Method

Diagnostics (Basel). 2023 May 31;13(11):1924. doi: 10.3390/diagnostics13111924.

Abstract

Parkinson's disease (PD) is a neurodegenerative condition caused by the dysfunction of brain cells and a 60-80% loss of their ability to produce dopamine, an organic chemical responsible for controlling a person's movement. This loss causes PD symptoms to appear. Diagnosis involves many physical and psychological tests and specialist examinations of the patient's nervous system, which raises several difficulties. One method for the early diagnosis of PD is based on analysing voice disorders. This method extracts a set of features from a recording of the person's voice. Machine-learning (ML) methods are then used to analyse the recorded voice and distinguish Parkinson's cases from healthy ones. This paper proposes novel techniques to improve the early diagnosis of PD by evaluating selected features and tuning the hyperparameters of ML algorithms that diagnose PD from voice disorders. The dataset was balanced by the synthetic minority oversampling technique (SMOTE), and features were ranked according to their contribution to the target characteristic by the recursive feature elimination (RFE) algorithm. We applied two algorithms, t-distributed stochastic neighbour embedding (t-SNE) and principal component analysis (PCA), to reduce the dimensions of the dataset. The features produced by t-SNE and by PCA were then fed into five classifiers: support-vector machine (SVM), K-nearest neighbours (KNN), decision tree (DT), random forest (RF), and multilayer perceptron (MLP). Experimental results showed that the proposed techniques outperformed existing studies: RF with the t-SNE algorithm yielded an accuracy of 97%, precision of 96.50%, recall of 94%, and F1-score of 95%, and MLP with the PCA algorithm yielded an accuracy of 98%, precision of 97.66%, recall of 96%, and F1-score of 96.66%.
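
The balancing step mentioned above can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. This is an illustration under stated assumptions, not the authors' implementation. The function names (`smote`, `balance`), the two-class setting, the Euclidean distance, the default `k=5`, and the toy data are all assumptions made for the example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority points."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class of a two-class dataset to equal counts."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new, y + [minority_label] * len(new)


# Hypothetical toy data: 3 samples of class 0 (minority), 6 of class 1.
X_toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
         [5.0, 5.0], [6.0, 5.0], [5.0, 6.0],
         [6.0, 6.0], [7.0, 7.0], [5.5, 5.5]]
y_toy = [0, 0, 0, 1, 1, 1, 1, 1, 1]
X_bal, y_bal = balance(X_toy, y_toy, minority_label=0, k=2)
```

Because the synthetic points are interpolated rather than duplicated, they stay inside the region already occupied by the minority class. In a study like this one, oversampling would normally be applied to the training split only, so that synthetic samples do not leak into the evaluation data; whether the paper did so is not stated in the abstract.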

Keywords: Parkinson’s disease; RFE; coefficient of variation; exploratory data analysis; machine learning; t-SNE.