Applications of Machine Learning to Diagnosis of Parkinson's Disease

Hong Lai; Xu-Ying Li; Fanxi Xu; Junge Zhu; Xian Li; Yang Song; Xianlin Wang; Zhanjun Wang; Chaodong Wang

doi:10.3390/brainsci13111546

Brain Sci. 2023 Nov 3;13(11):1546. doi: 10.3390/brainsci13111546.

Authors

Hong Lai¹ ², Xu-Ying Li¹, Fanxi Xu¹, Junge Zhu¹, Xian Li¹, Yang Song¹, Xianlin Wang¹, Zhanjun Wang¹, Chaodong Wang¹

Affiliations

¹ Department of Neurology, Xuanwu Hospital of Capital Medical University, National Clinical Research Center for Geriatric Diseases, Beijing 100053, China.
² Department of Neurology, The First Affiliated Hospital of Gannan Medical University, Ganzhou 341000, China.

Abstract

Background: Accurate diagnosis of Parkinson's disease (PD) is challenging due to its diverse manifestations. Machine learning (ML) algorithms can improve diagnostic precision, but their generalizability across medical centers in China is underexplored.

Objective: To assess the accuracy of an ML algorithm for PD diagnosis, trained and tested on data from different medical centers in China.

Methods: A total of 1656 participants were included: 1028 from Beijing (training set) and 628 from Fuzhou (external validation set). Models were trained using least absolute shrinkage and selection operator-logistic regression (LASSO-LR), decision tree (DT), random forest (RF), eXtreme gradient boosting (XGBoost), support vector machine (SVM), and k-nearest neighbor (KNN) techniques. Hyperparameters were optimized using grid search with five-fold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity (recall), specificity, precision, and F1 score. Variable importance was assessed for all models.
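The tuning and validation workflow described in the Methods can be sketched as follows. This is a minimal illustration, not the study's code. It assumes synthetic two-feature data, a hand-written KNN classifier (one of the six listed techniques, chosen here only for brevity), and accuracy as the selection criterion, which the abstract does not specify. All function names (`make_cohort`, `knn_predict`, `cv_accuracy`, `grid_search`) are hypothetical.

```python
import random
from statistics import mean

def make_cohort(n, shift, seed):
    """Synthetic cohort (assumption): label 1 is shifted away from label 0."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(shift * label, 1.0), rng.gauss(shift * label, 1.0)])
        y.append(label)
    return X, y

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0

def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points classified correctly by KNN fitted on the training points."""
    hits = sum(knn_predict(train_X, train_y, x, k) == t for x, t in zip(test_X, test_y))
    return hits / len(test_y)

def cv_accuracy(X, y, k, n_folds=5, seed=0):
    """Mean accuracy over n_folds held-out folds of the training cohort."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        tr = [i for i in idx if i not in held]
        scores.append(accuracy([X[i] for i in tr], [y[i] for i in tr],
                               [X[i] for i in held_out], [y[i] for i in held_out], k))
    return mean(scores)

def grid_search(X, y, k_grid):
    """Return (best_k, cv_scores); ties go to the smaller k."""
    scores = {k: cv_accuracy(X, y, k) for k in k_grid}
    best_k = max(sorted(scores), key=scores.get)
    return best_k, scores

# Tune on the "training center" only, then score once on the "external center".
train_X, train_y = make_cohort(200, shift=2.0, seed=1)
ext_X, ext_y = make_cohort(100, shift=2.0, seed=2)
best_k, cv_scores = grid_search(train_X, train_y, k_grid=[1, 3, 5, 7, 9])
external_accuracy = accuracy(train_X, train_y, ext_X, ext_y, best_k)
print(best_k, round(external_accuracy, 3))
```

The point of the design is that the external cohort never enters hyperparameter selection: cross-validation splits only the training data, and the second cohort is scored once with the chosen setting.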

Results: In the validation set, SVM best differentiated healthy controls (HCs) from PD patients (AUC: 0.928, 95% CI: 0.908-0.947; accuracy: 0.844, 95% CI: 0.814-0.871; sensitivity: 0.826, 95% CI: 0.786-0.866; specificity: 0.861, 95% CI: 0.820-0.898; precision: 0.849, 95% CI: 0.807-0.891; F1 score: 0.837, 95% CI: 0.803-0.868). Constipation, olfactory decline, and daytime somnolence significantly influenced the models' predictions.
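For readers less familiar with these metrics, the sketch below shows how they follow from a binary confusion matrix under the standard definitions. The counts in the example call are illustrative only and are not taken from the study; the function names are hypothetical. The last lines check that the reported F1 score is the harmonic mean of the reported precision and sensitivity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (positive class = PD)."""
    sensitivity = tp / (tp + fn)          # recall: PD patients correctly flagged
    specificity = tn / (tn + fp)          # healthy controls correctly cleared
    precision = tp / (tp + fp)            # flagged cases that are truly PD
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = f1_score(precision, sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative counts (assumption, not study data).
example = classification_metrics(tp=80, fp=10, tn=90, fn=20)

# Consistency check using the precision and sensitivity reported for SVM.
reported_f1 = round(f1_score(0.849, 0.826), 3)
print(example, reported_f1)
```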

Conclusion: We identified several pivotal variables and found SVM to be a precise and clinician-friendly ML algorithm for predicting PD in Chinese patients.

Keywords: Parkinson’s disease; diagnostic accuracy; external validation; machine learning; support vector machine.
