A high-generalizability machine learning framework for predicting the progression of Alzheimer's disease using limited data

NPJ Digit Med. 2022 Apr 12;5(1):43. doi: 10.1038/s41746-022-00577-x.

Abstract

Alzheimer's disease is a neurodegenerative disease that imposes a substantial financial burden on society. A number of machine learning studies have been conducted to predict the speed of its progression, which varies widely among different individuals, for recruiting fast progressors in future clinical trials. However, because the data in this field are very limited, two problems have yet to be solved: the first is that models built on limited data tend to induce overfitting and have low generalizability, and the second is that no cross-cohort evaluations have been done. Here, to suppress the overfitting caused by limited data, we propose a hybrid machine learning framework consisting of multiple convolutional neural networks that automatically extract image features from the point of view of brain segments, which are relevant to cognitive decline according to clinical findings, and a linear support vector classifier that uses extracted image features together with non-image information to make robust final predictions. The experimental results indicate that our model achieves superior performance (accuracy: 0.88, area under the curve [AUC]: 0.95) compared with other state-of-the-art methods. Moreover, our framework demonstrates high generalizability as a result of evaluations using a completely different cohort dataset (accuracy: 0.84, AUC: 0.91) collected from a different population than that used for training.