Automating Rey Complex Figure Test scoring using a deep learning-based approach: a potential large-scale screening tool for cognitive decline

Alzheimers Res Ther. 2023 Aug 30;15(1):145. doi: 10.1186/s13195-023-01283-w.

Abstract

Background: The Rey Complex Figure Test (RCFT) is widely used to evaluate neurocognitive function in various clinical groups across a broad range of ages. However, despite its usefulness, the scoring method is as complex as the figure itself, and this complexity can reduce agreement among raters. Although several attempts have been made to use the RCFT in clinical settings in a digitalized format, little attention has been given to developing direct automatic scoring that is comparable to that of experienced psychologists. Therefore, we aimed to develop an artificial intelligence (AI) scoring system for the RCFT using a deep learning (DL) algorithm and to confirm its validity.

Methods: A total of 6680 subjects were enrolled in the Gwangju Alzheimer's and Related Dementia cohort registry, Korea, from January 2015 to June 2021. We obtained 20,040 scanned images, three per subject (copy, immediate recall, and delayed recall), together with scores rated by 32 experienced psychologists. We trained the automated scoring system using the DenseNet architecture. To increase model performance, we improved the quality of the training data by re-examining some images with poor results (mean absolute error (MAE) ≥ 5 points) and re-trained our model. Finally, we conducted an external validation with 150 images scored by five experienced psychologists.
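
The re-examination step can be illustrated with a minimal sketch. It is not the authors' code: it assumes the error criterion is applied per image as the absolute difference between the psychologist's score and the model's prediction, and that the 5-point threshold is inclusive. The `ScoredImage` type, its field names, and `flag_for_review` are hypothetical names introduced only for this example.

```python
from typing import List, NamedTuple


class ScoredImage(NamedTuple):
    """Hypothetical record pairing a human rating with a model prediction."""
    image_id: str
    human_score: float  # score assigned by a psychologist
    model_score: float  # score predicted by the trained model


def flag_for_review(images: List[ScoredImage], threshold: float = 5.0) -> List[ScoredImage]:
    """Return images whose absolute error is at or above the threshold.

    Assumption: the threshold is inclusive and applied image by image.
    """
    return [
        img for img in images
        if abs(img.human_score - img.model_score) >= threshold
    ]


# Illustrative values only, not data from the study.
sample = [
    ScoredImage("a", 30.0, 29.5),  # error 0.5, kept as-is
    ScoredImage("b", 12.0, 18.0),  # error 6.0, flagged
    ScoredImage("c", 20.0, 15.0),  # error 5.0, flagged (inclusive threshold)
]
flagged = flag_for_review(sample)
flagged_ids = [img.image_id for img in flagged]
```

Under these assumptions, flagged images would be returned to a human rater for re-examination before the model is re-trained on the corrected labels.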

Results: In fivefold cross-validation, our first model obtained MAE = 1.24 points and R-squared (R²) = 0.977. After evaluating and updating the model, the performance of the final model improved (MAE = 0.95 points, R² = 0.986). Predicted scores differed significantly among the cognitively normal, mild cognitive impairment, and dementia groups. For the 150 independent test images, the MAE and R² between the AI scores and the average scores of the five human experts were 0.64 points and 0.994, respectively.
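
For readers unfamiliar with the two reported metrics, the following sketch shows how MAE and R² are commonly computed. It assumes R² is the coefficient of determination (one minus the ratio of residual to total sum of squares); the abstract does not state which definition was used. The function names and the three-value inputs are illustrative and are not taken from the study.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between reference and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative scores only, not data from the study.
human = [10.0, 20.0, 30.0]  # e.g. average of the human raters per image
model = [11.0, 19.0, 30.0]  # model prediction per image

mae = mean_absolute_error(human, model)
r2 = r_squared(human, model)
```

A lower MAE means the predicted scores sit closer, in points, to the human ratings, while an R² near 1 means the predictions account for nearly all of the variance in those ratings.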

Conclusion: We concluded that there was no fundamental difference between the rating scores of experienced psychologists and those of our AI scoring system. We expect that our AI psychologist will be able to contribute to screening for the early stages of Alzheimer's disease pathology in medical checkup centers or large-scale community-based research institutes in a faster and more cost-effective way.

Keywords: Alzheimer’s disease; Artificial intelligence; Convolutional neural network; Deep learning; Rey Complex Figure Test; Scoring.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Alzheimer Disease* / diagnostic imaging
  • Artificial Intelligence
  • Cognitive Dysfunction* / diagnostic imaging
  • Deep Learning*
  • Humans