The clinical feasibility of deep learning-based classification of amyloid PET images in visually equivocal cases

Eur J Nucl Med Mol Imaging. 2020 Feb;47(2):332-341. doi: 10.1007/s00259-019-04595-y. Epub 2019 Dec 6.

Abstract

Purpose: Although most deep learning (DL) studies have reported excellent classification accuracy, these studies usually target typical Alzheimer's disease (AD) and normal cognition (NC), for which conventional visual assessment already performs well. A more clinically relevant issue is selecting, among equivocal cases, the high-risk subjects who need active surveillance. We validated the clinical feasibility of DL compared with visual rating or quantitative measurement for assessing the diagnosis and prognosis of subjects with equivocal amyloid scans.

Methods: 18F-florbetaben scans of 430 cases (85 NC, 233 mild cognitive impairment, and 112 AD) were assessed with visual rating-based, quantification-based, and DL-based methods. The DL model was trained on 280 two-dimensional PET images (80%) and tested on the remaining 70 randomly assigned cases (20%) and on a clinical validation set of 54 equivocal cases. In the equivocal cases, we assessed the agreement among visual rating, quantification, and DL, and compared clinical outcomes according to the amyloid status assigned by each method.
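
As an illustration of the random 80%/20% assignment described in the Methods, the following is a minimal sketch only. The function name `split_cases`, the fixed seed, and the placeholder case IDs are assumptions made for this example; the abstract does not state how the randomization was implemented or whether it was stratified.

```python
import random

def split_cases(case_ids, test_fraction=0.2, seed=0):
    """Randomly partition case IDs into (train, test) lists.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    ids = list(case_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

# 350 placeholder IDs, chosen so the split sizes match the 280/70 figures above
train_ids, test_ids = split_cases(range(350))
print(len(train_ids), len(test_ids))  # 280 70
```

The 54 equivocal cases form a separate clinical validation set, so in this sketch they would simply be excluded from `case_ids` before splitting.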

Results: The visual reading was positive in 175 cases, equivocal in 54 cases, and negative in 201 cases. The composite SUVR cutoff value was 1.32 (AUC 0.99). The subject-level performance of DL on the test set was 100%. Among the 54 equivocal cases, 37 were classified as positive by DL (Eq(deep+)), 40 by a second-round visual assessment, and 40 by quantification. The DL- and quantification-based classifications showed good agreement (83%, κ = 0.59). The composite SUVRs differed between Eq(deep+) (1.47 [0.13]) and Eq(deep-) (1.29 [0.10]; P < 0.001). DL, but not the visual rating, showed a significant difference in the change in Mini-Mental State Examination score during follow-up between Eq(deep+) (-4.21 [0.57]) and Eq(deep-) (-1.74 [0.76]; P = 0.023) (mean duration, 1.76 years).
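
To make the agreement statistic concrete, the sketch below computes percent agreement and Cohen's κ from a 2×2 table of two binary classifications. The cell counts (34, 3, 6, 11) are not reported in the abstract. They are an assumption: a table inferred to be consistent with 54 equivocal cases, 37 DL-positive, 40 quantification-positive (reading the reported 40 as positive classifications), and 83% agreement. The function name `cohen_kappa_2x2` is likewise hypothetical.

```python
def cohen_kappa_2x2(both_pos, a_only, b_only, both_neg):
    """Return (observed agreement, Cohen's kappa) for two binary raters A and B."""
    n = both_pos + a_only + b_only + both_neg
    p_observed = (both_pos + both_neg) / n
    a_pos = both_pos + a_only  # cases rater A calls positive
    b_pos = both_pos + b_only  # cases rater B calls positive
    # Chance agreement: both positive by chance plus both negative by chance
    p_chance = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n**2
    return p_observed, (p_observed - p_chance) / (1 - p_chance)

# Assumed cell counts (A = DL, B = quantification); see the note above
agreement, kappa = cohen_kappa_2x2(both_pos=34, a_only=3, b_only=6, both_neg=11)
print(f"agreement={agreement:.0%}, kappa={kappa:.2f}")  # agreement=83%, kappa=0.59
```

Because κ subtracts the agreement expected by chance from the two methods' positive rates, it is noticeably lower than the raw 83% figure when both methods label most cases positive.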

Conclusions: In visually equivocal scans, DL results were more closely related to quantification than to visual assessment, and the negative cases selected by DL showed no decline in cognitive outcome. DL is useful for clinical diagnosis and prognosis assessment in subjects with visually equivocal amyloid scans.

Keywords: 18F-florbetaben PET; Alzheimer’s disease; Amyloid; Deep learning; Equivocal scan.

Publication types

  • Clinical Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Alzheimer Disease* / diagnostic imaging
  • Amyloid
  • Amyloid beta-Peptides
  • Aniline Compounds
  • Deep Learning*
  • Feasibility Studies
  • Humans
  • Positron-Emission Tomography

Substances

  • Amyloid
  • Amyloid beta-Peptides
  • Aniline Compounds