Self-Supervised Feature Learning and Phenotyping for Assessing Age-Related Macular Degeneration Using Retinal Fundus Images

Ophthalmol Retina. 2022 Feb;6(2):116-129. doi: 10.1016/j.oret.2021.06.010. Epub 2021 Jul 2.

Abstract

Objective: Diseases such as age-related macular degeneration (AMD) are classified based on human rubrics that are prone to bias. Supervised neural networks trained using human-generated labels require labor-intensive annotations and are restricted to specific trained tasks. Here, we trained a self-supervised deep learning network using unlabeled fundus images, enabling data-driven feature classification of AMD severity and discovery of ocular phenotypes.

Design: Development of a self-supervised training pipeline to evaluate fundus photographs from the Age-Related Eye Disease Study (AREDS).

Participants: One hundred thousand eight hundred forty-eight human-graded fundus images from 4757 AREDS participants between 55 and 80 years of age.

Methods: We trained a deep neural network with self-supervised Non-Parametric Instance Discrimination (NPID) using AREDS fundus images without labels then evaluated its performance in grading AMD severity using 2-step, 4-step, and 9-step classification schemes using a supervised classifier. We compared balanced and unbalanced accuracies of NPID against supervised-trained networks and ophthalmologists, explored network behavior using hierarchical learning of image subsets and spherical k-means clustering of feature vectors, then searched for ocular features that can be identified without labels.

Main outcome measures: Accuracy and kappa statistics.

Results: NPID demonstrated versatility across different AMD classification schemes without re-training and achieved balanced accuracies comparable with those of supervised-trained networks or human ophthalmologists in classifying advanced AMD (82% vs. 81-92% or 89%), referable AMD (87% vs. 90-92% or 96%), or on the 4-step AMD severity scale (65% vs. 63-75% or 67%), despite never directly using these labels during self-supervised feature learning. Drusen area drove network predictions on the 4-step scale, while depigmentation and geographic atrophy (GA) areas correlated with advanced AMD classes. Self-supervised learning revealed grader-mislabeled images and susceptibility of some classes within more granular AMD scales to misclassification by both ophthalmologists and neural networks. Importantly, self-supervised learning enabled data-driven discovery of AMD features such as GA and other ocular phenotypes of the choroid (e.g., tessellated or blonde fundi), vitreous (e.g., asteroid hyalosis), and lens (e.g., nuclear cataracts) that were not predefined by human labels.

Conclusions: Self-supervised learning enables AMD severity grading comparable with that of ophthalmologists and supervised networks, reveals biases of human-defined AMD classification systems, and allows unbiased, data-driven discovery of AMD and non-AMD ocular phenotypes.

Keywords: AMD; Age-related macular degeneration; Artificial intelligence; Deep learning; Machine learning.

Publication types

  • Multicenter Study
  • Randomized Controlled Trial
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Aged
  • Aged, 80 and over
  • Algorithms*
  • Deep Learning*
  • Female
  • Fluorescein Angiography / methods
  • Follow-Up Studies
  • Fundus Oculi
  • Humans
  • Macular Degeneration / diagnosis*
  • Male
  • Middle Aged
  • Neural Networks, Computer*
  • Prospective Studies
  • Reproducibility of Results
  • Retina / diagnostic imaging*
  • Severity of Illness Index
  • Tomography, Optical Coherence / methods