A Datasheet for the INSIGHT Birmingham, Solihull, and Black Country Diabetic Retinopathy Screening Dataset

Ophthalmol Sci. 2023 Feb 26;3(3):100293. doi: 10.1016/j.xops.2023.100293. eCollection 2023 Sep.

Abstract

Purpose: Diabetic retinopathy (DR) is the most common microvascular complication associated with diabetes mellitus (DM), affecting approximately 40% of this patient population. Early detection of DR is vital to ensure that disease progression is monitored and that sight-saving treatment is given promptly when required. This article describes the data contained within the INSIGHT Birmingham, Solihull, and Black Country Diabetic Retinopathy Dataset.

Design: Dataset descriptor for routinely collected eye screening data.

Participants: All diabetic patients aged 12 years and older attending annual digital retinal photography-based screening within the Birmingham, Solihull, and Black Country Eye Screening Programme.

Methods: The INSIGHT Health Data Research Hub for Eye Health is a National Health Service (NHS)-led ophthalmic bioresource that provides researchers with safe access to anonymized, routinely collected data from contributing NHS hospitals to advance research for patient benefit. This report describes the INSIGHT Birmingham, Solihull, and Black Country DR Screening Dataset, a dataset of anonymized images and linked screening data derived from the United Kingdom's largest regional DR screening program.

Main outcome measures: This dataset consists of routinely collected data from the eye screening program. The data primarily include retinal photographs with the associated DR grading data. Additional data, such as corresponding demographic details, information regarding patients' diabetic status, and visual acuity data, are also available. Further details of the available data points are provided in the supplementary information and on the INSIGHT webpage listed below.

Results: At the time point of this analysis (December 31, 2019), the dataset comprised 6 202 161 images from 246 180 patients, with a dataset inception date of January 1, 2007. The dataset includes 1 360 547 grading episodes between R0M0 and R3M1.
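The headline counts above imply some rough per-patient averages, which can be derived as in the following minimal sketch. The variable names are illustrative, and the ratios are crude means over the whole dataset. The abstract does not state whether a grading episode is counted per eye or per visit, so the ratios involving grading episodes should be read with that caveat.

```python
from datetime import date

# Counts and dates as reported in the Results paragraph.
N_IMAGES = 6_202_161
N_PATIENTS = 246_180
N_GRADING_EPISODES = 1_360_547
INCEPTION = date(2007, 1, 1)
ANALYSIS_DATE = date(2019, 12, 31)

# Crude dataset-wide averages (assumption: simple ratios of the reported
# totals; these are not figures stated by the authors).
images_per_patient = N_IMAGES / N_PATIENTS
episodes_per_patient = N_GRADING_EPISODES / N_PATIENTS
images_per_episode = N_IMAGES / N_GRADING_EPISODES

# Approximate collection window in years, using a 365.25-day year.
span_years = (ANALYSIS_DATE - INCEPTION).days / 365.25

if __name__ == "__main__":
    print(f"Images per patient:            {images_per_patient:.2f}")
    print(f"Grading episodes per patient:  {episodes_per_patient:.2f}")
    print(f"Images per grading episode:    {images_per_episode:.2f}")
    print(f"Collection window (years):     {span_years:.1f}")
```

These averages describe the dataset's scale only. They say nothing about how images or grades are distributed across individual patients.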

Conclusions: This dataset descriptor article summarizes the content of the dataset, how it has been curated, and what its potential uses are. Data are available through a structured application process for research studies that support discovery, clinical evidence analyses, and innovation in artificial intelligence technologies for patient benefit. Further information regarding the data repository and contact details can be found at https://www.insight.hdrhub.org/.

Financial disclosures: Proprietary or commercial disclosure may be found after the references.

Keywords: Biomedical data; Dataset; Diabetes mellitus; Diabetic retinopathy; Imaging.