Early identification of patients at risk for iron-deficiency anemia using deep learning techniques

Am J Clin Pathol. 2024 Apr 20:aqae031. doi: 10.1093/ajcp/aqae031. Online ahead of print.

Abstract

Objectives: Iron-deficiency anemia (IDA) is a common health problem worldwide, and up to 10% of adult patients with incidental IDA may have gastrointestinal cancer. A diagnosis of IDA can be established through a combination of laboratory tests, but it is often underrecognized until a patient becomes symptomatic. Based on advances in machine learning, we hypothesized that we could reduce the time to diagnosis by developing an IDA prediction model. Our goal was to develop 3 neural networks by using retrospective longitudinal outpatient laboratory data to predict the risk of IDA 3 to 6 months before traditional diagnosis.

Methods: We analyzed retrospective outpatient electronic health record data between 2009 and 2020 from an academic medical center in northern Texas. We included laboratory features from 30,603 patients to develop 3 types of neural networks: artificial neural networks, long short-term memory cells, and gated recurrent units. The classifiers were trained using the Adam optimizer across 200 random training-validation splits. We calculated accuracy, area under the receiver operating characteristic curve, sensitivity, and specificity on the testing split.
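
As an illustration of the four evaluation metrics named in the Methods, the following is a minimal sketch in plain Python. It is not the authors' code: the function names (`confusion_counts`, `auroc`, `evaluate`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example. The area under the curve is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos: List[float] = [s for t, s in zip(y_true, scores) if t == 1]
    neg: List[float] = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and AUROC on a test split."""
    auc = auroc(y_true, scores)  # also validates that both classes are present
    # The threshold is an assumed value for this sketch, not one taken from the study.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "auroc": auc,
    }


if __name__ == "__main__":
    # Toy data: 1 = IDA, 0 = no IDA; scores stand in for model output probabilities.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
    for name, value in evaluate(labels, probs).items():
        print(f"{name}: {value:.3f}")
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.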

Results: Although all models demonstrated comparable performance, the gated recurrent unit model outperformed the other 2, achieving an accuracy of 0.83, an area under the receiver operating characteristic curve of 0.89, a sensitivity of 0.75, and a specificity of 0.85 across 200 epochs.

Conclusions: Our results showcase the feasibility of employing deep learning techniques for early prediction of IDA in the outpatient setting based on sequences of laboratory data, offering a substantial lead time for clinical intervention.

Keywords: artificial neural networks; deep learning; diagnosis; early diagnosis; gated recurrent unit; iron-deficiency anemia; long short-term memory; machine learning.