Large-scale deep learning analysis to identify adult patients at risk for combined and common variable immunodeficiencies

Commun Med (Lond). 2023 Dec 20;3(1):189. doi: 10.1038/s43856-023-00412-8.

Abstract

Background: Primary immunodeficiency (PI) is a group of heterogeneous disorders resulting from immune system defects. Over 70% of PI cases are undiagnosed, leading to increased mortality, co-morbidity and healthcare costs. Among PI disorders, combined immunodeficiencies (CID) are characterized by complex immune defects. Common variable immunodeficiency (CVID) is among the most common types of PI. Because treatments are available, it is critical to identify adult patients at risk for CID and CVID before serious morbidity and mortality develop.

Methods: We developed a deep learning-based method (named "TabMLPNet") to analyze clinical history from nationally representative medical claims from electronic health records (Optum® data, covering the entire US), and evaluated it in the setting of identifying CID/CVID in adults. Further, we revealed the most important CID/CVID-associated antecedent phenotype combinations. Four large cohorts were generated: a total of 47,660 PI cases and (1:1 matched) controls.

Results: Across cohorts, the sensitivity of TabMLPNet modeling ranges from 0.82 to 0.88 and the specificity from 0.82 to 0.85. Distinctive combinations of antecedent phenotypes associated with CID/CVID are identified, consisting of respiratory infections/conditions, genetic anomalies, cardiac defects, autoimmune diseases, blood disorders and malignancies, which may help systematize the identification of CID and CVID.
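
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from binary predictions. It is not the TabMLPNet evaluation code; the function name `sensitivity_specificity` and the toy labels are hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Convention assumed here: 1 = case (e.g. CID/CVID), 0 = matched control.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases flagged as cases
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls flagged as cases

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")

    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of controls correctly excluded
    return sensitivity, specificity


# Toy data (illustrative only, not from the study): 4 cases and 4 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In a 1:1 matched design such as the one described in Methods, cases and controls are equally represented, so the two metrics contribute equally to overall accuracy.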

Conclusions: We demonstrated a method that accurately detects CID and CVID, evaluated on large-scale medical claims data. Our predictive scheme can potentially lead to new clinical insights and expanded guidelines for identifying adult patients at risk for CID and CVID, and could be used to improve patient outcomes at the population level.

Plain language summary

Primary immunodeficiencies (PI) are disorders that weaken the immune system, increasing the incidence of life-threatening infections, organ damage and the development of cancer and autoimmune diseases. Although PI is estimated to affect 1-2% of the global population, 70-90% of these patients remain undiagnosed. Many patients are diagnosed during adulthood, after other serious diseases have already developed. We developed a computational method to analyze the clinical history from a large group of people with and without PI. We focused on combined (CID) and common variable immunodeficiency (CVID), which are among the least studied and most common PI subtypes, respectively. We could identify people with CID or CVID, as well as combinations of diseases and symptoms that could make these conditions easier to recognize. Our method could be used to more readily identify adults at risk of CID or CVID, enabling treatment to start earlier and their long-term health to be improved.