Understanding Severe Asthma Through Small and Big Data in Spanish Hospitals: The PAGE Study

J Investig Allergol Clin Immunol. 2023 Oct 16;33(5):373-382. doi: 10.18176/jiaci.0848. Epub 2023 Aug 24.

Abstract

Background: Data on the prevalence of severe asthma (SA) are limited. Electronic health records (EHRs) offer a unique research opportunity to test machine learning (ML) tools in epidemiological studies. Our primary objective was to estimate the prevalence of SA among asthma patients seen in hospital asthma units, using both ML-based and traditional research methodologies. Our secondary objective was to describe patients with nonsevere asthma (NSA) and SA over a follow-up of 12 months.

Methods: PAGE is a multicenter, controlled, observational study conducted in 36 Spanish hospitals and split into 2 phases: a cross-sectional phase for estimation of the prevalence of SA and a prospective phase (3 visits in 12 months) for the follow-up and characterization of SA and NSA patients. A substudy with ML was performed in 6 hospitals. Our ML tool uses EHRead technology, which extracts clinical concepts from EHRs and standardizes them to SNOMED CT.

Results: The prevalence of SA among asthma patients in Spanish hospitals was 20.1% by the traditional methodology, compared with 9.7% using the ML tool. The proportion of SA phenotypes and the features of patients followed up were consistent with previous studies. Clinical predictions of patients' course were unreliable, and the ML analysis found only 2 predictive models with discriminatory power for patient outcomes.

Conclusion: This study is the first to estimate the prevalence of SA among asthma patients seen in hospitals and to predict patient outcomes using both standard and ML-based research techniques. Our findings offer relevant insights for further epidemiological and clinical research in SA.

Keywords: Big data; Machine learning; Natural language processing; Predictive models; Prevalence; Severe asthma.