Evidence of housing instability identified by addresses, clinical notes, and diagnostic codes in a real-world population with substance use disorders

J Clin Transl Sci. 2023 Sep 4;7(1):e196. doi: 10.1017/cts.2023.626. eCollection 2023.

Abstract

Introduction: Housing instability is a social determinant of health associated with multiple negative health outcomes including substance use disorders (SUDs). Real-world evidence of housing instability is needed to improve translational research on populations with SUDs.

Methods: We identified evidence of housing instability by leveraging structured diagnosis codes and unstructured clinical data from electronic health records of 20,556 patients from 2017 to 2021. We applied natural language processing with named-entity recognition and pattern matching to unstructured clinical notes with free-text documentation. Additionally, we analyzed semi-structured addresses containing explicit or implicit housing-related labels. We assessed agreement on identification methods by having three experts review 300 records.
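
As a rough illustration of how the three evidence sources could be combined per patient, the sketch below flags diagnosis codes, note text, and address text with simple rules. It is not the study's pipeline: the regular expressions, the function name `evidence_sources`, and the choice of the ICD-10-CM Z59.0 (homelessness) code prefix are all assumptions made for the example, and the named-entity recognition step is omitted entirely.

```python
import re

# Hypothetical phrase lists; the abstract does not give the study's lexicon.
NOTE_PATTERNS = [
    r"\bhomeless(?:ness)?\b",
    r"\bunhoused\b",
    r"\bliving in (?:a |his |her |their )?(?:car|shelter|tent)\b",
    r"\bcouch[- ]surfing\b",
]
ADDRESS_PATTERNS = [
    r"\bhomeless\b",          # explicit label typed into the address field
    r"\bshelter\b",           # implicit label: a temporary housing facility
    r"\bgeneral delivery\b",  # implicit label: no fixed mailing address
]

NOTE_RE = re.compile("|".join(NOTE_PATTERNS), re.IGNORECASE)
ADDRESS_RE = re.compile("|".join(ADDRESS_PATTERNS), re.IGNORECASE)

# Assumption: only the ICD-10-CM Z59.0 family (homelessness) is treated as
# structured evidence; a real study would define its own code set.
CODE_PREFIX = "Z59.0"


def evidence_sources(codes, note, address):
    """Return the set of sources that show evidence of housing instability."""
    sources = set()
    if any(code.upper().startswith(CODE_PREFIX) for code in codes):
        sources.add("diagnosis_code")
    # Plain pattern matching ignores negation ("denies homelessness"),
    # which a production system would need to handle.
    if NOTE_RE.search(note):
        sources.add("clinical_note")
    if ADDRESS_RE.search(address):
        sources.add("address")
    return sources
```

Returning a set of sources rather than a single flag makes it straightforward to count patients identified by codes, by addresses only, by notes only, or by both unstructured sources.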

Results: Diagnostic codes identified only 58.5% of the population identifiable as having housing instability; the remaining 41.5% were identifiable only from addresses (7.1%), clinical notes (30.4%), or both (4.0%). Reviewers unanimously agreed on 79.7% of cases reviewed; a Fleiss' kappa of 0.35 suggested fair agreement yet emphasized the difficulty of analyzing patients with ambiguous housing situations. Among those with poisoning episodes related to stimulants or opioids, diagnosis codes identified only 63.9% of those with housing instability.
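
For readers unfamiliar with the agreement statistic, the following is a minimal sketch of the standard Fleiss' kappa calculation for a fixed number of raters per record. The function name and the toy ratings are illustrative only and are not the study's data. Because kappa corrects for chance agreement, it can be modest even when raw unanimous agreement is high, particularly when one category dominates.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned record i to category j (same number of raters per record)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement among rater pairs for each record.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


# Toy example: 4 records, 3 raters, categories = [unstable, not unstable].
toy = [[3, 0], [3, 0], [0, 3], [2, 1]]
kappa = fleiss_kappa(toy)
```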

Conclusions: All three data sources yield valid evidence of housing instability; each has its own practical uses and limitations. Translational researchers requiring comprehensive real-world evidence of housing instability should optimize and implement the use of both structured and unstructured data. Understanding the role of housing instability and temporary housing facilities is salient in populations with SUDs.

Keywords: Social determinants of health; geocoding; housing instability; natural language processing.