Assessing longitudinal housing status using Electronic Health Record data: a comparison of natural language processing, structured data, and patient-reported history

Front Artif Intell. 2023 May 24:6:1187501. doi: 10.3389/frai.2023.1187501. eCollection 2023.

Abstract

Introduction: Measuring long-term housing outcomes is important for evaluating the impacts of services for individuals with homeless experience. However, assessing long-term housing status using traditional methods is challenging. The Veterans Affairs (VA) Electronic Health Record (EHR) provides detailed data for a large population of patients with homeless experience and contains several indicators of housing instability, including structured data elements (e.g., diagnosis codes) and free-text clinical narratives. The validity of each of these data elements for measuring housing stability over time, however, is not well studied.

Methods: We compared VA EHR indicators of housing instability, including information extracted from clinical notes using natural language processing (NLP), with patient-reported housing outcomes in a cohort of homeless-experienced Veterans.

Results: NLP achieved higher sensitivity and specificity than standard diagnosis codes for detecting episodes of unstable housing. Other structured data elements in the VA EHR showed promising performance, particularly when combined with NLP.
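
As an illustration of the comparison reported above, the following minimal sketch computes sensitivity and specificity for binary EHR indicators against a patient-reported reference standard. Everything in it is an assumption made for illustration: the function name, the toy data (invented, not study results), and the "either source flags instability" rule used to combine NLP with a structured indicator, since the abstract does not specify how the sources were combined.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) of a binary indicator.

    predicted: 0/1 flags from an EHR indicator (1 = unstable housing).
    reference: 0/1 flags from the reference standard (here, patient report).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # flagged, truly unstable
    fn = sum(1 for p, r in pairs if not p and r)      # missed unstable episode
    tn = sum(1 for p, r in pairs if not p and not r)  # correctly not flagged
    fp = sum(1 for p, r in pairs if p and not r)      # flagged, actually stable
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both positive and negative cases")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: one entry per patient-period (NOT study data).
patient_report = [1, 1, 1, 0, 0, 0, 1, 0]  # reference standard
diagnosis_code = [1, 0, 0, 0, 0, 1, 0, 0]  # structured indicator
nlp_flag       = [1, 1, 0, 0, 0, 0, 1, 0]  # extracted from clinical notes

# Assumed combination rule: flag instability if either source flags it.
combined = [int(a or b) for a, b in zip(diagnosis_code, nlp_flag)]

results = {
    "diagnosis_code": sensitivity_specificity(diagnosis_code, patient_report),
    "nlp": sensitivity_specificity(nlp_flag, patient_report),
    "combined": sensitivity_specificity(combined, patient_report),
}

for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

An either-source rule can only raise or preserve sensitivity relative to each input, and can only lower or preserve specificity, so whether a combination improves overall performance depends on how the two sources err.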

Discussion: Evaluation efforts and research studies assessing longitudinal housing outcomes should incorporate multiple sources of documentation to achieve optimal performance.

Keywords: electronic health records; homelessness; natural language processing; social determinants of health; veterans affairs.

Grants and funding

This study was supported by QUERI-VISN (Quality Enhancement Research Initiative-Veterans Integrated Services Networks) Partnered Implementation Initiative (PII) 21-285 (Multiple Principal Investigators: Gabrielian, Cordasco, Finley).