Machine learning natural language processing for identifying venous thromboembolism: Systematic review and meta-analysis

Barbara D Lam; Pavlina Chrysafi; Thita Chiasakul; Harshit Khosla; Dimitra Karagkouni; Megan McNichol; Alys Adamski; Nimia Reyes; Karon Abe; Simon Mantha; Ioannis S Vlachos; Jeffrey I Zwicker; Rushad Patell

doi:10.1182/bloodadvances.2023012200

Blood Adv. 2024 Mar 24:bloodadvances.2023012200. doi: 10.1182/bloodadvances.2023012200. Online ahead of print.

Affiliations

¹ Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States.
² Mount Auburn Hospital, Cambridge, Massachusetts, United States.
³ Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red C, Bangkok, Thailand.
⁴ Saint Vincent Hospital, Worcester, Massachusetts, United States.
⁵ Centers for Disease Control and Prevention, Atlanta, Georgia, United States.
⁶ CDC, Atlanta, Georgia, United States.
⁷ Memorial Sloan Kettering Cancer Center, New York, New York, United States.
⁸ Beth Israel Deaconess Medical Center / Harvard Medical School, Boston, Massachusetts, United States.

PMID: 38522096
DOI: 10.1182/bloodadvances.2023012200

Abstract

Venous thromboembolism (VTE) is a leading cause of preventable in-hospital mortality. Monitoring VTE cases is limited by the challenges of manual chart review and diagnosis code interpretation. Natural language processing (NLP) can automate the process. Rule-based NLP methods are effective but time-consuming to develop. Machine learning (ML)-NLP methods present a promising solution. We conducted a systematic review and meta-analysis of studies published before May 2023 that used ML-NLP to identify VTE diagnoses in electronic health records. Four reviewers screened all manuscripts, excluding studies that used only a rule-based method. The meta-analysis pooled the performance of each study's best-performing model for identifying pulmonary embolism (PE) and/or deep vein thrombosis (DVT). Pooled sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) with 95% confidence intervals (CIs) were calculated using a random-effects model with the DerSimonian and Laird method. Study quality was assessed using an adapted TRIPOD tool. Thirteen studies were included in the systematic review, and 8 had data available for meta-analysis. Pooled sensitivity was 0.931 (95% CI 0.881-0.962), specificity 0.984 (95% CI 0.967-0.992), PPV 0.910 (95% CI 0.865-0.941), and NPV 0.985 (95% CI 0.977-0.990). All studies met at least 13 of the 21 NLP-modified TRIPOD items, demonstrating fair quality. The highest-performing models used vectorization rather than bag-of-words, and deep learning techniques such as convolutional neural networks. There was significant heterogeneity among the studies, and only 4 validated their model on an external dataset. Further standardization of ML studies can help progress this novel technology towards real-world implementation.
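
For readers unfamiliar with the pooling step named in the abstract, the following is a minimal sketch of DerSimonian and Laird random-effects pooling applied to a proportion such as sensitivity. It is not the authors' code. The abstract does not state which transformation or continuity correction was used, so the logit transformation and the 0.5 correction here are assumptions, as are the function name `dersimonian_laird_logit` and the example counts, which are invented for illustration and are not data from the included studies.

```python
import math


def dersimonian_laird_logit(events, totals, z=1.96):
    """Pool per-study proportions with a DerSimonian-Laird random-effects model.

    events[i] / totals[i] is one study's proportion, e.g. true positives over
    all reference-standard positives for sensitivity.
    Assumptions (not from the source): logit scale, 0.5 continuity correction.
    """
    # Per-study logit estimates and their within-study variances.
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, n - e + 0.5  # continuity-corrected successes / failures
        y.append(math.log(a / b))
        v.append(1.0 / a + 1.0 / b)

    # Fixed-effect (inverse-variance) estimate and Cochran's Q.
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird moment estimator of the between-study variance tau^2,
    # truncated at zero.
    k = len(y)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and its standard error.
    w_re = [1.0 / (vi + tau2) for vi in v]
    sw_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sw_re
    se = math.sqrt(1.0 / sw_re)

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Back-transform the pooled estimate and CI bounds to the proportion scale.
    return {
        "pooled": inv_logit(mu),
        "ci_low": inv_logit(mu - z * se),
        "ci_high": inv_logit(mu + z * se),
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical true-positive counts and reference-positive totals.
    tp = [45, 120, 88, 30]
    positives = [50, 125, 100, 31]
    print(dersimonian_laird_logit(tp, positives))
```

The same function could be reused for specificity, PPV, and NPV by passing the corresponding numerators and denominators. A larger `tau2` indicates more between-study heterogeneity, which widens the pooled confidence interval relative to a fixed-effect analysis.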