DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx

Saeed Mehrabi; Anand Krishnan; Sunghwan Sohn; Alexandra M Roch; Heidi Schmidt; Joe Kesterson; Chris Beesley; Paul Dexter; C Max Schmidt; Hongfang Liu; Mathew Palakal

doi:10.1016/j.jbi.2015.02.010

J Biomed Inform. 2015 Apr:54:213-9. doi: 10.1016/j.jbi.2015.02.010. Epub 2015 Mar 16.

Authors

Saeed Mehrabi¹, Anand Krishnan², Sunghwan Sohn³, Alexandra M Roch⁴, Heidi Schmidt⁴, Joe Kesterson⁵, Chris Beesley⁵, Paul Dexter⁵, C Max Schmidt⁴, Hongfang Liu⁶, Mathew Palakal⁷

Affiliations

¹ School of Informatics and Computing, Indiana University, Indianapolis, IN, USA; Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA.
² School of Informatics and Computing, Indiana University, Indianapolis, IN, USA.
³ Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA.
⁴ Department of Surgery, Indiana University, Indianapolis, IN, USA.
⁵ Regenstrief Institute, Indianapolis, IN, USA.
⁶ Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA. Electronic address: liu.hongfang@mayo.edu.
⁷ School of Informatics and Computing, Indiana University, Indianapolis, IN, USA. Electronic address: mpalakal@iupui.edu.

Abstract

In Electronic Health Records (EHRs), much valuable information about patients' conditions is embedded in free text. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge in clinical NLP is that the meaning of clinical entities is heavily affected by modifiers such as negation. NegEx, a negation detection algorithm, applies a simplistic approach that has been shown to be powerful in clinical NLP. However, because it does not consider the contextual relationships between words within a sentence, NegEx fails to capture the negation status of concepts in complex sentences. Incorrect negation assignment could lead to inaccurate diagnosis of patients' conditions or to contaminated study cohorts. We developed a negation algorithm called DEEPEN to decrease NegEx's false positives by taking into account the dependency relationships between negation words and concepts within a sentence, using the Stanford dependency parser. The system was developed and tested using EHR data from Indiana University (IU), and it was further evaluated on a Mayo Clinic dataset to assess its generalizability. The evaluation results demonstrate that DEEPEN, which incorporates dependency parsing into NegEx, can reduce the number of incorrect negation assignments for patients with positive findings and therefore improve the identification of patients with the target clinical findings in EHRs.
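The idea of filtering trigger-based negation with dependency structure can be illustrated with a minimal Python sketch. It is not the DEEPEN rule set: the trigger list, the "subtree of the trigger's governor" scope rule, the function names, and the hand-written dependency heads are all assumptions made for illustration, and no parser is called. The baseline marks a concept as negated whenever a trigger precedes it; the dependency check keeps that decision only if the concept lies in the subtree governed by the word the trigger attaches to.

```python
# Illustrative subset of negation triggers (assumption, not the NegEx lexicon).
NEGATION_TRIGGERS = {"no", "not", "without"}


def baseline_trigger(tokens, concept_idx):
    """Simplified NegEx-style baseline: return the index of the nearest
    trigger that precedes the concept in the sentence, or None."""
    for i in range(concept_idx - 1, -1, -1):
        if tokens[i].lower() in NEGATION_TRIGGERS:
            return i
    return None


def subtree(heads, root):
    """Return all token indices dominated by `root` (inclusive).
    heads[i] is the index of token i's head, or -1 for the sentence root."""
    members = {root}
    changed = True
    while changed:
        changed = False
        for child, head in enumerate(heads):
            if head in members and child not in members:
                members.add(child)
                changed = True
    return members


def is_negated(tokens, heads, concept_idx):
    """Keep the baseline's negation only if the concept falls inside the
    subtree of the trigger's governor (hypothetical scope rule)."""
    trigger = baseline_trigger(tokens, concept_idx)
    if trigger is None:
        return False
    governor = heads[trigger] if heads[trigger] >= 0 else trigger
    return concept_idx in subtree(heads, governor)


# Hand-written dependency heads for two invented sentences (not parser output).
simple_tokens = ["Patient", "has", "no", "cyst"]
simple_heads = [1, -1, 3, 1]          # "no" attaches to "cyst"

complex_tokens = ["Patient", "who", "did", "not", "smoke", "has", "a", "cyst"]
complex_heads = [5, 4, 4, 4, 0, -1, 7, 5]  # "not" attaches to "smoke"

print(is_negated(simple_tokens, simple_heads, 3))
print(is_negated(complex_tokens, complex_heads, 7))
```

In the second sentence the baseline finds "not" before "cyst" and would report a negated finding, while the dependency check rejects it because "cyst" is outside the relative clause governed by "smoke". This is the kind of false positive the abstract describes, shown here under the stated simplifications only.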

Keywords: Dependency parser; Natural language processing; Negation.

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Algorithms*
Electronic Health Records*
Humans
Natural Language Processing*
