Natural Language Processing for Asthma Ascertainment in Different Practice Settings

J Allergy Clin Immunol Pract. 2018 Jan-Feb;6(1):126-131. doi: 10.1016/j.jaip.2017.04.041. Epub 2017 Jun 19.

Abstract

Background: We developed and validated NLP-PAC, a natural language processing (NLP) algorithm based on predetermined asthma criteria (PAC) for asthma ascertainment using electronic health records at Mayo Clinic.

Objective: To adapt NLP-PAC to a different health care setting, Sanford Children's Hospital, and to assess its external validity there.

Methods: This retrospective cohort study used a random sample of the 2011-2012 Sanford birth cohort (n = 595). Manual chart review was performed on the cohort for asthma ascertainment on the basis of the PAC. We then used half of the cohort as a training cohort (n = 298) and the other half as a blind test cohort to evaluate the adapted NLP-PAC algorithm. We also tested the association of known asthma-related risk factors with asthma status as ascertained by the Sanford-NLP algorithm.
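The training/test division described in the Methods can be illustrated with a minimal sketch. The abstract does not say how subjects were assigned to each half, so the random shuffle, the seed, the placeholder subject IDs, and the function name `split_cohort` are all assumptions made for illustration, not the authors' procedure.

```python
import random


def split_cohort(subject_ids, seed=0):
    """Split a cohort into a training half and a held-out test half.

    Hypothetical helper: random assignment and the seed are assumptions;
    the abstract only reports the resulting group sizes.
    """
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)   # reproducible shuffle
    n_train = (len(ids) + 1) // 2      # the larger half goes to training
    return ids[:n_train], ids[n_train:]


# Placeholder IDs 1..595 stand in for the 595 sampled subjects.
train_ids, test_ids = split_cohort(range(1, 596))
print(len(train_ids), len(test_ids))
```

With 595 subjects this yields 298 training and 297 test subjects, matching the group sizes reported in the abstract.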

Results: Among the eligible test cohort (n = 297), 160 (53%) were male, 268 (90%) were white, and the median age was 2.3 years (range, 1.5-3.1 years). The adapted NLP-PAC and the human abstractor identified 74 (25%) and 72 (24%) subjects with asthma, respectively, with 66 subjects identified by both approaches. Sensitivity, specificity, positive predictive value, and negative predictive value for the NLP algorithm in predicting asthma status were 92%, 96%, 89%, and 97%, respectively. The known risk factors for asthma identified by NLP (eg, smoking history) were similar to the ones identified by manual chart review.
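The reported accuracy figures can be reconstructed from the counts given in the Results. The sketch below assumes that all 297 test subjects were classified by both methods and that manual chart review is the reference standard; the function name `diagnostic_metrics` and its parameter names are illustrative, not taken from the study.

```python
def diagnostic_metrics(n_total, n_index, n_reference, n_both):
    """Derive 2x2 accuracy measures from marginal counts and the overlap.

    n_index:     subjects flagged by the index test (here, NLP-PAC)
    n_reference: subjects flagged by the reference standard (chart review)
    n_both:      subjects flagged by both
    """
    tp = n_both                     # true positives
    fp = n_index - n_both           # flagged by NLP only
    fn = n_reference - n_both       # flagged by chart review only
    tn = n_total - tp - fp - fn     # flagged by neither
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts from the Results: 297 subjects, 74 by NLP, 72 by abstractor, 66 by both.
metrics = diagnostic_metrics(n_total=297, n_index=74, n_reference=72, n_both=66)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under these assumptions the counts give 66 true positives, 8 false positives, 6 false negatives, and 217 true negatives, which round to the reported 92%, 96%, 89%, and 97%.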

Conclusions: Successful implementation of NLP-PAC for asthma ascertainment in 2 different practice settings demonstrates the feasibility of automated asthma ascertainment leveraging electronic health record data with a potential to enable large-scale, multisite asthma studies to improve asthma care and research.

Keywords: Algorithm adaptability; Asthma ascertainment; Electronic health records; Epidemiology; Informatics; Natural language processing; Retrospective study; Validation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Asthma / diagnosis*
  • Child, Preschool
  • Cigarette Smoking / adverse effects
  • Cohort Studies
  • Delivery of Health Care*
  • Electronic Health Records
  • Feasibility Studies
  • Female
  • Humans
  • Infant
  • Male
  • Natural Language Processing*
  • Predictive Value of Tests
  • Prognosis
  • Retrospective Studies
  • Risk Factors
  • Sensitivity and Specificity