Knowledge-Driven Drug-Use Named Entity Recognition with Distant Supervision

Stud Health Technol Inform. 2022 Jun 6:290:140-144. doi: 10.3233/SHTI220048.

Abstract

Although Named Entity Recognition (NER) is essential for identifying critical elements of unstructured content, generic NER tools remain limited in recognizing domain-specific entities, such as those in drug use and public health. For such high-impact areas, accurately capturing relevant entities at a more granular level is critical, as this information influences real-world processes. On the other hand, training NER models for a specific domain without handcrafted features requires an extensive amount of labeled data, which is expensive in human effort and time. In this study, we employ distant supervision with a domain-specific ontology to reduce the need for human labor, and we train models that incorporate domain-specific (e.g., drug use) external knowledge to recognize domain-specific entities. We capture entities related to drug use and their trends in government epidemiology reports, with an improvement of 8% in F1-score.
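
The distant supervision idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the ontology, the label set, or the matching strategy, so the lexicon entries, the tags DRUG and ROUTE, and the function names below are all hypothetical. The sketch assumes simple longest-match dictionary lookup that turns ontology terms into weak BIO labels, which could then serve as training data for an NER model in place of manual annotation.

```python
import re

# Hypothetical mini-lexicon standing in for a domain ontology.
# A real ontology would supply many more terms and entity types.
LEXICON = {
    "heroin": "DRUG",
    "fentanyl": "DRUG",
    "crack cocaine": "DRUG",
    "cocaine": "DRUG",
    "injection": "ROUTE",
}


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def distant_labels(tokens, lexicon):
    """Assign weak BIO labels by matching lexicon terms against the tokens.

    Longer terms are matched first, so "crack cocaine" is labeled as one
    entity instead of being split by the shorter term "cocaine".
    """
    labels = ["O"] * len(tokens)
    lowered = [tok.lower() for tok in tokens]
    entries = sorted(
        ((term.lower().split(), tag) for term, tag in lexicon.items()),
        key=lambda entry: -len(entry[0]),
    )
    for words, tag in entries:
        n = len(words)
        for i in range(len(tokens) - n + 1):
            span_is_free = all(label == "O" for label in labels[i:i + n])
            if lowered[i:i + n] == words and span_is_free:
                labels[i] = "B-" + tag
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + tag
    return labels


if __name__ == "__main__":
    sentence = "Crack cocaine and fentanyl use increased."
    toks = tokenize(sentence)
    for tok, label in zip(toks, distant_labels(toks, LEXICON)):
        print(tok, label)
```

Labels produced this way are noisy: exact matching misses spelling variants and slang, and it cannot resolve ambiguous terms, so a model trained on them would typically need additional handling of label noise.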

Keywords: Deep Learning; Information Storage and Retrieval; Natural Language Processing.

MeSH terms

  • Humans
  • Information Storage and Retrieval*
  • Names*
  • Natural Language Processing