A methodology for extending domain coverage in SemRep

J Biomed Inform. 2013 Dec;46(6):1099-107. doi: 10.1016/j.jbi.2013.08.005. Epub 2013 Aug 21.

Abstract

We describe a domain-independent methodology to extend SemRep coverage beyond the biomedical domain. SemRep, a natural language processing application originally designed for biomedical texts, uses the knowledge sources provided by the Unified Medical Language System (UMLS®). Ontological and terminological extensions to the system are needed to support other areas of knowledge. We extended SemRep's application by developing a semantic representation of a previously unsupported domain. This was achieved by adapting well-known ontology engineering phases and integrating them with the UMLS knowledge sources on which SemRep crucially depends. While the process to extend SemRep coverage has been successfully applied in earlier projects, this paper presents in detail the stepwise approach we followed and the mechanisms implemented. A case study in the field of medical informatics illustrates how the ontology engineering phases have been adapted for optimal integration with the UMLS. We provide qualitative and quantitative results, which indicate the validity and usefulness of our methodology.

Keywords: Domain-independent ontology development methodology; Natural language processing application; Semantic predications; UMLS knowledge sources.

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Information Storage and Retrieval
  • Natural Language Processing*
  • Semantics*
  • Unified Medical Language System