Bidirectional incremental parsing for automatic pathway identification with combinatory categorial grammar

Pac Symp Biocomput. 2001:396-407.

Abstract

As the importance of automatically extracting and analyzing natural language assertions about protein-protein interactions in biomedical publications has become recognized, many applications of natural language processing techniques have been proposed in the literature. However, most proposals to date make simplifying assumptions about the syntax of natural language, for reasons that include efficiency. In this paper, we describe an implemented system that uses combinatory categorial grammar, which is known to be competent in modeling natural language, with a controlled mechanism that lets the parser operate bidirectionally and incrementally. We discuss the performance of the system on a large set of Medline abstracts, with encouraging results.
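To make the idea of bidirectional, incremental parsing with a categorial grammar concrete, here is a minimal Python sketch. It is not the system described in the paper: the toy lexicon, the choice of an interaction verb as the starting anchor, and the names `Functor`, `forward_apply`, `backward_apply`, and `parse_from_anchor` are all assumptions made for illustration. It uses only the two standard application rules of combinatory categorial grammar (X/Y Y => X and Y X\Y => X) and grows a constituent outward from the anchor, one neighboring word at a time.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Functor:
    """A complex category: result/arg (seeks arg to the right) or result\\arg (to the left)."""
    result: "Category"
    slash: str  # "/" or "\\"
    arg: "Category"


Category = Union[str, Functor]  # atomic categories are plain strings such as "NP" or "S"


def forward_apply(left, right):
    """Forward application: X/Y  Y  =>  X. Returns None if the rule does not apply."""
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    return None


def backward_apply(left, right):
    """Backward application: Y  X\\Y  =>  X. Returns None if the rule does not apply."""
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


# Hypothetical toy lexicon; a real system would need a far larger one.
LEXICON = {
    "Raf": "NP",
    "MEK": "NP",
    "activates": Functor(Functor("S", "\\", "NP"), "/", "NP"),  # (S\NP)/NP
}


def parse_from_anchor(tokens, anchor):
    """Grow a constituent outward from tokens[anchor], one neighbor at a time.

    Tries the right neighbor first (forward application), then the left
    neighbor (backward application), and stops when neither side combines.
    Returns the category reached and the (start, end) token span it covers.
    """
    cat = LEXICON[tokens[anchor]]
    lo = hi = anchor
    progressed = True
    while progressed:
        progressed = False
        if hi + 1 < len(tokens):
            combined = forward_apply(cat, LEXICON[tokens[hi + 1]])
            if combined is not None:
                cat, hi, progressed = combined, hi + 1, True
                continue
        if lo - 1 >= 0:
            combined = backward_apply(LEXICON[tokens[lo - 1]], cat)
            if combined is not None:
                cat, lo, progressed = combined, lo - 1, True
    return cat, (lo, hi)
```

Calling `parse_from_anchor(["Raf", "activates", "MEK"], 1)` starts at the verb, absorbs the object on the right, then the subject on the left, and ends with category `S` over the whole span. A full parser would also need composition and type-raising rules, ambiguity handling, and a policy for unknown words, none of which are sketched here.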

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Interpretation, Statistical
  • Natural Language Processing*
  • Proteins / metabolism
  • Terminology as Topic

Substances

  • Proteins