Rich features based Conditional Random Fields for biological named entities recognition

Comput Biol Med. 2007 Sep;37(9):1327-33. doi: 10.1016/j.compbiomed.2006.12.002. Epub 2007 Jan 19.

Abstract

Biological named entity recognition is a critical task for automatically mining knowledge from the biological literature. In this paper, the task is cast as a sequence labeling problem and solved with a Conditional Random Fields (CRF) model. Within the CRF framework, rich features covering literal, contextual and semantic information are incorporated. Among these features, shallow syntactic features are introduced for the first time, and they effectively improve the model's performance. Experiments show that our method achieves an F-measure of 71.2% on an open evaluation data set, which is better than most state-of-the-art systems.
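As an illustration of the sequence labeling formulation described in the abstract, the following minimal Python sketch shows how per-token features of the literal, contextual and shallow syntactic kind might be assembled, and how a BIO label sequence maps back to entity spans. It is not the authors' implementation: the function names, the feature names, the BIO scheme, the chunk tags and the entity types are all assumptions made for illustration, and semantic features are omitted because the abstract does not specify them. A CRF toolkit would consume one such feature dictionary per token.

```python
import re


def word_shape(token):
    """Collapse a token into a coarse shape (hypothetical literal feature)."""
    shape = re.sub(r"[A-Z]+", "A", token)
    shape = re.sub(r"[a-z]+", "a", shape)
    shape = re.sub(r"[0-9]+", "0", shape)
    return shape


def token_features(tokens, chunk_tags, i):
    """Build a feature dict for position i.

    chunk_tags is assumed to come from an external shallow parser
    (one tag per token); it stands in for the shallow syntactic features.
    """
    tok = tokens[i]
    return {
        # literal features
        "word": tok.lower(),
        "shape": word_shape(tok),
        "suffix3": tok[-3:].lower(),
        "has_digit": any(c.isdigit() for c in tok),
        # shallow syntactic feature (assumed input)
        "chunk": chunk_tags[i],
        # context features
        "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def bio_to_spans(labels):
    """Convert BIO labels into (start, end_exclusive, entity_type) spans."""
    spans, start, kind = [], None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, lab in enumerate(labels + ["O"]):
        continues = lab.startswith("I-") and lab[2:] == kind
        if start is not None and not continues:
            spans.append((start, i, kind))
            start, kind = None, None
        if lab.startswith("B-"):
            start, kind = i, lab[2:]
    return spans


# Illustrative input only; tokens, tags and labels are made up.
tokens = ["IL-2", "gene", "expression"]
chunk_tags = ["B-NP", "I-NP", "I-NP"]
features = [token_features(tokens, chunk_tags, i) for i in range(len(tokens))]
spans = bio_to_spans(["B-DNA", "I-DNA", "O"])
```

In this sketch each token is described independently of its label, and the CRF would be responsible for modeling dependencies between adjacent labels; the span conversion is what an entity-level F-measure would be computed on.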

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence
  • Biomedical Research / methods*
  • Information Storage and Retrieval / methods*
  • Information Systems*
  • MEDLINE
  • Models, Statistical*
  • Pattern Recognition, Automated / methods*
  • Terminology as Topic