Exploiting syntactic and semantics information for chemical-disease relation extraction

Database (Oxford). 2016 Apr 14:2016:baw048. doi: 10.1093/database/baw048. Print 2016.

Abstract

Identifying chemical-disease relations (CDRs) from biomedical literature could improve chemical safety and toxicity studies. This article proposes a novel method for CDR extraction that exploits both syntactic and semantic information. The proposed method consists of a feature-based model, a tree kernel-based model and a neural network model. The feature-based model exploits lexical features, the tree kernel-based model captures syntactic structure features, and the neural network model generates semantic representations. The method is motivated by the complementary strengths of the three models, which together capture diverse information for CDR extraction. Experiments on the BioCreative V CDR dataset show that all three models are effective for CDR extraction, and that their combination can further improve extraction performance.

Database URL: http://www.biocreative.org/resources/corpora/biocreative-v-cdr-corpus/
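The abstract does not say how the outputs of the three models are combined. As a purely illustrative sketch, one simple option is a weighted average of per-model probabilities for a candidate chemical-disease pair. The function names, the weights and the 0.5 threshold below are assumptions made for illustration, not details taken from the article.

```python
from typing import Sequence


def combine_scores(
    feature_p: float,
    kernel_p: float,
    neural_p: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0),
) -> float:
    """Weighted average of three per-model probabilities (hypothetical scheme).

    feature_p, kernel_p and neural_p stand for the probabilities that a
    candidate chemical-disease pair is a true relation, as scored by the
    feature-based, tree kernel-based and neural network models.
    """
    if len(weights) != 3 or sum(weights) <= 0:
        raise ValueError("weights must be three values with a positive sum")
    scores = (feature_p, kernel_p, neural_p)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def predict_relation(
    feature_p: float,
    kernel_p: float,
    neural_p: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0),
    threshold: float = 0.5,  # assumed decision threshold
) -> bool:
    """Label the pair as a relation when the combined score reaches the threshold."""
    return combine_scores(feature_p, kernel_p, neural_p, weights) >= threshold
```

With equal weights, a pair that two of the three models score above the threshold can still be accepted when the third model disagrees, which is one way a combination might recover relations that a single model misses.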

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Data Mining / methods*
  • Databases, Factual*
  • Disease / etiology*
  • Hazardous Substances / toxicity*
  • Humans
  • Internet
  • Neural Networks, Computer
  • Semantics*

Substances

  • Hazardous Substances