Exploiting syntactic and semantics information for chemical-disease relation extraction

Huiwei Zhou; Huijie Deng; Long Chen; Yunlong Yang; Chen Jia; Degen Huang

doi:10.1093/database/baw048

Database (Oxford). 2016 Apr 14:2016:baw048. doi: 10.1093/database/baw048. Print 2016.

Authors

Huiwei Zhou¹, Huijie Deng², Long Chen², Yunlong Yang², Chen Jia², Degen Huang²

Affiliations

¹ School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, People's Republic of China. Email: zhouhuiwei@dlut.edu.cn
² School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, People's Republic of China.

Abstract

Identifying chemical-disease relations (CDR) from biomedical literature could improve chemical safety and toxicity studies. This article proposes a novel method for CDR extraction that exploits both syntactic and semantic information. The proposed method consists of a feature-based model, a tree kernel-based model and a neural network model. The feature-based model exploits lexical features, the tree kernel-based model captures syntactic structure features, and the neural network model generates semantic representations. The motivation is to draw on the complementary strengths of the three models so that diverse information is available for CDR extraction. Experiments on the BioCreative V CDR dataset show that each of the three models is effective for CDR extraction, and that combining them could further improve extraction performance.

Database URL: http://www.biocreative.org/resources/corpora/biocreative-v-cdr-corpus/
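The abstract does not say how the outputs of the three models are combined. As a rough illustration only, the sketch below assumes each model emits a confidence score in [0, 1] for a candidate chemical-disease pair and merges them with a normalized weighted average. The names `ModelScores`, `combine_scores` and `predict_relation`, the equal default weights and the 0.5 threshold are all hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class ModelScores:
    """Hypothetical per-model confidences for one chemical-disease pair."""
    feature: float      # feature-based model (lexical features)
    tree_kernel: float  # tree kernel-based model (syntactic structure)
    neural: float       # neural network model (semantic representation)


def combine_scores(scores, weights=(1.0, 1.0, 1.0)):
    """Normalized weighted average of the three scores.

    This combination rule is an assumption made for illustration;
    the abstract does not specify the one used by the authors.
    """
    w_feature, w_tree, w_neural = weights
    total = w_feature + w_tree + w_neural
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = (
        w_feature * scores.feature
        + w_tree * scores.tree_kernel
        + w_neural * scores.neural
    )
    return weighted / total


def predict_relation(scores, weights=(1.0, 1.0, 1.0), threshold=0.5):
    """Label the pair as related when the combined score reaches the threshold."""
    return combine_scores(scores, weights) >= threshold
```

Under these assumptions, the weights would normally be tuned on a development set, which lets a stronger model contribute more to the final decision than a weaker one.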

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Computational Biology / methods*
Data Mining / methods*
Databases, Factual*
Disease / etiology*
Hazardous Substances / toxicity*
Humans
Internet
Neural Networks, Computer
Semantics*

Substances

Hazardous Substances