DigChem: Identification of disease-gene-chemical relationships from Medline abstracts

PLoS Comput Biol. 2019 May 15;15(5):e1007022. doi: 10.1371/journal.pcbi.1007022. eCollection 2019 May.

Abstract

Chemicals interact with genes during disease development and treatment. Although many studies of the relationships among genes, chemicals, and diseases have been reported in biomedical articles indexed in Medline, few studies extract disease-gene-chemical relationships from the biomedical literature at PubMed scale. In this study, we propose a deep learning model based on bidirectional long short-term memory (BiLSTM) to identify evidence sentences for relationships among genes, chemicals, and diseases in Medline abstracts. We then develop the search engine DigChem, which enables disease-gene-chemical relationship searches across 35,124 genes, 56,382 chemicals, and 5,675 diseases. We show that the identified relationships are reliable by comparing them with manual curation and existing databases. DigChem is available at http://gcancer.org/digchem.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Abstracting and Indexing
  • Chemically-Induced Disorders / etiology*
  • Chemically-Induced Disorders / genetics*
  • Computational Biology
  • Data Mining
  • Databases, Factual
  • Databases, Genetic
  • Deep Learning
  • Disease / etiology*
  • Disease / genetics*
  • Female
  • Humans
  • MEDLINE
  • Male
  • Neural Networks, Computer
  • PubMed
  • Search Engine*

Grants and funding

This work was supported by National Research Foundation of Korea (NRF) grants funded by the Korean government (NRF-2016R1A2B2013855 and NRF-2018M3C7A1054935) and by a GIST Research Institute (GRI) grant funded by GIST in 2018. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.