Mining relational paths in integrated biomedical data

PLoS One. 2011;6(12):e27506. doi: 10.1371/journal.pone.0027506. Epub 2011 Dec 6.

Abstract

Much life science and biology research requires an understanding of complex relationships between biological entities (genes, compounds, pathways, diseases, and so on). A wealth of data on such relationships exists in publicly available datasets and publications, but these sources overlap and are distributed, so finding pertinent relational data is increasingly difficult. Whilst most public datasets have associated search tools, there is a lack of search methods that can cross data sources and, in particular, that search not only on the biological entities themselves but also on the relationships between them. In this paper, we demonstrate how graph-theoretic algorithms for mining relational paths can be used together with Chem2Bio2RDF, an integrative data resource we developed previously, to extract new biological insights about the relationships between such entities. Specifically, we use these methods to investigate the genetic basis of side-effects of thiazolidinedione drugs: we propose a hypothesis for the recently discovered cardiac side-effects of Rosiglitazone (Avandia) and make a prediction for Pioglitazone that is backed up by recent clinical studies.
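As a rough illustration of what mining relational paths can mean in practice, the following minimal sketch enumerates bounded-length simple paths between two entities in a small typed graph. It is not the algorithm used in the paper and does not query Chem2Bio2RDF: the node names, edge labels, and the functions `build_adjacency` and `find_paths` are hypothetical placeholders chosen for the example, and the edges are treated as undirected purely for simplicity.

```python
from collections import deque

# Toy typed graph. All node names and edge labels are hypothetical
# placeholders, NOT data taken from Chem2Bio2RDF or the paper.
EDGES = [
    ("compound:C1", "binds", "gene:G1"),
    ("compound:C1", "binds", "gene:G2"),
    ("gene:G1", "participates_in", "pathway:P1"),
    ("gene:G2", "associated_with", "disease:D1"),
    ("pathway:P1", "implicated_in", "disease:D1"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> list of (label, neighbour)."""
    adj = {}
    for source, label, target in edges:
        adj.setdefault(source, []).append((label, target))
        adj.setdefault(target, []).append((label, source))
    return adj


def find_paths(adj, source, target, max_hops):
    """Enumerate simple paths from source to target with at most max_hops edges.

    Each path is an alternating list [node, label, node, ..., node].
    Breadth-first expansion means shorter paths are returned first.
    """
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        hops = (len(path) - 1) // 2
        if hops >= max_hops:
            continue
        visited = path[::2]  # nodes sit at even positions of the path
        for label, neighbour in adj.get(node, []):
            if neighbour not in visited:  # keep paths simple (no revisits)
                queue.append(path + [label, neighbour])
    return paths


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    for p in find_paths(adjacency, "compound:C1", "disease:D1", max_hops=3):
        print(" -> ".join(p))
```

Each returned path links a compound to a disease through intermediate genes or pathways, which is the kind of relationship-level result that entity-only search does not surface. The hop bound keeps the enumeration tractable; a real integrated dataset would also need ranking or filtering of paths, which this sketch omits.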

MeSH terms

  • Algorithms
  • Computers
  • Data Collection
  • Data Mining / methods*
  • Databases, Factual
  • Humans
  • Hypoglycemic Agents / adverse effects
  • Ibuprofen / adverse effects
  • Medical Informatics / methods*
  • Models, Statistical
  • Myocardial Infarction / chemically induced
  • Parkinson Disease / etiology
  • Pioglitazone
  • Rosiglitazone
  • Software
  • Thiazolidinediones / adverse effects

Substances

  • Hypoglycemic Agents
  • Thiazolidinediones
  • Rosiglitazone
  • Ibuprofen
  • Pioglitazone