Datalog Extensions for Bioinformatic Data Analysis

Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:1303-1306. doi: 10.1109/EMBC.2018.8512571.

Abstract

Recent growth in public bioinformatic databases has facilitated the analysis of genomic and proteomic data. However, the sheer size of these datasets makes such analysis difficult for non-expert programmers. In this paper we present B-Log, a high-level query language for bioinformatic data analysis. Because it is based on Datalog, B-Log can express graph analysis algorithms concisely; it extends Datalog with nested tables, recursive aggregations, and foreign functions, which support quick exploratory analysis. We implemented several analysis algorithms in B-Log, as well as a prototype system for exploring the TCGA dataset. We find B-Log to be useful for exploratory analysis and quick prototyping.
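
To make the idea of Datalog-style graph analysis and recursive aggregation more concrete, here is a minimal Python sketch. It is not B-Log syntax and not the authors' implementation; it only mimics, under our own assumptions, how a Datalog engine might evaluate two textbook recursive rules to a fixpoint. The function names, the node labels, and the edge weights are all hypothetical, and the second function assumes non-negative weights.

```python
# Illustrative sketch only: plain Python standing in for two classic Datalog
# programs. Nothing here is taken from the B-Log paper.


def transitive_closure(edges):
    """Semi-naive evaluation of the rules
         reach(X, Y) :- edge(X, Y).
         reach(X, Z) :- reach(X, Y), edge(Y, Z).
    """
    # Index edges by source node so the join on Y is a dictionary lookup.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    reach = set(edges)   # all facts derived so far
    delta = set(edges)   # facts that were new in the previous round
    while delta:
        new_facts = set()
        for x, y in delta:
            for z in successors.get(y, ()):
                if (x, z) not in reach:
                    new_facts.add((x, z))
        reach |= new_facts
        delta = new_facts
    return reach


def shortest_distances(weighted_edges):
    """Recursive min-aggregation, in the spirit of
         dist(X, Y, min(D)) :- edge(X, Y, D).
         dist(X, Z, min(D)) :- dist(X, Y, D1), edge(Y, Z, D2), D = D1 + D2.
    Assumes non-negative weights (hypothetical setting), so the loop terminates.
    """
    successors = {}
    for src, dst, weight in weighted_edges:
        successors.setdefault(src, []).append((dst, weight))

    best = {}    # current minimum distance per (source, target) pair
    delta = {}   # pairs whose minimum improved in the previous round
    for src, dst, weight in weighted_edges:
        if weight < best.get((src, dst), float("inf")):
            best[(src, dst)] = weight
            delta[(src, dst)] = weight

    while delta:
        improved = {}
        for (x, y), dist in delta.items():
            for z, weight in successors.get(y, ()):
                candidate = dist + weight
                if candidate < best.get((x, z), float("inf")):
                    best[(x, z)] = candidate
                    improved[(x, z)] = candidate
        delta = improved
    return best


if __name__ == "__main__":
    # Hypothetical three-node interaction graph.
    print(sorted(transitive_closure([("A", "B"), ("B", "C")])))
    print(shortest_distances([("A", "B", 1), ("B", "C", 2), ("A", "C", 5)]))
```

Both functions propagate only the facts that changed in the previous round, which is the usual semi-naive strategy for recursive rules. In a Datalog-based language each would be written as the two rules shown in its docstring.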

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology*
  • Data Analysis
  • Databases, Factual
  • Proteomics