Grasping frequent subgraph mining for bioinformatics applications

BioData Min. 2018 Sep 3;11:20. doi: 10.1186/s13040-018-0181-9. eCollection 2018.

Abstract

Searching for interesting common subgraphs in graph data is a well-studied problem in data mining. Subgraph mining techniques focus on discovering patterns in graphs that exhibit a specific network structure deemed interesting within these data sets. Which subgraphs count as interesting depends heavily on the application. These techniques have seen numerous applications and can tackle a range of biological research questions, from the detection of common substructures in sets of biomolecular compounds to the discovery of network motifs in large-scale molecular interaction networks. Thus far, information about the bioinformatics applications of subgraph mining has remained scattered across a heterogeneous literature. In this review, we provide an introduction to subgraph mining for life scientists. We give an overview of various subgraph mining algorithms from a bioinformatics perspective and present several of their potential biomedical applications.

Keywords: Biological networks; Frequent subgraphs; Graph motifs; Pattern discovery; Pattern mining; Subgraph mining.

Publication types

  • Review