MIDER: network inference with mutual information distance and entropy reduction

PLoS One. 2014 May 7;9(5):e96732. doi: 10.1371/journal.pone.0096732. eCollection 2014.

Abstract

The prediction of links among variables from a given dataset is a task referred to as network inference or reverse engineering. It is an open problem in bioinformatics and systems biology, as well as in other areas of science. Information theory, which uses concepts such as mutual information, provides a rigorous framework for addressing it. While a number of information-theoretic methods are already available, most of them focus on a particular type of problem, introducing assumptions that limit their generality. Furthermore, many of these methods lack a publicly available implementation. Here we present MIDER, a method for inferring network structures with information-theoretic concepts. It consists of two steps: first, it provides a representation of the network in which the distance between nodes indicates their statistical closeness. Second, it refines the prediction of the existing links to distinguish between direct and indirect interactions and to assign directionality. The method accepts as input time-series data related to some quantitative features of the network nodes (e.g. concentrations, if the nodes are chemical species). It takes into account time delays between variables, and allows choosing among several definitions and normalizations of mutual information. It is general purpose: it may be applied to any type of network, cellular or otherwise. A Matlab implementation including source code and data is freely available (http://www.iim.csic.es/~gingproc/mider.html). The performance of MIDER has been evaluated on seven different benchmark problems that cover the main types of cellular networks, including metabolic, gene regulatory, and signaling. Comparisons with state-of-the-art information-theoretic methods have demonstrated the competitive performance of MIDER, as well as its versatility. Its use does not demand any a priori knowledge from the user; the default settings and the adaptive nature of the method provide good results for a wide range of problems without requiring tuning.
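
The first step described above (a distance between nodes derived from mutual information, allowing for time delays) can be illustrated with a minimal Python sketch. It is not the MIDER Matlab code, and it does not cover the second, entropy-reduction step. The equal-width binning, the normalization by the larger of the two entropies, the distance defined as 1 minus the normalized mutual information, and all function names (`entropy`, `discretize`, `lagged_mi`, `mi_distance`) are assumptions chosen for illustration; the abstract only states that several definitions and normalizations are available.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def discretize(series, bins=4):
    """Map real values to equal-width bin indices 0 .. bins-1 (assumed scheme)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in series]


def lagged_mi(xs, ys, lag):
    """Normalized I(X_t; Y_{t+lag}) for already-discretized series, lag >= 0."""
    if lag > 0:
        xs, ys = xs[:-lag], ys[lag:]  # pair x at time t with y at time t+lag
    hx, hy = entropy(xs), entropy(ys)
    # I(X;Y) = H(X) + H(Y) - H(X,Y); clamp tiny negative rounding errors.
    mi = max(0.0, hx + hy - entropy(list(zip(xs, ys))))
    denom = max(hx, hy)  # assumed normalization, one of several possible
    return mi / denom if denom > 0 else 0.0


def mi_distance(x, y, max_lag=3, bins=4):
    """Return (distance, best_lag), with distance = 1 - max normalized MI.

    Assumes both series have the same length, greater than max_lag.
    """
    xs, ys = discretize(x, bins), discretize(y, bins)
    best_nmi, best_lag = -1.0, 0
    for lag in range(max_lag + 1):
        nmi = lagged_mi(xs, ys, lag)
        if nmi > best_nmi:
            best_nmi, best_lag = nmi, lag
    return 1.0 - best_nmi, best_lag
```

Calling `mi_distance(x, y)` on two equally long time series returns a value between 0 (statistically closest, under this estimator) and 1 (no detectable dependence), together with the lag at which the dependence is strongest. Histogram estimates of this kind are sensitive to the number of bins and to short series, so the output should be read as indicative only.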

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Entropy
  • Gene Regulatory Networks
  • Models, Theoretical*
  • Software
  • Systems Biology / methods*

Grants and funding

This work was supported by the EU project “BioPreDyn” (European Commission grant FP7-KBBE-2011-5/289434); the Spanish Ministerio de Economia y Competitividad (MINECO) projects DPI2011-28112-C04-03, BFU2009-12895-C02-02, and BFU2012-39816-C02-02; the CSIC intramural project “BioREDES” (PIE-201170E018); and the National Science Foundation grant CHE 0847073. Work at UCM is supported by grant BFU2012-39816-C02-02 from the Spanish Ministry of Economy and Competitiveness (MINECO) and Consolider/Ingenio2010 CSD2007-00002 from the Spanish MICINN. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.