Intrinsic limitations in mainstream methods of identifying network motifs in biology

BMC Bioinformatics. 2020 Apr 29;21(1):165. doi: 10.1186/s12859-020-3441-x.

Abstract

Background: Network motifs are connectivity structures that occur significantly more often than expected by chance, and are thought to play important roles in complex biological networks, for example in gene regulation, interactomes, and metabolomes. Network motifs may also become pivotal in the rational design and engineering of complex biological systems, which underpins the field of synthetic biology. Distinguishing true motifs from arbitrary substructures, however, remains a challenge.
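To make "more often than expected by chance" concrete, the following is a minimal sketch of one commonly used style of motif test: count a subgraph in the observed network, count it again in degree-preserving randomisations, and compare. The abstract does not say which methods or assumptions it has in mind, so everything here is an illustrative assumption rather than the authors' procedure: the choice of the feed-forward loop as the subgraph, the edge-swap null model, and the hypothetical names count_ffl, rewire and motif_stats.

```python
import random
import statistics


def count_ffl(edges):
    """Count feed-forward loops (a->b, b->c, a->c; a, b, c distinct).

    Counts non-induced occurrences: extra edges among a, b, c are allowed.
    """
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    count = 0
    for a, outs in succ.items():
        for b in outs:
            if b == a:
                continue
            for c in succ.get(b, ()):
                if c != a and c != b and c in outs:
                    count += 1
    return count


def rewire(edges, n_attempts, rng):
    """Return a randomised copy of the edge list via attempted edge swaps.

    Each accepted swap turns (a, b), (c, d) into (a, d), (c, b), which keeps
    every node's in-degree and out-degree. Swaps that would create a
    self-loop or a duplicate edge are skipped (assumed null model).
    """
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a duplicate edge
        edge_set.difference_update([(a, b), (c, d)])
        edge_set.update([(a, d), (c, b)])
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def motif_stats(edges, n_random=200, seed=0):
    """Return (observed count, z-score, empirical one-sided p-value)."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    null = [
        count_ffl(rewire(edges, 10 * len(edges), rng))
        for _ in range(n_random)
    ]
    mean = statistics.mean(null)
    sd = statistics.pstdev(null)
    # The z-score is undefined when the null counts do not vary.
    z = (observed - mean) / sd if sd > 0 else float("nan")
    # Empirical p-value with the usual +1 correction; no normality assumed.
    p_emp = (1 + sum(c >= observed for c in null)) / (n_random + 1)
    return observed, z, p_emp
```

In this sketch, motif_stats expects a list of unique directed edges without self-loops. The z-score can be read as a tail probability only if the null counts are approximately normally distributed; the empirical p-value avoids that assumption, but its resolution is limited by n_random.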

Results: Here we demonstrate both theoretically and empirically that implicit assumptions present in mainstream methods for motif identification do not necessarily hold. As a consequence, motif studies using these methods are less able to distinguish spurious results from events of true statistical significance than is often presented. We show that these difficulties cannot be overcome without revising the methods of statistical analysis used to identify motifs.

Conclusions: Present-day methods for the discovery of network motifs, and indeed even the methods for defining what they are, rely critically on a set of incorrect assumptions, casting doubt on the scientific validity of motif-driven discoveries. The implications of these findings are therefore far-reaching across diverse areas of biology.

Keywords: Gene regulation; Network motifs; Network substructures.

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Gene Expression Regulation
  • Gene Regulatory Networks*
  • Humans
  • Reproducibility of Results