A strategy to incorporate prior knowledge into correlation network cutoff selection

Nat Commun. 2020 Oct 14;11(1):5153. doi: 10.1038/s41467-020-18675-3.

Abstract

Correlation networks are frequently used to statistically extract biological interactions between omics markers. Network edge selection is typically based on the statistical significance of the correlation coefficients. This procedure, however, is not guaranteed to capture biological mechanisms. Here, we propose an alternative approach for network reconstruction: a cutoff selection algorithm that maximizes the overlap of the inferred network with available prior knowledge. We first evaluate the approach on IgG glycomics data, for which the biochemical pathway is known and well characterized. Importantly, even when the prior knowledge is incomplete or incorrect, the optimal network is close to the true optimum. We then demonstrate the generalizability of the approach with applications to untargeted metabolomics and transcriptomics data. For the transcriptomics case, we demonstrate that the optimized network is superior to statistical networks in systematically retrieving interactions that were not included in the biological reference used for optimization.
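
The cutoff selection idea described above can be illustrated with a minimal sketch. The abstract does not state how overlap with prior knowledge is scored, so this example assumes an F1 score between the thresholded edge set and the prior edge set, Pearson correlation, and a fixed grid of candidate cutoffs. The function name `select_cutoff`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def select_cutoff(data, prior, cutoffs=None):
    """Pick the |correlation| cutoff whose network best overlaps the prior.

    data    : array of shape (n_samples, n_features)
    prior   : symmetric boolean array (n_features, n_features) of known edges
    cutoffs : candidate thresholds (assumed grid if not given)

    The overlap score is F1 (an assumption made for this sketch).
    """
    corr = np.corrcoef(data, rowvar=False)
    upper = np.triu_indices(corr.shape[0], k=1)   # each feature pair once
    abs_r = np.abs(corr[upper])
    known = np.asarray(prior, dtype=bool)[upper]
    if cutoffs is None:
        cutoffs = np.linspace(0.05, 0.95, 19)

    best_cutoff, best_score = None, -1.0
    for c in cutoffs:
        edges = abs_r >= c                        # inferred network at cutoff c
        true_pos = np.sum(edges & known)
        denom = edges.sum() + known.sum()
        score = 2.0 * true_pos / denom if denom else 0.0
        if score > best_score:                    # keep the first best cutoff
            best_cutoff, best_score = float(c), float(score)
    return best_cutoff, best_score


# Toy data: features 0-1 and 2-3 are strongly correlated pairs.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
data = np.column_stack([
    a, a + 0.1 * rng.normal(size=200),
    b, b + 0.1 * rng.normal(size=200),
])

# Hypothetical prior knowledge: the two true pairs are known edges.
prior = np.zeros((4, 4), dtype=bool)
prior[0, 1] = prior[1, 0] = True
prior[2, 3] = prior[3, 2] = True

best_cutoff, best_score = select_cutoff(data, prior)
print(best_cutoff, best_score)
```

With a deliberately incomplete prior (for example, only one of the two true pairs marked), the same function can be used to explore how far the selected cutoff moves, which is the kind of robustness the abstract reports.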

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Data Interpretation, Statistical
  • Glycomics / methods*
  • Glycomics / statistics & numerical data
  • Humans
  • Immunoglobulin G / metabolism
  • Metabolomics / methods*
  • Metabolomics / statistics & numerical data
  • RNA-Seq / methods*
  • RNA-Seq / statistics & numerical data

Substances

  • Immunoglobulin G

Associated data

  • figshare/10.6084/m9.figshare.12646748