Measuring similarity between gene interaction profiles

BMC Bioinformatics. 2019 Aug 22;20(1):435. doi: 10.1186/s12859-019-3024-x.

Abstract

Background: Gene and protein interaction data are often represented as interaction networks, where nodes stand for genes or gene products and each edge stands for a relationship between a pair of gene nodes. Commonly, that relationship within a pair is specified by high similarity between profiles (vectors) of experimentally defined interactions of each of the two genes with all other genes in the genome; only gene pairs that interact with similar sets of genes are linked by an edge in the network. The tight groups of genes/gene products that work together in a cell can be discovered by the analysis of those complex networks.

Results: We show that the choice of the similarity measure between pairs of gene vectors impacts the properties of networks and of gene modules detected within them. We re-analyzed well-studied data on yeast genetic interactions, constructed four genetic networks using four different similarity measures, and detected gene modules in each network using the same algorithm. The four networks induced different numbers of putative functional gene modules, and each similarity measure induced some unique modules. In an example of a putative functional connection suggested by comparing genetic interaction vectors, we predict a link between SUN-domain proteins and protein glycosylation in the endoplasmic reticulum.
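The dependence of network structure on the similarity measure can be illustrated with a minimal sketch. The abstract does not name the four measures used, so the two measures below (Pearson correlation and cosine similarity) are assumptions chosen only for illustration, as are the gene names, the profile values, and the 0.99 edge threshold.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two interaction profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero profile has no defined direction
    return dot / (norm_u * norm_v)


def pearson(u, v):
    """Pearson correlation: cosine similarity of mean-centered profiles."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


def build_network(profiles, measure, threshold):
    """Link every gene pair whose profile similarity reaches the threshold."""
    edges = set()
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        if measure(profiles[gene_a], profiles[gene_b]) >= threshold:
            edges.add((gene_a, gene_b))
    return edges


# Hypothetical toy profiles (not real yeast data): each vector holds one
# gene's interaction scores against the same four partner genes.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # geneA scaled by 2
    "geneC": [5.0, 6.0, 7.0, 8.0],  # geneA shifted by 4
    "geneD": [4.0, 3.0, 2.0, 1.0],  # geneA reversed
}

for name, measure in (("pearson", pearson), ("cosine", cosine)):
    print(name, sorted(build_network(profiles, measure, threshold=0.99)))
```

With these toy vectors, Pearson correlation links geneA, geneB, and geneC to one another because it ignores both scaling and a constant offset, whereas cosine similarity keeps only the geneA-geneB edge because the offset in geneC changes the direction of its vector. Any module-detection step run on the two edge sets would therefore start from different networks.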

Conclusions: The discovery of molecular modules in genetic networks is sensitive to how similarity between gene interaction profiles is measured. In the absence of a formal way to choose the "best" measure, it is advisable to explore measures with different mathematical properties, which may identify different sets of connections between genes.

Keywords: Gene networks; Genetic interactions; SUN domain; Similarity measures; Slp1.

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Epistasis, Genetic*
  • Gene Regulatory Networks
  • Genes, Fungal
  • Glycosylation
  • Molecular Sequence Annotation
  • Protein Domains
  • Saccharomyces cerevisiae / genetics
  • Statistics as Topic