SANA NetGO: a combinatorial approach to using Gene Ontology (GO) terms to score network alignments

Bioinformatics. 2018 Apr 15;34(8):1345-1352. doi: 10.1093/bioinformatics/btx716.

Abstract

Motivation: Gene Ontology (GO) terms are frequently used to score alignments between protein-protein interaction (PPI) networks. Methods exist to measure GO similarity between proteins in isolation, but proteins in a network alignment are not isolated: each pairing is dependent on every other via the alignment itself. Existing measures fail to take into account the frequency of GO terms across networks, instead imposing arbitrary rules on when to allow GO terms.

Results: Here we develop NetGO, a new measure that naturally weighs infrequent, informative GO terms more heavily than frequent, less informative GO terms, without arbitrary cutoffs, instead downweighting GO terms according to their frequency in the networks being aligned. This is a global measure applicable only to alignments, independent of pairwise GO measures, in the same sense that the edge-based EC or S3 scores are global measures of topological similarity independent of pairwise topological similarities. We demonstrate the superiority of NetGO in alignments of predetermined quality and show that NetGO correlates with alignment quality better than any existing GO-based alignment measures. We also demonstrate that NetGO provides a measure of taxonomic similarity between species, consistent with existing taxonomic measuresa feature not shared with existing GObased network alignment measures. Finally, we re-score alignments produced by almost a dozen aligners from a previous study and show that NetGO does a better job at separating good alignments from bad ones.

Availability and implementation: Available as part of SANA.

Contact: whayes@uci.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computational Biology / methods*
  • Data Accuracy
  • Eukaryota / genetics
  • Eukaryota / metabolism
  • Evolution, Molecular
  • Gene Ontology*
  • Humans
  • Protein Interaction Maps*
  • Software*