Scoring the correlation of genes by their shared properties using OScal, an improved overlap quantification model

Sci Rep. 2015 May 27;5:10583. doi: 10.1038/srep10583.

Abstract

Scoring the correlation between two genes by their shared properties is a common and basic task in biological research. A promising way to score this correlation is to quantify the overlap between the two sets of homogeneous properties of the two genes. However, no model has yet been established as the proper one. Here we focused on the quantification of overlap and proposed a more effective model after theoretically comparing 7 existing models. We defined three characteristic parameters (d, R, r) of an overlap, which highlight the essential differences among the 7 models and group them into two classes. The pros and cons of the two classes of models were then examined through their solution spaces in the (d, R, r) coordinate system. Finally, we proposed a new model called OScal (Overlap Score calculator), a modification of the Poisson distribution model (one of the 7 models) designed to avoid its disadvantages. When tested on assessing gene relations using different data, OScal performs better than the existing models. In addition, OScal is a basic mathematical model with very low computational cost and few restrictive conditions, so it can be used in a wide range of research areas to measure the overlap or similarity of two entities.
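
To make the idea of quantifying the overlap between two property sets concrete, here is a minimal sketch of two generic overlap scores: a Jaccard index and a Poisson tail probability. This is not OScal, and it is not claimed to match any of the 7 models as the paper defines them. The abstract gives no formulas, so the function names, the `universe_size` parameter, and the choice of expected overlap (|A| * |B| / N) are all assumptions made for illustration.

```python
import math


def jaccard(a, b):
    """Jaccard index: |A & B| / |A | B| (0.0 when both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def poisson_overlap_pvalue(a, b, universe_size):
    """P(X >= k) for X ~ Poisson(lam), where k is the observed overlap.

    Assumption (not from the paper): the expected overlap of two
    independent property sets drawn from `universe_size` possible
    properties is lam = |A| * |B| / universe_size.
    """
    a, b = set(a), set(b)
    k = len(a & b)
    lam = len(a) * len(b) / universe_size
    # Cumulative probability of seeing fewer than k shared properties.
    cdf_below_k = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(k)
    )
    return max(0.0, 1.0 - cdf_below_k)


# Hypothetical property sets (e.g. annotation terms) for two genes.
gene_x = {"term1", "term2", "term3"}
gene_y = {"term2", "term3", "term4"}

similarity = jaccard(gene_x, gene_y)
p_value = poisson_overlap_pvalue(gene_x, gene_y, universe_size=1000)
```

A smaller tail probability means the observed overlap is less likely to arise by chance under the independence assumption, while the Jaccard index only reflects the relative size of the shared part.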

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Models, Genetic*