Poisson approach to clustering analysis of regulatory sequences

Int J Comput Biol Drug Des. 2008;1(2):141-57. doi: 10.1504/ijcbdd.2008.020206.

Abstract

Similar patterns in regulatory sequences can help identify co-regulated genes and infer regulatory modules. By modelling pattern occurrences in regulatory regions with Poisson statistics, this paper presents a distance measure based on the log-likelihood ratio statistic for calculating pair-wise similarities between regulatory sequences. We employed it within three clustering algorithms: hierarchical clustering, Self-Organising Map, and a self-adaptive neural network. The results indicate that, in comparison to traditional clustering algorithms, incorporating the log-likelihood ratio-based distance into the learning process may considerably improve the classification of genes by their regulatory sequences.
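
The abstract does not give the formula for the distance, so the following is only a minimal sketch of one way a Poisson log-likelihood ratio distance between two sequences could be computed. It assumes the standard likelihood ratio test for two Poisson counts (separate occurrence rates versus one shared rate), summed over a set of patterns. The function names (`count_pattern`, `poisson_llr`, `llr_distance`, `pairwise_distances`), the choice of exposure, and the summation over patterns are illustrative assumptions, not the paper's definitions.

```python
import math
from itertools import combinations


def count_pattern(seq, pattern):
    """Count occurrences of pattern in seq, allowing overlaps."""
    k = len(pattern)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == pattern)


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return x * math.log(y) if x > 0 else 0.0


def poisson_llr(x1, n1, x2, n2):
    """2 * log likelihood ratio for two Poisson counts.

    H0: both counts share one rate; H1: each count has its own rate.
    n1 and n2 are exposures (assumed here: number of candidate positions).
    """
    total = x1 + x2
    if total == 0:
        return 0.0
    pooled = total / (n1 + n2)  # maximum-likelihood rate under H0
    stat = 2.0 * (_xlogy(x1, x1 / (n1 * pooled)) +
                  _xlogy(x2, x2 / (n2 * pooled)))
    return max(0.0, stat)  # guard against tiny negative rounding error


def llr_distance(seq_a, seq_b, patterns):
    """Sum the per-pattern statistics (assumption: patterns treated independently)."""
    dist = 0.0
    for p in patterns:
        n_a = max(len(seq_a) - len(p) + 1, 1)
        n_b = max(len(seq_b) - len(p) + 1, 1)
        dist += poisson_llr(count_pattern(seq_a, p), n_a,
                            count_pattern(seq_b, p), n_b)
    return dist


def pairwise_distances(seqs, patterns):
    """Symmetric matrix of pair-wise distances, as a list of lists."""
    m = [[0.0] * len(seqs) for _ in seqs]
    for i, j in combinations(range(len(seqs)), 2):
        m[i][j] = m[j][i] = llr_distance(seqs[i], seqs[j], patterns)
    return m
```

A matrix produced this way could be passed to any clustering routine that accepts precomputed pair-wise distances, such as agglomerative hierarchical clustering. How the paper integrates the measure into the Self-Organising Map and the self-adaptive neural network is not described in the abstract.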

MeSH terms

  • Algorithms*
  • Base Sequence*
  • Cluster Analysis
  • Likelihood Functions
  • Neural Networks, Computer
  • Poisson Distribution
  • Regulatory Sequences, Nucleic Acid*