Measuring the quality of linear patterns in biclusters

Methods. 2015 Jul 15:83:18-27. doi: 10.1016/j.ymeth.2015.04.005. Epub 2015 Apr 15.

Abstract

In microarray analysis, biclustering is used to find the maximal subsets of rows and columns that satisfy some coherence criterion. The submatrices found are usually called biclusters. On one hand, different criteria help to find different types of biclusters, so the definition of the coherence criterion is critical to the biclustering method. On the other hand, qualitative criteria result in qualitative biclustering methods that cannot evaluate the quality of the biclusters, while quantitative criteria can show numerically how good the mined biclusters are and are therefore more useful in real applications. In the bioinformatics community, several quantitative coherence measurements for linear patterns have been proposed. However, they are either unable to find all subtypes of linear patterns or sensitive to noise. In this work, we introduce a coherence measurement for general linear patterns, the minimal mean squared error (MMSE), which is designed to evaluate biclusters with shifting, scaling, and general linear (the mixed form of shifting and scaling) correlations. Experiments on synthetic and real data sets show that the proposed method is appropriate for identifying significant general linear biclusters.
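To make the three pattern subtypes concrete, the following is a minimal illustrative sketch, not the MMSE definition from the paper (the abstract does not give it). It assumes a simple stand-in score: each row of a submatrix is fitted as scale * reference + shift by least squares against one reference row, and the mean squared residual is reported. The function name `linear_fit_mse`, the choice of the first row as reference, and the example matrices are all hypothetical.

```python
import numpy as np


def linear_fit_mse(sub, ref_index=0):
    """Mean squared residual after fitting every row as scale * ref + shift.

    Illustrative stand-in score (an assumption), not the paper's MMSE.
    """
    sub = np.asarray(sub, dtype=float)
    ref = sub[ref_index]
    # Design matrix [ref, 1]: one column for scaling, one for shifting.
    design = np.column_stack([ref, np.ones_like(ref)])
    # Least-squares fit of all rows at once: design @ coeffs ~ sub.T
    coeffs, *_ = np.linalg.lstsq(design, sub.T, rcond=None)
    residuals = sub.T - design @ coeffs
    return float(np.mean(residuals ** 2))


base = np.array([1.0, 4.0, 2.0, 7.0, 3.0])

# Shifting pattern: rows differ by an additive constant.
shifting = np.array([base + s for s in (0.0, 2.0, -1.5)])
# Scaling pattern: rows differ by a multiplicative constant.
scaling = np.array([base * a for a in (1.0, 0.5, -2.0)])
# General linear pattern: mixed shifting and scaling.
general = np.array([a * base + s for a, s in ((1.0, 0.0), (0.5, 2.0), (-2.0, 1.0))])
# Rows with no linear relationship to each other.
unrelated = np.array([
    [1.0, 4.0, 2.0, 7.0, 3.0],
    [5.0, 1.0, 6.0, 2.0, 4.0],
    [2.0, 2.0, 9.0, 1.0, 3.0],
])

for name, matrix in [("shifting", shifting), ("scaling", scaling),
                     ("general", general), ("unrelated", unrelated)]:
    print(name, linear_fit_mse(matrix))
```

Under this stand-in score, the shifting, scaling, and general linear submatrices all fit with a residual near zero, while the unrelated rows leave a clearly positive residual. A score that handled only shifting or only scaling would miss the mixed case.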

Keywords: Biclustering; Coherence measurement; Gene expression; Linear pattern.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Computational Biology*
  • Databases, Genetic / statistics & numerical data
  • Gene Expression
  • Gene Expression Profiling / statistics & numerical data*
  • Humans
  • Microarray Analysis / statistics & numerical data*