Bregmannian consensus clustering for cancer subtypes analysis

Comput Methods Programs Biomed. 2020 Jun:189:105337. doi: 10.1016/j.cmpb.2020.105337. Epub 2020 Jan 13.

Abstract

Cancer subtype analysis, an extension of cancer diagnosis, can be regarded as a consensus clustering problem, and it helps provide patients with more accurate treatment. Consensus clustering refers to a situation in which several different clusterings have been obtained for a particular data set, and the goal is to aggregate them into a better clustering solution. In this paper, we generalize traditional consensus clustering methods in three ways: (1) we provide Bregmannian consensus clustering (BCC), where the loss between the consensus clustering result and all the input clusterings is generalized from the traditional Euclidean distance to a general Bregman loss; (2) we generalize BCC to a weighted case, where each input clustering has its own weight, providing a better solution for the final clustering result; and (3) we propose a novel semi-supervised consensus clustering, which adds must-link and cannot-link constraints to the first two methods. We then obtain three cancer data sets (breast, lung, and colorectal cancer) from The Cancer Genome Atlas (TCGA). Each data set has three data types (mRNA, microRNA, methylation), and each is used to test the clustering accuracy of the proposed algorithms. The experimental results show that the highest aggregation accuracy of the weighted BCC (WBCC) on the cancer data sets is 90.2%. Moreover, although the lowest accuracy is 62.3%, it is still higher than that of the other methods on the same data set. We therefore conclude that our method is more effective than the competing methods.
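
The weighted aggregation idea described above can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes each input clustering is encoded as a binary co-association (connectivity) matrix, and it relies on the general property that a weighted sum of Bregman divergences from fixed points to a free second argument is minimized by the weighted mean. The function names, toy labels, and equal weights are all hypothetical, and the semi-supervised must-link and cannot-link constraints are not modeled.

```python
import numpy as np

def connectivity(labels):
    """Co-association matrix: entry (a, b) is 1.0 if samples a and b share a cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def squared_euclidean(x, y):
    """Bregman divergence generated by phi(z) = ||z||^2."""
    return float(np.sum((x - y) ** 2))

def generalized_kl(x, y, eps=1e-12):
    """Generalized I-divergence, the Bregman divergence of phi(z) = sum z log z.
    eps is an illustrative smoothing term that keeps the logarithm finite."""
    x = x + eps
    y = y + eps
    return float(np.sum(x * np.log(x / y) - x + y))

def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    w = np.asarray(weights, dtype=float)
    return w / w.sum()

def weighted_consensus(clusterings, weights):
    """Weighted mean of the connectivity matrices (unconstrained minimizer)."""
    mats = np.stack([connectivity(c) for c in clusterings])
    return np.tensordot(normalize(weights), mats, axes=1)

def weighted_loss(clusterings, weights, candidate, divergence):
    """Weighted sum of divergences from each input clustering to a candidate."""
    w = normalize(weights)
    return sum(wi * divergence(connectivity(c), candidate)
               for wi, c in zip(w, clusterings))

# Hypothetical toy input: three clusterings of four samples, equal weights.
clusterings = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
weights = [1.0, 1.0, 1.0]
consensus = weighted_consensus(clusterings, weights)

# Any other candidate matrix should give a loss at least as large.
perturbed = 0.9 * consensus + 0.05
loss_at_consensus = weighted_loss(clusterings, weights, consensus, generalized_kl)
loss_at_perturbed = weighted_loss(clusterings, weights, perturbed, generalized_kl)
```

In this sketch the consensus is a soft co-association matrix. Turning it into hard cluster labels would require a further step, such as clustering its rows, which is omitted here.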

Keywords: Bregman divergence; Cancer subtypes analysis; Consensus Clustering.

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Gene Expression Profiling* / statistics & numerical data
  • Methylation
  • MicroRNAs
  • Neoplasms / classification*
  • Neoplasms / genetics*
  • RNA, Messenger

Substances

  • MicroRNAs
  • RNA, Messenger