Bayesian consensus clustering for multivariate longitudinal data

Stat Med. 2022 Jan 15;41(1):108-127. doi: 10.1002/sim.9225. Epub 2021 Oct 20.

Abstract

In clinical and epidemiological studies, there is growing interest in studying the heterogeneity among patients based on longitudinal characteristics to identify subtypes of the study population. Compared to clustering a single longitudinal marker, simultaneously clustering multiple longitudinal markers allows additional information to be incorporated into the clustering process, which reveals co-existing longitudinal patterns and generates deeper biological insight. In the current study, we propose a Bayesian consensus clustering (BCC) model for multivariate longitudinal data. Instead of arriving at a single overall clustering, the proposed model allows each marker to follow a marker-specific local clustering, and these local clusterings are aggregated to find a global (consensus) clustering. To estimate the posterior distribution of the model parameters, a Gibbs sampling algorithm is proposed. We apply the proposed model to the primary biliary cirrhosis study to identify patient subtypes that may be associated with prognosis. We also perform simulation studies to compare the clustering performance of the proposed model with that of existing models under several scenarios. The results demonstrate that the proposed BCC model serves as a useful tool for clustering multivariate longitudinal data.
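
The relationship between local and global clusterings described above can be illustrated with a minimal sketch. The abstract does not state how the two levels are linked, so the mechanism here is an assumption borrowed from the general Bayesian consensus clustering literature: each marker has an adherence probability, and a patient's local label equals the global label with that probability and is otherwise drawn uniformly from the remaining clusters. The function names, marker names, and adherence values are hypothetical, and the sketch simulates labels only; it is not the Gibbs sampler proposed in the paper.

```python
import random


def sample_local_labels(global_labels, n_clusters, adherence, rng):
    """Draw marker-specific (local) labels given global labels.

    Assumed mechanism (not taken from the abstract): with probability
    `adherence` the local label copies the global label; otherwise it is
    drawn uniformly from the other clusters.
    """
    local = []
    for g in global_labels:
        if n_clusters == 1 or rng.random() < adherence:
            local.append(g)
        else:
            others = [k for k in range(n_clusters) if k != g]
            local.append(rng.choice(others))
    return local


def agreement(a, b):
    """Fraction of positions at which two label sequences coincide."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
n_patients, n_clusters = 2000, 3

# Hypothetical global (consensus) labels, one per patient.
global_labels = [rng.randrange(n_clusters) for _ in range(n_patients)]

# Hypothetical markers: one tracks the consensus closely, one loosely.
adherence_by_marker = {"marker_A": 0.95, "marker_B": 0.60}
local_labels = {
    name: sample_local_labels(global_labels, n_clusters, a, rng)
    for name, a in adherence_by_marker.items()
}

for name, labels in local_labels.items():
    print(name, round(agreement(global_labels, labels), 2))
```

Under this assumed mechanism, a marker with high adherence reproduces the consensus clustering almost exactly, while a marker with low adherence keeps substantial marker-specific structure. This is the sense in which local clusterings are allowed to differ from the global one.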

Keywords: Bayesian consensus clustering; disease clustering; integrative clustering; mixture model; multivariate longitudinal data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Bayes Theorem
  • Cluster Analysis
  • Computer Simulation
  • Consensus
  • Humans