K-means for shared frailty models

Usha Govindarajulu; Sandeep Bedi

doi:10.1186/s12874-021-01424-5

BMC Med Res Methodol. 2022 Jan 12;22(1):11. doi: 10.1186/s12874-021-01424-5.

Authors

Usha Govindarajulu¹, Sandeep Bedi²

Affiliations

¹ Center for Biostatistics, Department of Population Health & Policy, Icahn School of Medicine at Mount Sinai, One Gustave Levy Place, New York, NY, USA. usha.govindarajulu@mountsinai.org.
² Center for Biostatistics, Department of Population Health & Policy, Icahn School of Medicine at Mount Sinai, One Gustave Levy Place, New York, NY, USA.

Abstract

Background: This research examined how the k-means algorithm can be applied to survival data with a single event per subject to define groups, which can then be used in a shared frailty model to capture unmeasured confounding not already explained by the covariates in the model.

Methods: We developed our own k-means survival grouping algorithm for this purpose. In simulations and in an analysis of a real dataset, we compared a shared frailty model using a conventional grouping variable with a shared frailty model using a k-means grouping variable.
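
The grouping step can be illustrated with a minimal sketch. This is plain Lloyd-style k-means, not the authors' modified survival grouping algorithm, whose details are not given in this abstract. The function name `kmeans_groups`, the two example covariates, and the choice of k = 2 are all assumptions made for illustration. The resulting labels could then serve as the cluster identifier in a shared frailty model, for example through a `frailty()` term in `coxph` from the R survival package.

```python
import random


def kmeans_groups(rows, k, n_iter=100, seed=0):
    """Plain Lloyd k-means (hypothetical helper); returns one label per row."""
    rng = random.Random(seed)
    # Standardize each column so no single covariate dominates the distance.
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    z = [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

    # Start from k distinct observations chosen at random.
    centers = rng.sample(z, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in z
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, g in zip(z, labels) if g == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Hypothetical covariates for six subjects (made-up values, two clear groups).
rows = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
        (8.0, 8.2), (8.1, 7.9), (7.9, 8.0)]
groups = kmeans_groups(rows, k=2)
```

Here `groups` holds one integer label per subject. Label values are arbitrary; only the partition they define matters when the labels are used as a grouping variable.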

Results: In both the simulations and the real data, our k-means clustering performed no differently from the typical frailty clustering, even under varied case rates and censoring. Our k-means algorithm appears to be a trustworthy mechanism for creating groups from data, either to supply a frailty term in a survival model when no grouping variable exists or to compare against an existing grouping variable available in the data.

Keywords: Heterogeneity; Modified k-means algorithm; Shared frailty; Survival analysis.

MeSH terms

Algorithms
Frailty* / diagnosis
Humans
Models, Statistical
Survival Analysis