Weighting estimation in the cause-specific Cox regression with partially missing causes of failure

Stat Med. 2024 Apr 24. doi: 10.1002/sim.10084. Online ahead of print.

Abstract

Complex diseases are often analyzed using disease subtypes classified by multiple biomarkers to study pathogenic heterogeneity. In such molecular pathological epidemiology research, we consider a weighted Cox proportional hazards model to evaluate the effect of exposures on various disease subtypes under competing-risk settings in the presence of partially or completely missing biomarkers. The asymptotic properties of the inverse and augmented inverse probability-weighted estimating equation methods are studied under a general pattern of missing data. Simulation studies were conducted to demonstrate the double robustness of the estimators. For illustration, we applied this method to examine the association between pack-years of smoking before the age of 30 and the incidence of colorectal cancer subtypes defined by a combination of four tumor molecular biomarkers (statuses of microsatellite instability, CpG island methylator phenotype, BRAF mutation, and KRAS mutation) in the Nurses' Health Study cohort.

Keywords: augmented inverse probability weighting; competing risks; etiologic heterogeneity; partially missing causes.
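
As a rough illustration of the inverse probability weighting idea named in the abstract, the following minimal sketch evaluates a weighted partial-likelihood score for one cause of failure with a single scalar covariate. It is not the authors' implementation: the function names `ipw_weights` and `weighted_score` are hypothetical, the probabilities of observing the cause are assumed to be known (in practice they would be estimated from a model), ties are ignored, and the augmentation term of the augmented estimator is omitted.

```python
import math


def ipw_weights(cause_observed, prob_observed):
    """Return R_i / pi_i: zero when the cause is missing, 1/pi_i otherwise.

    prob_observed is an assumed, user-supplied probability that the cause
    of failure is observed for each subject.
    """
    return [r / p if r else 0.0 for r, p in zip(cause_observed, prob_observed)]


def weighted_score(beta, times, z, is_cause_j, weights):
    """IPW partial-likelihood score for cause j with a scalar covariate z.

    is_cause_j[i] is True only when subject i failed from cause j AND the
    cause was observed; censored subjects and missing causes are False.
    """
    score = 0.0
    for i, t_i in enumerate(times):
        if not is_cause_j[i] or weights[i] == 0.0:
            continue
        # The risk set uses every subject still under observation at t_i,
        # because at-risk status does not depend on the missing cause.
        risk = [l for l, t_l in enumerate(times) if t_l >= t_i]
        s0 = sum(math.exp(beta * z[l]) for l in risk)
        s1 = sum(z[l] * math.exp(beta * z[l]) for l in risk)
        score += weights[i] * (z[i] - s1 / s0)
    return score


# Toy data: four subjects; subjects 0 and 2 fail from cause j with the
# cause observed, each with an assumed observation probability.
w = ipw_weights([1, 0, 1, 0], [0.5, 0.5, 1.0, 1.0])
print(weighted_score(0.0, [1, 2, 3, 4], [1, 0, 1, 0], [True, False, True, False], w))
```

Setting this score to zero and solving for `beta` would give the weighted estimate in this simplified setting. A failure whose cause is observed with probability 0.5 counts twice, standing in for a comparable failure whose cause is missing.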