Statistical Inference for Cox Proportional Hazards Models with a Diverging Number of Covariates

Scand Stat Theory Appl. 2023 Jun;50(2):550-571. doi: 10.1111/sjos.12595. Epub 2022 Apr 25.

Abstract

For statistical inference on regression models with a diverging number of covariates, the existing literature typically makes sparsity assumptions on the inverse of the Fisher information matrix. Such assumptions, however, are often violated under Cox proportional hazards models, leading to biased estimates and confidence intervals with coverage below the nominal level. We propose a modified debiased lasso method, which solves a series of quadratic programming problems to approximate the inverse information matrix without imposing sparsity assumptions on that matrix. We establish asymptotic results for the estimated regression coefficients when the dimension of covariates diverges with the sample size. As demonstrated by extensive simulations, our proposed method provides consistent estimates and confidence intervals with nominal coverage probabilities. The utility of the method is further demonstrated by assessing the effects of genetic markers on patients' overall survival in the Boston Lung Cancer Survival Cohort, a large-scale epidemiological study investigating the mechanisms underlying lung cancer.
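
The abstract does not state the quadratic programs themselves. As a rough sketch only, the display below shows one formulation commonly used in the debiased lasso literature; the notation is assumed here for illustration and is not taken from the abstract. In it, $\hat\beta$ is the lasso estimate, $\dot\ell_n(\hat\beta)$ and $\hat\Sigma$ are the gradient and Hessian of the negative log partial likelihood (scaled by $1/n$) evaluated at $\hat\beta$, $e_j$ is the $j$th unit vector, $p$ is the number of covariates, and $\gamma_n$ is a tuning parameter.

```latex
% Sketch under assumed notation; not necessarily the authors' exact program.
% Row j of the approximate inverse information matrix, one QP per coordinate:
\[
  \hat m_j \;=\; \arg\min_{m \in \mathbb{R}^p} \; m^\top \hat\Sigma\, m
  \quad \text{subject to} \quad
  \bigl\| \hat\Sigma\, m - e_j \bigr\|_\infty \le \gamma_n ,
  \qquad j = 1, \dots, p .
\]
% Stack the rows and apply a one-step bias correction to the lasso estimate:
\[
  \hat\Theta = (\hat m_1, \dots, \hat m_p)^\top ,
  \qquad
  \hat b \;=\; \hat\beta \;-\; \hat\Theta\, \dot\ell_n(\hat\beta) .
\]
% Wald-type interval for coefficient j (sandwich variance, assumed form):
\[
  \hat b_j \;\pm\; z_{\alpha/2}
  \sqrt{ \bigl( \hat\Theta\, \hat\Sigma\, \hat\Theta^\top \bigr)_{jj} \big/ n } .
\]
```

In a program of this kind, the constraint only asks that $\hat\Sigma \hat m_j$ be close to $e_j$ entrywise, so no row of the inverse information matrix is required to be sparse. This matches the motivation given in the abstract.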

Keywords: cancer epidemiology; debiased lasso; lung cancer; precision matrix; quadratic programming; sparsity.