Direct effects testing: a two-stage procedure to test for effect size and variable importance for correlated binary predictors and a binary response

M Sperrin; T Jaki

doi:10.1002/sim.4014

Stat Med. 2010 Oct 30;29(24):2544-56. doi: 10.1002/sim.4014.

Authors

M Sperrin¹, T Jaki

Affiliation

¹ Health Methodology Research Group, University of Manchester, Manchester, UK. matthew.sperrin@manchester.ac.uk

PMID: 20683850

Abstract

In applications such as medical statistics and genetics, we encounter situations where a large number of highly correlated predictors explain a response. For example, the response may be a disease indicator and the predictors may be treatment indicators or single nucleotide polymorphisms (SNPs). Constructing a good predictive model in such cases is well studied. Less well understood is how to recover the 'true sparsity pattern', that is, how to find which predictors have direct effects on the response and to indicate the statistical significance of the results. Restricting attention to binary predictors and a binary response, we study the recovery of the true sparsity pattern using a two-stage method that separates establishing the presence of effects from inferring their exact relationship with the predictors. Simulations and a real data application demonstrate that the method discriminates well between associations and direct effects. Comparisons with lasso-based methods demonstrate favourable performance of the proposed method.
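
The distinction between an association and a direct effect can be illustrated with a small exact calculation. The sketch below is not the two-stage procedure of the paper. It is a toy population model whose probabilities and names (`P_X1`, `P_X2_GIVEN_X1`, `P_Y_GIVEN_X1`, `odds_ratio_x2_y`) are all assumptions chosen for illustration. Predictor `x1` acts directly on the response `y`, while `x2` is only correlated with `x1`. The marginal odds ratio between `x2` and `y` is then well above 1, yet within each stratum of `x1` it equals 1.

```python
# Illustrative population model. All numbers are assumptions, not from the paper.
# x1 has a direct effect on y; x2 is correlated with x1 but has no direct effect.
P_X1 = 0.5                         # P(x1 = 1)
P_X2_GIVEN_X1 = {0: 0.2, 1: 0.8}   # P(x2 = 1 | x1): induces the correlation
P_Y_GIVEN_X1 = {0: 0.1, 1: 0.4}    # P(y = 1 | x1): y ignores x2 by construction


def joint(x1, x2, y):
    """Exact probability of one (x1, x2, y) cell under the model above."""
    p1 = P_X1 if x1 else 1 - P_X1
    p2 = P_X2_GIVEN_X1[x1] if x2 else 1 - P_X2_GIVEN_X1[x1]
    py = P_Y_GIVEN_X1[x1] if y else 1 - P_Y_GIVEN_X1[x1]
    return p1 * p2 * py


def odds_ratio_x2_y(x1_values):
    """Odds ratio between x2 and y after pooling over the given x1 values."""
    def cell(x2, y):
        return sum(joint(x1, x2, y) for x1 in x1_values)

    return (cell(1, 1) * cell(0, 0)) / (cell(1, 0) * cell(0, 1))


# Ignoring x1: x2 appears associated with y through its correlation with x1.
marginal_or = odds_ratio_x2_y([0, 1])

# Within each stratum of x1: the apparent effect of x2 disappears.
conditional_or = {x1: odds_ratio_x2_y([x1]) for x1 in (0, 1)}

print(f"marginal OR(x2, y)       = {marginal_or:.3f}")
for x1, value in conditional_or.items():
    print(f"OR(x2, y | x1 = {x1})      = {value:.3f}")
```

In this toy model a marginal screen would flag both predictors, while only `x1` belongs to the true sparsity pattern. Separating those two cases is the problem the abstract describes.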

MeSH terms

Age of Onset
Alcohol Drinking / epidemiology
Comorbidity
Coronary Disease / epidemiology
Data Interpretation, Statistical*
Genome-Wide Association Study / methods
Humans
Models, Statistical*
Obesity / epidemiology
Regression Analysis
Risk Factors
Rural Health / statistics & numerical data
Smoking / epidemiology
South Africa / epidemiology