Discovering Heterogeneous Exposure Effects Using Randomization Inference in Air Pollution Studies

J Am Stat Assoc. 2021;116(534):569-580. doi: 10.1080/01621459.2020.1870476. Epub 2021 Feb 16.

Abstract

Several studies have provided strong evidence that long-term exposure to air pollution, even at low levels, increases the risk of mortality. As regulatory actions are becoming prohibitively expensive, robust evidence is needed to guide the development of targeted interventions that protect the most vulnerable. In this paper, we introduce a novel statistical method that (i) discovers subgroups whose effects substantially differ from the population mean, and (ii) uses randomization-based tests to assess the discovered heterogeneous effects. We also develop a sensitivity analysis method to assess the robustness of the conclusions to unmeasured confounding bias. Via simulation studies and theoretical arguments, we demonstrate that hypothesis testing focused on the discovered subgroups can substantially increase statistical power to detect heterogeneity of the exposure effects. We apply the proposed de novo method to data on 1,612,414 Medicare beneficiaries in the New England region of the United States for the period 2000 to 2006. We find that seniors aged 81-85 with low income and seniors aged 85 and above have statistically significantly greater causal effects of long-term exposure to PM2.5 on the 5-year mortality rate than the population mean.

Keywords: Causal effect; Causal inference; Observational study; Particulate matter; Recursive partitioning; Sample split; Unmeasured confounding.
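
The two-step idea in the abstract (discover subgroups on one part of the data, then test them with a randomization-based test on the rest) can be illustrated with a minimal sketch. This is not the paper's procedure: the data are synthetic, the single-threshold search stands in for recursive partitioning, and the use of pair-level outcome differences `d` with a sign-flip test is an assumption made here for illustration. The population mean `tau0` is treated as a fixed plug-in value, which is a simplification. All function and variable names are hypothetical.

```python
import numpy as np

def discover_subgroup(x, d, candidates, min_size=20):
    """Return the threshold c whose subgroup {x >= c} has a mean
    difference farthest from the overall mean (a stand-in for a tree)."""
    overall = d.mean()
    best_c, best_gap = None, -np.inf
    for c in candidates:
        mask = x >= c
        if mask.sum() < min_size:
            continue  # skip subgroups too small to be useful
        gap = abs(d[mask].mean() - overall)
        if gap > best_gap:
            best_c, best_gap = c, gap
    return best_c

def sign_flip_pvalue(d_sub, tau0, n_draws=2000, seed=None):
    """Two-sided sign-flip test of the null that every unit in the
    subgroup has effect tau0 (here: the plug-in population mean)."""
    rng = np.random.default_rng(seed)
    centered = d_sub - tau0
    observed = abs(centered.mean())
    # Under the null, each centered difference is equally likely to
    # carry either sign, so random sign flips give the reference draws.
    signs = rng.choice([-1.0, 1.0], size=(n_draws, centered.size))
    null_stats = np.abs((signs * centered).mean(axis=1))
    return (1 + np.sum(null_stats >= observed)) / (1 + n_draws)

# Synthetic data (assumption): 2000 units, age as the only covariate,
# and a larger effect for ages 85 and above.
rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(65, 95, size=n)
d = 0.02 + 0.20 * (age >= 85) + rng.normal(0.0, 0.3, size=n)

# Sample split: first half for discovery, second half for inference.
half = n // 2
candidates = np.array([70.0, 75.0, 80.0, 85.0, 90.0])
chosen = discover_subgroup(age[:half], d[:half], candidates)

# Test the discovered subgroup on the held-out half only.
age_inf, d_inf = age[half:], d[half:]
tau0 = d_inf.mean()
p_value = sign_flip_pvalue(d_inf[age_inf >= chosen], tau0, seed=1)
print(f"chosen threshold: {chosen}, p-value: {p_value:.4f}")
```

Because the subgroup is chosen on one half and tested on the other, the test is not contaminated by the search that produced the subgroup, which is the motivation for the sample split listed in the keywords.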