Treed distributed lag nonlinear models

Biostatistics. 2022 Jul 18;23(3):754-771. doi: 10.1093/biostatistics/kxaa051.

Abstract

In studies of maternal exposure to air pollution, a children's health outcome is regressed on exposures observed during pregnancy. The distributed lag nonlinear model (DLNM) is a statistical method commonly used to estimate an exposure-time-response function when the exposure effect is postulated to be nonlinear. Previous implementations of the DLNM estimate an exposure-time-response surface parameterized with a bivariate basis expansion. However, basis functions such as splines assume smoothness across the entire exposure-time-response surface, which may be unrealistic in settings where the exposure is associated with the outcome only in a specific time window. We propose a framework for estimating the DLNM based on Bayesian additive regression trees. Our method operates using a set of regression trees that each assume piecewise constant relationships across the exposure-time space. In a simulation, we show that our model outperforms spline-based models when the exposure-time surface is not smooth, while both methods perform similarly in settings where the true surface is smooth. Importantly, the proposed approach has lower variance and more precisely identifies critical windows during which exposure is associated with a future health outcome. We apply our method to estimate the association between maternal exposures to PM$_{2.5}$ and birth weight in a Colorado, USA birth cohort.
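To illustrate what a sum of piecewise constant trees over the exposure-time space can look like, the following is a minimal Python sketch and not the authors' implementation. It assumes a fixed set of two hand-built trees, whereas the proposed method infers the trees within a Bayesian framework. The names `Leaf`, `tree_effect`, `surface`, and `cumulative_effect`, the 37-week exposure history, and all split points and leaf values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence

INF = float("inf")


@dataclass
class Leaf:
    """One rectangular cell of the exposure-time space (hypothetical layout)."""
    x_lo: float   # inclusive lower bound on exposure concentration
    x_hi: float   # exclusive upper bound on exposure concentration
    t_lo: int     # first week covered (inclusive)
    t_hi: int     # last week covered (inclusive)
    value: float  # constant effect assigned to this cell


Tree = List[Leaf]


def tree_effect(tree: Tree, x: float, t: int) -> float:
    """Return the constant value of the leaf containing (x, t)."""
    for leaf in tree:
        if leaf.x_lo <= x < leaf.x_hi and leaf.t_lo <= t <= leaf.t_hi:
            return leaf.value
    return 0.0  # no leaf matched; the example trees below cover weeks 1-37


def surface(trees: Sequence[Tree], x: float, t: int) -> float:
    """Exposure-time-response surface: the sum of all tree contributions."""
    return sum(tree_effect(tree, x, t) for tree in trees)


def cumulative_effect(trees: Sequence[Tree], exposures: Sequence[float]) -> float:
    """Sum the surface over weekly exposures (week 1 is the first element)."""
    return sum(surface(trees, x, t) for t, x in enumerate(exposures, start=1))


# Two illustrative trees; every number here is made up for the example.
tree_a: Tree = [
    Leaf(-INF, INF, 1, 9, 0.0),
    Leaf(-INF, 10.0, 10, 20, 0.0),
    Leaf(10.0, INF, 10, 20, -5.0),   # effect only in weeks 10-20, exposure >= 10
    Leaf(-INF, INF, 21, 37, 0.0),
]
tree_b: Tree = [
    Leaf(-INF, INF, 1, 14, 0.0),
    Leaf(-INF, 12.0, 15, 25, 0.0),
    Leaf(12.0, INF, 15, 25, -3.0),   # effect only in weeks 15-25, exposure >= 12
    Leaf(-INF, INF, 26, 37, 0.0),
]
trees = [tree_a, tree_b]

constant_history = [11.0] * 37  # hypothetical weekly exposure values
total = cumulative_effect(trees, constant_history)
```

In this sketch the surface is exactly zero outside weeks 10 to 25, so the time window with a nonzero effect has sharp boundaries. That is the kind of non-smooth structure the abstract contrasts with spline-based surfaces.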

Keywords: Air pollution; Children’s health; Critical windows; Distributed lag; Regression trees.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, N.I.H., Extramural

MeSH terms

  • Air Pollutants* / analysis
  • Air Pollution*
  • Bayes Theorem
  • Child
  • Female
  • Humans
  • Maternal Exposure / adverse effects
  • Nonlinear Dynamics
  • Particulate Matter / analysis
  • Pregnancy

Substances

  • Air Pollutants
  • Particulate Matter