Covariate selection in causal learning under non-Gaussianity

Behav Res Methods. 2023 Sep 13. doi: 10.3758/s13428-023-02217-y. Online ahead of print.

Abstract

Understanding causal mechanisms is a central goal in the behavioral, developmental, and social sciences. When estimating and probing causal effects using observational data, covariate adjustment is crucial for removing dependencies between the focal predictors and the error term. Covariate selection, however, is a challenging task because availability alone is not an adequate criterion for deciding whether a covariate should be included in the statistical model. The present study introduces a non-Gaussian method for covariate selection and provides a forward selection algorithm for linear models (i.e., non-Gaussian forward selection; nGFS) that selects appropriate covariates from a set of potential control variables, thereby avoiding biased and inconsistent estimators of the causal effect of interest. Further, we demonstrate that the forward selection algorithm has properties compatible with principles of direction of dependence, i.e., it can probe whether the causal target model is correctly specified with respect to the causal direction of effects. Results of a Monte Carlo simulation study suggest that the selection algorithm performs well, particularly when sample sizes are large (i.e., n ≥ 250) and data strongly deviate from Gaussianity (e.g., distributions with skewness beyond 1.5). An empirical example is given for illustrative purposes.

Keywords: Causal inference; Collider; Confounder; Covariate selection; Forward selection.
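
Illustrative sketch: The abstract does not detail the nGFS procedure itself, so the following is only a minimal conceptual sketch of a non-Gaussian forward covariate selection loop, not the authors' implementation. It assumes a greedy loop that, at each step, adds the candidate covariate whose inclusion most reduces higher-order (non-Gaussian) dependence between the focal predictor and the regression residual; the specific dependence proxy (correlations involving squared terms), the function names, and the stopping rule are illustrative assumptions.

import numpy as np


def residuals(y, X):
    # OLS residuals of y regressed on X (with an intercept added)
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ beta


def nongaussian_dependence(x, e):
    # Higher-order dependence proxy (assumption for illustration):
    # |corr(x, e)| + |corr(x, e^2)| + |corr(x^2, e)|.
    # Under correct specification and exogeneity these should be near zero.
    def acor(a, b):
        return abs(np.corrcoef(a, b)[0, 1])
    return acor(x, e) + acor(x, e**2) + acor(x**2, e)


def forward_select(y, x, candidates, tol=1e-3):
    # Greedy forward selection from `candidates` (dict: name -> array).
    # Keeps adding the covariate that most reduces the dependence measure,
    # stopping once no candidate improves it by more than `tol`.
    selected = []
    Z = np.empty((len(y), 0))
    best = nongaussian_dependence(x, residuals(y, x.reshape(-1, 1)))
    improved = True
    while improved and candidates:
        improved = False
        for name, z in list(candidates.items()):
            Z_try = np.column_stack([Z, z])
            e = residuals(y, np.column_stack([x, Z_try]))
            score = nongaussian_dependence(x, e)
            if score < best - tol:
                best, best_name, best_Z = score, name, Z_try
                improved = True
        if improved:
            selected.append(best_name)
            Z = best_Z
            candidates.pop(best_name)
    return selected, best


# Hypothetical usage with simulated skewed (non-Gaussian) data:
rng = np.random.default_rng(1)
n = 500
confounder = rng.exponential(1.0, n)
x = 0.8 * confounder + rng.exponential(1.0, n)
y = 0.5 * x + 0.7 * confounder + rng.exponential(1.0, n)
irrelevant = rng.exponential(1.0, n)
picked, dep = forward_select(y, x, {"confounder": confounder, "irrelevant": irrelevant})
print(picked, round(dep, 3))

In this toy setup the true confounder should typically be selected while the irrelevant variable is not; the actual nGFS algorithm and its decision criteria are described in the full article.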