Causality in linear nongaussian acyclic models in the presence of latent gaussian confounders

Neural Comput. 2013 Jun;25(6):1605-41. doi: 10.1162/NECO_a_00444. Epub 2013 Mar 21.

Abstract

The linear nongaussian acyclic model (LiNGAM) has been successfully applied to some real-world causal discovery problems. It assumes causal sufficiency, however; that is, there is no latent confounder of the observations, which may be unrealistic for real-world problems. Taking latent confounders into consideration improves the reliability and accuracy of the estimated causal structures. In this letter, we investigate a model called the linear nongaussian acyclic model in the presence of latent gaussian confounders (LiNGAM-GC), which can be seen as a special case of lvLiNGAM. This model includes latent confounders, which are assumed to be independently gaussian distributed and statistically independent of the disturbances. To tackle causal discovery in this model, we first propose a pairwise cumulant-based measure of causal direction for cause-effect pairs. We prove that despite the presence of latent gaussian confounders, the causal direction of an observed cause-effect pair can be identified under the mild condition that the disturbances are simultaneously supergaussian or subgaussian, and we propose a simple and efficient method to detect violations of this condition. We then extend our work to multivariate causal network discovery. Specifically, we propose algorithms to estimate the causal network structure, including the causal ordering and causal strengths, using an iterative root-finding-and-removing scheme based on the pairwise measure. To address redundant edges caused by finite sample size, we develop an efficient bootstrapping-based pruning algorithm. Experiments on synthetic and real-world data show the applicability of our model and the effectiveness of the proposed algorithms.
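The key idea exploited by the abstract's pairwise measure is that gaussian random variables have vanishing higher-order cumulants, so a direction statistic built from fourth-order cross-cumulants is asymptotically insensitive to latent gaussian confounding. The sketch below illustrates that idea on a synthetic cause-effect pair; the particular contrast of `|cum(x,x,x,y)|` versus `|cum(x,y,y,y)|` on standardized data is an illustrative choice, not necessarily the exact statistic proposed in the paper.

```python
import numpy as np

def cum4(a, b, c, d):
    """Fourth-order joint cumulant of zero-mean samples a, b, c, d."""
    return (np.mean(a * b * c * d)
            - np.mean(a * b) * np.mean(c * d)
            - np.mean(a * c) * np.mean(b * d)
            - np.mean(a * d) * np.mean(b * c))

def pairwise_direction(x, y):
    """Guess 'x->y' or 'y->x' by comparing fourth-order cross-cumulants.

    A latent gaussian confounder contributes nothing to fourth-order
    cumulants, so (asymptotically) only the nongaussian disturbances
    drive these statistics -- the idea behind LiNGAM-GC.  Illustrative
    sketch only, not the paper's exact measure.
    """
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    c_xy = abs(cum4(x, x, x, y))  # relatively large when x is the cause
    c_yx = abs(cum4(x, y, y, y))  # relatively large when y is the cause
    return "x->y" if c_xy > c_yx else "y->x"

# Synthetic cause-effect pair with a latent gaussian confounder:
# supergaussian (Laplace) disturbances, true direction x -> y.
rng = np.random.default_rng(0)
n = 200_000
conf = rng.normal(size=n)      # latent gaussian confounder
e_x = rng.laplace(size=n)      # supergaussian disturbances
e_y = rng.laplace(size=n)
x = e_x + conf
y = 0.8 * x + e_y + conf

print(pairwise_direction(x, y))  # prints: x->y
```

Because the confounder enters both cumulant estimates only through terms that vanish for gaussian variables, the comparison stays consistent even though `x` and `y` are strongly confounded; an ordinary correlation-based test would not distinguish confounding from causation here.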

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Causality*
  • Data Interpretation, Statistical*
  • Humans
  • Linear Models*
  • Neural Networks, Computer
  • Normal Distribution*