An Efficient Entropy-Based Causal Discovery Method for Linear Structural Equation Models With IID Noise Variables

IEEE Trans Neural Netw Learn Syst. 2020 May;31(5):1667-1680. doi: 10.1109/TNNLS.2019.2921613. Epub 2019 Jul 3.

Abstract

The discovery of causal relationships from observational data is an important task. To identify a unique causal structure within a Markov equivalence class, a number of algorithms, such as the linear non-Gaussian acyclic model (LiNGAM), have been proposed. However, two challenges remain: 1) these algorithms fail on data that follow a linear structural equation model with Gaussian noise and 2) they misjudge the causal direction when the data contain additional measurement errors. In this paper, we propose an entropy-based two-phase iterative algorithm for data with arbitrary distributions and additional measurement errors under some mild assumptions. In the first phase of the algorithm, based on the property that entropy can measure the amount of information behind data with an arbitrary distribution, we design a general approach for identifying the exogenous variable on both Gaussian and non-Gaussian data, and we give the corresponding theoretical derivation. In the second phase, to eliminate the effects of measurement errors, we revise the value of the exogenous variable by removing its measurement error and then use the revised value to remove its effect on the remaining variables. Experimental results on real-world causal structures demonstrate the effectiveness and stability of our method. We also apply the proposed algorithm to mobile-base-station data with measurement errors, and the results further confirm its effectiveness.
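To make the first phase concrete, the following is a minimal illustrative sketch (not the authors' exact procedure) of entropy-based exogenous-variable identification for a linear SEM. It scores each candidate variable by the sum of its estimated differential entropy and the entropies of the residuals obtained by regressing every other variable on it; because the candidate-plus-residuals transform has unit Jacobian determinant, this sum is minimized when the candidate is the true exogenous variable and the residuals are independent noise. The Vasicek m-spacing entropy estimator and the helper names (`vasicek_entropy`, `pick_exogenous`) are assumptions introduced here for illustration.

```python
import numpy as np

def vasicek_entropy(x, m=None):
    """Vasicek m-spacing estimator of differential entropy (in nats).

    Assumption for illustration: any consistent nonparametric entropy
    estimator could be substituted here.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    if m is None:
        m = max(1, int(np.sqrt(n)))
    # m-spacings, with boundary indices clipped to the sample range
    upper = x[np.minimum(np.arange(n) + m, n - 1)]
    lower = x[np.maximum(np.arange(n) - m, 0)]
    spacings = np.maximum(upper - lower, 1e-12)  # guard against zero spacings
    return float(np.mean(np.log(n / (2.0 * m) * spacings)))

def pick_exogenous(X):
    """Return the column index j minimizing
    H(x_j) + sum_{i != j} H(residual of x_i regressed on x_j)."""
    n, d = X.shape
    scores = []
    for j in range(d):
        xj = X[:, j]
        total = vasicek_entropy(xj)
        for i in range(d):
            if i == j:
                continue
            # ordinary least-squares slope of x_i on x_j
            beta = np.cov(X[:, i], xj)[0, 1] / np.var(xj)
            total += vasicek_entropy(X[:, i] - beta * xj)
        scores.append(total)
    return int(np.argmin(scores))
```

For non-Gaussian noise this criterion recovers the exogenous variable; for example, with `x1` uniform and `x2 = 0.8 * x1 + noise`, `pick_exogenous` selects column 0. The paper's second phase, which corrects the exogenous variable for measurement error before regressing it out, is not sketched here.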