Nonlinear Causal Discovery for High-Dimensional Deterministic Data

IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2234-2245. doi: 10.1109/TNNLS.2021.3106111. Epub 2023 May 2.

Abstract

Nonlinear causal discovery with high-dimensional data, where each variable is multidimensional, plays a significant role in many scientific disciplines, such as social network analysis. Previous work mainly focuses on exploiting the asymmetry between the causal and anticausal directions for two high-dimensional variables (a cause-effect pair). Although some works address causal order identification among multiple variables, i.e., more than two high-dimensional variables, they do not validate the consistency of their methods through theoretical analysis on multiple-variable data. In particular, for methods based on the cause-effect pair asymmetry, if the model assumptions are violated for any pair in the data, the asymmetry condition will not hold, leading to incorrect order identification. Thus, in this article, we propose a causal functional model, namely the high-dimensional deterministic model (HDDM), to identify the causal orderings among multiple high-dimensional variables. We derive two candidate selection rules to alleviate the adverse effects resulting from pairs that violate the model assumptions, and we provide the corresponding theoretical justification. Building on these theoretical results, we develop a method to infer causal orderings for nonlinear multiple-variable data. Experiments on synthetic and real-world data are conducted to verify the efficacy of the proposed method. Since the method focuses on deterministic relations, we also verify its robustness to noise in simulations.