The causality measure of partial mutual information from mixed embedding (PMIME) revisited

Chaos. 2024 Mar 1;34(3):033113. doi: 10.1063/5.0189056.

Abstract

The measure of partial mutual information from mixed embedding (PMIME) is an information theory-based measure to accurately identify the direct and directional coupling, termed Granger causality or simply causality, between the observed variables or subsystems of a high-dimensional, complex dynamical system, without any a priori assumptions about the nature of the coupling relationship. At its core, it is a forward selection procedure that iteratively identifies the lag-dependence structure of a given observed variable (response) on all the other observed variables (candidate drivers). This model-free approach is capable of detecting nonlinear interactions, abundantly present in real-world complex systems, and it was shown to perform well on multivariate time series of moderately high dimension. However, the PMIME performs less reliably when applied to strongly stochastic (linear or nonlinear) systems, as it may falsely detect non-existent relationships. Moreover, by construction, the measure cannot extract purely synergetic relationships present in a system. In the current work, the issue of false detections is addressed by introducing an improved resampling significance test and a procedure of rechecking the identified drivers (backward revision). Regarding the inability to detect synergetic relationships, the PMIME is further enhanced by checking pairs as candidate drivers for the response variable after having considered all drivers individually. The effects of these modifications are investigated in a systematic simulation study on properly designed systems involving strong stochasticity, regressor terms with synergetic effects, and a system dimension ranging from 3 to 30. The overall results of the simulations indicate that these modifications indeed improve the performance of PMIME and alleviate to a significant degree the issues of the original algorithm. Guidelines for balancing accuracy and computational efficiency are also given, particularly relevant for real-world applications. Finally, the performance of the measure is investigated in the study of futures of various government bonds and stock market indices in the period around the COVID-19 pandemic.
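
The forward selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`gaussian_cmi`, `forward_select`), the parameters (`max_lag`, `n_perm`, `alpha`), and the toy system are all hypothetical. Two simplifications are swapped in and should be read as assumptions: the conditional mutual information is estimated under a Gaussian assumption from the partial correlation (so, unlike a model-free estimator, it only captures linear dependence), and the stopping rule is a plain permutation test rather than the improved resampling test of the paper. The backward revision and the pairwise check for synergetic drivers are not included.

```python
import numpy as np

def gaussian_cmi(y, w, Z):
    """Gaussian (linear) stand-in for I(y; w | Z), via the partial correlation."""
    D = np.column_stack([np.ones(len(y)), Z])           # intercept + conditioning terms
    ry = y - D @ np.linalg.lstsq(D, y, rcond=None)[0]   # residual of y given Z
    rw = w - D @ np.linalg.lstsq(D, w, rcond=None)[0]   # residual of w given Z
    rho2 = (ry @ rw) ** 2 / ((ry @ ry) * (rw @ rw))
    return -0.5 * np.log(1.0 - min(rho2, 1.0 - 1e-12))

def forward_select(data, response, max_lag=3, n_perm=100, alpha=0.05, seed=0):
    """Greedily pick lagged terms (variable, lag) that inform the response."""
    rng = np.random.default_rng(seed)
    n, k = data.shape
    y = data[max_lag:, response]
    # Candidate lagged terms of every variable, aligned in time with y.
    cands = {(v, lag): data[max_lag - lag:n - lag, v]
             for v in range(k) for lag in range(1, max_lag + 1)}
    selected, Z = [], np.empty((len(y), 0))
    while cands:
        scores = {key: gaussian_cmi(y, w, Z) for key, w in cands.items()}
        best = max(scores, key=scores.get)
        w = cands.pop(best)
        # Simple permutation test (an assumption, not the paper's test):
        # shuffling w destroys any dependence on y.
        null = [gaussian_cmi(y, rng.permutation(w), Z) for _ in range(n_perm)]
        p = (1 + sum(s >= scores[best] for s in null)) / (n_perm + 1)
        if p > alpha:
            break                                       # best candidate not significant
        selected.append(best)
        Z = np.column_stack([Z, w])                     # condition on it from now on
    return selected

# Toy system (hypothetical): variable 0 drives variable 1 at lag 1; variable 2 is noise.
rng = np.random.default_rng(42)
n = 1000
data = np.zeros((n, 3))
noise = rng.standard_normal((n, 3))
for t in range(1, n):
    data[t, 0] = 0.5 * data[t - 1, 0] + noise[t, 0]
    data[t, 1] = 0.9 * data[t - 1, 0] + 0.3 * noise[t, 1]
    data[t, 2] = noise[t, 2]

selected = forward_select(data, response=1)
print(selected)
```

In this toy system the term `(0, 1)`, variable 0 at lag 1, is by far the most informative candidate for variable 1 and should be selected first. Because the best of several candidates is tested against a single-candidate null, this simple test can also admit spurious terms in later steps, which is the kind of false detection the abstract's modifications aim to reduce.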