Behind Distribution Shift: Mining Driving Forces of Changes and Causal Arrows

Proc IEEE Int Conf Data Min. 2017 Nov:2017:913-918. doi: 10.1109/ICDM.2017.114. Epub 2017 Dec 18.

Abstract

We address two important issues in causal discovery from nonstationary or heterogeneous data, where parameters associated with a causal structure may change over time or across data sets. First, we investigate how to efficiently estimate the "driving force" of the nonstationarity of a causal mechanism. That is, given a causal mechanism that varies over time or across data sets and whose qualitative structure is known, we aim to extract from data a low-dimensional and interpretable representation of the main components of the changes. For this purpose we develop a novel kernel embedding of nonstationary conditional distributions that does not rely on sliding windows. Second, the embedding also leads to a measure of dependence between the changes of causal modules that can be used to determine the directions of many causal arrows. We demonstrate the power of our methods with experiments on both synthetic and real data.