Constraint-based causal discovery with mixed data

Int J Data Sci Anal. 2018;6(1):19-30. doi: 10.1007/s41060-018-0097-y. Epub 2018 Feb 2.

Abstract

We address the problem of constraint-based causal discovery with mixed data types, such as (but not limited to) continuous, binary, multinomial, and ordinal variables. We use likelihood-ratio tests based on appropriate regression models and show how to derive symmetric conditional independence tests. Such tests can then be directly used by existing constraint-based methods with mixed data, such as the PC and FCI algorithms for learning Bayesian networks and maximal ancestral graphs, respectively. In experiments on simulated Bayesian networks, we employ the PC algorithm with different conditional independence tests for mixed data and show that the proposed approach outperforms alternatives in terms of learning accuracy.
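The two ingredients named in the abstract, a likelihood-ratio test between nested regression models and a symmetric combination of the two directional tests, can be sketched as follows. This is a minimal illustration rather than the authors' implementation. The function names are hypothetical. The log-likelihoods are assumed to come from regression models suited to the type of the response variable (for example linear, logistic, multinomial, or ordinal), fitted with and without the tested predictor. The combination rule min(2·min(p1, p2), max(p1, p2)) is an assumption chosen here as one conservative way to merge two dependent p-values; the abstract does not state which rule the paper uses.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{k=0}^{df/2 - 1} h^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= h / k
            total += term
        return math.exp(-h) * total
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{k=1}^{(df-1)/2} h^(k - 1/2) / Gamma(k + 1/2)
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * term
        term *= h / (k + 0.5)
    return total


def lrt_pvalue(loglik_null, loglik_full, df):
    """Likelihood-ratio test p-value for nested models.

    loglik_null: log-likelihood of the model without the tested predictor
    loglik_full: log-likelihood of the model with the tested predictor
    df:          number of extra parameters in the full model
    """
    stat = max(2.0 * (loglik_full - loglik_null), 0.0)
    return chi2_sf(stat, df)


def symmetric_pvalue(p_xy, p_yx):
    """Combine the p-values of the two directional tests into one symmetric value.

    p_xy: p-value from regressing X on (Y, Z) versus X on Z
    p_yx: p-value from regressing Y on (X, Z) versus Y on Z
    The rule used here is an assumption, not taken from the abstract.
    """
    return min(2.0 * min(p_xy, p_yx), max(p_xy, p_yx))


# Hypothetical log-likelihoods for testing X independent of Y given Z,
# with X continuous (linear model) and Y binary (logistic model).
p_xy = lrt_pvalue(loglik_null=-150.2, loglik_full=-146.9, df=1)
p_yx = lrt_pvalue(loglik_null=-68.4, loglik_full=-65.0, df=1)
p_sym = symmetric_pvalue(p_xy, p_yx)
```

Because the combined p-value does not depend on which variable is treated as the response, an algorithm such as PC or FCI gets the same answer for the pair (X, Y) as for (Y, X).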

Keywords: Bayesian networks; Conditional independence tests; Constraint-based learning; Maximal ancestral graphs; Mixed data.