Approximate Learning of High Dimensional Bayesian Network Structures via Pruning of Candidate Parent Sets

Zhigao Guo; Anthony C Constantinou

doi:10.3390/e22101142

Entropy (Basel). 2020 Oct 10;22(10):1142. doi: 10.3390/e22101142.

Authors

Zhigao Guo¹, Anthony C Constantinou¹,²

Affiliations

¹ Bayesian Artificial Intelligence Research Lab, School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK.
² The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK.

Abstract

Score-based algorithms that learn Bayesian Network (BN) structures provide solutions ranging from various levels of approximate learning to exact learning. Approximate solutions exist because exact learning is generally not applicable to networks of moderate or higher complexity. In general, approximate solutions sacrifice accuracy for speed, with the aim of minimising the loss in accuracy while maximising the gain in speed. While some approximate algorithms are optimised to handle thousands of variables, these algorithms may still be unable to learn such high dimensional structures. Some of the most efficient score-based algorithms cast the structure learning problem as a combinatorial optimisation over candidate parent sets. This paper explores a strategy for pruning the number of candidate parent sets, which could form part of existing score-based algorithms as an additional pruning phase aimed at high dimensionality problems. The results illustrate how different levels of pruning affect the learning speed relative to the loss in accuracy in terms of model fitting, and show that aggressive pruning may be required to produce approximate solutions for high complexity problems.
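
To make the idea of pruning candidate parent sets concrete, the following is a minimal Python sketch. It is not the strategy evaluated in the paper, which the abstract does not specify; it assumes a simple top-k rule in which each node keeps only its `keep` highest-scoring candidate parent sets. The function name `prune_candidate_parent_sets`, the `keep` parameter, and the toy scores are hypothetical and chosen for illustration only.

```python
from typing import Dict, FrozenSet, List, Tuple

ParentSet = FrozenSet[str]
ScoredSets = Dict[str, List[Tuple[ParentSet, float]]]


def prune_candidate_parent_sets(scored: ScoredSets, keep: int) -> ScoredSets:
    """Keep the `keep` highest-scoring candidate parent sets per node.

    Assumption (not from the paper): a plain top-k rule on local scores,
    where a higher score is better. The empty parent set is always
    retained so that an acyclic graph remains feasible after pruning.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    pruned: ScoredSets = {}
    for node, candidates in scored.items():
        # Rank candidates from best to worst local score.
        ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
        kept = ranked[:keep]
        # If the empty set was cut, swap it in for the lowest-ranked survivor.
        empty = [c for c in ranked if not c[0]]
        if empty and empty[0] not in kept:
            kept[-1] = empty[0]
        pruned[node] = kept
    return pruned


# Toy local scores (hypothetical values, e.g. log-likelihood-based).
scores: ScoredSets = {
    "A": [
        (frozenset(), -10.0),
        (frozenset({"B"}), -8.0),
        (frozenset({"C"}), -9.0),
        (frozenset({"B", "C"}), -7.5),
    ],
    "B": [
        (frozenset(), -12.0),
        (frozenset({"A"}), -11.0),
    ],
}

pruned = prune_candidate_parent_sets(scores, keep=2)
for node, sets in pruned.items():
    print(node, [(sorted(ps), s) for ps, s in sets])
```

A smaller `keep` shrinks the search space handed to the downstream optimiser, at the risk of discarding parent sets that belong to the best-scoring graph. This is the speed versus accuracy trade-off the abstract describes.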

Keywords: probabilistic graphical models; pruning; structure learning.

Grants and funding

EP/S001646/1/Engineering and Physical Sciences Research Council