A Confident Information First Principle for Parameter Reduction and Model Selection of Boltzmann Machines

IEEE Trans Neural Netw Learn Syst. 2018 May;29(5):1608-1621. doi: 10.1109/TNNLS.2017.2664100. Epub 2017 Mar 16.

Abstract

Typical dimensionality reduction (DR) methods are data-oriented, focusing on directly reducing the number of random variables (or features) while retaining the maximal variation in the high-dimensional data. Targeting unsupervised settings, this paper addresses the problem from a novel perspective and considers model-oriented DR in the parameter spaces of binary multivariate distributions. Specifically, we propose a general parameter reduction criterion, called the confident-information-first (CIF) principle, to maximally preserve confident parameters and rule out less confident ones. Formally, the confidence of each parameter is assessed by its contribution to the expected Fisher information distance within a geometric manifold over the neighborhood of the underlying real distribution. We then demonstrate two implementations of CIF in different scenarios. First, when there are no observed samples, we revisit Boltzmann machines (BMs) from a model selection perspective and theoretically show that both the fully visible BM and the BM with hidden units can be derived from the general binary multivariate distribution using the CIF principle. This finding helps uncover and formalize the essential parts of the target density that a BM aims to capture and the nonessential parts that a BM should discard. Second, when observed samples exist, we apply CIF to model selection for BMs, which is in turn made adaptive to the observed samples. The sample-specific CIF is a heuristic for deciding the priority order of parameters; as a series of density estimation experiments shows, it improves search efficiency without degrading the quality of the model selection results.
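To make the confidence ranking concrete, the following is a minimal illustrative sketch, not the paper's implementation. It uses the standard exponential-family fact that the Fisher information of a natural parameter equals the variance of its sufficient statistic under the model, and ranks the parameters of a small fully visible BM by that diagonal Fisher information as a simple proxy for each parameter's contribution to the Fisher information distance. The function name and parameterization (couplings `J`, biases `h`) are assumptions chosen for the example.

```python
import itertools
import numpy as np

def fisher_diag_ranking(J, h):
    """Rank the natural parameters of a small fully visible Boltzmann
    machine by the diagonal of the Fisher information matrix.

    Illustrative proxy only: for exponential families, the Fisher
    information of a natural parameter is the variance of its
    sufficient statistic, which we use here as a simple "confidence"
    score in the spirit of the CIF principle.

    J : (n, n) coupling matrix (only the strict upper triangle is used)
    h : (n,) bias vector
    """
    n = len(h)
    # Enumerate all 2^n binary states (feasible only for small n).
    states = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
    # Boltzmann distribution p(x) ∝ exp(h·x + Σ_{i<j} J_ij x_i x_j).
    logits = states @ h + np.einsum("si,ij,sj->s", states, np.triu(J, 1), states)
    p = np.exp(logits - logits.max())
    p /= p.sum()

    # Sufficient statistics: x_i for each bias, x_i * x_j for each coupling.
    stats, names = [], []
    for i in range(n):
        stats.append(states[:, i])
        names.append(f"h[{i}]")
    for i in range(n):
        for j in range(i + 1, n):
            stats.append(states[:, i] * states[:, j])
            names.append(f"J[{i},{j}]")

    # Confidence score = variance of the sufficient statistic under p.
    conf = []
    for s in stats:
        mean = p @ s
        conf.append(float(p @ (s - mean) ** 2))
    order = np.argsort(conf)[::-1]
    return [(names[k], conf[k]) for k in order]
```

Under this sketch, parameter reduction would keep the top-ranked entries and drop the tail; the paper's actual criterion is defined over a neighborhood of the underlying distribution rather than a single point estimate.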

Publication types

  • Research Support, Non-U.S. Gov't