Masked Autoencoder for Distribution Estimation on Small Structured Data Sets

IEEE Trans Neural Netw Learn Syst. 2021 Nov;32(11):4997-5007. doi: 10.1109/TNNLS.2020.3026572. Epub 2021 Oct 27.

Abstract

Autoregressive models are among the most successful neural network methods for estimating a distribution from a set of samples. However, like other neural methods, these models need large data sets to provide good estimates. We believe that knowing structural information about the data can improve their performance on small data sets. The masked autoencoder for distribution estimation (MADE) is a well-structured density estimator that alters a simple autoencoder by applying a set of masks to its connections so that the autoregressive condition is satisfied. Nevertheless, this model does not benefit from extra information that we might know about the structure of the data, information that can be especially advantageous when training on small data sets. In this article, we propose two autoencoders for estimating the density of a small set of observations, where the data have a known Markov random field (MRF) structure. These methods modify the masking process of MADE according to conditional dependencies inferred from the MRF structure, to reduce either the model complexity or the problem complexity. We compare the proposed methods with related binary, discrete, and continuous density estimators on MNIST, binarized MNIST, OCR-letters, and two synthetic data sets.
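
To make the masking idea concrete, the following is a minimal sketch of how masks for a baseline single-hidden-layer MADE can be built so that output d depends only on inputs that come before d in the ordering. It assumes the natural input ordering and randomly sampled hidden-unit degrees, as in the standard MADE construction. The function names (made_masks, connectivity) and parameters are hypothetical and chosen for illustration. The MRF-informed masking proposed in this article is not specified in the abstract and is not reproduced here.

```python
import numpy as np


def made_masks(n_inputs, n_hidden, seed=0):
    """Build masks for a single-hidden-layer MADE (illustrative sketch).

    Assumes n_inputs >= 2 and the natural ordering x_1, ..., x_D.
    Returns mask_w of shape (n_hidden, n_inputs) for input-to-hidden
    weights and mask_v of shape (n_inputs, n_hidden) for hidden-to-output
    weights.
    """
    rng = np.random.default_rng(seed)
    # Degree of each input/output unit: 1..D.
    m_in = np.arange(1, n_inputs + 1)
    # Degree of each hidden unit, sampled from 1..D-1 (upper bound exclusive).
    m_hid = rng.integers(1, n_inputs, size=n_hidden)
    # Hidden unit k may see input d only if m_hid[k] >= m_in[d].
    mask_w = (m_hid[:, None] >= m_in[None, :]).astype(np.float32)
    # Output d may see hidden unit k only if m_in[d] > m_hid[k].
    mask_v = (m_in[:, None] > m_hid[None, :]).astype(np.float32)
    return mask_w, mask_v


def connectivity(mask_w, mask_v):
    """Entry [i, j] is True if output i has any path from input j."""
    return (mask_v @ mask_w) > 0


mask_w, mask_v = made_masks(n_inputs=6, n_hidden=32)
conn = connectivity(mask_w, mask_v)
# The autoregressive condition holds when conn is strictly lower triangular:
# no output depends on its own input or on any later input.
print(conn.astype(int))
```

In a network, these masks would be multiplied elementwise with the weight matrices before each forward pass. A structure-aware variant would presumably zero out further entries, but which entries and how is a design decision of the proposed methods that this sketch does not attempt to guess.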