Short-sighted deep learning

Phys Rev E. 2020 Jul;102(1-1):013307. doi: 10.1103/PhysRevE.102.013307.

Abstract

A theory explaining how deep learning works is yet to be developed. Previous work suggests that deep learning performs a coarse graining, similar in spirit to the renormalization group (RG). This idea has been explored in the setting of a local (nearest-neighbor interactions) Ising spin lattice. We extend the discussion to the setting of a long-range spin lattice. Markov-chain Monte Carlo (MCMC) simulations determine both the critical temperature and the scaling dimensions of the system. The model is used to train both a single restricted Boltzmann machine (RBM) and a stacked RBM network. Following earlier Ising model studies, the trained weights of a single-layer RBM network define a flow of lattice models. In contrast to results for the nearest-neighbor Ising model, the RBM flow for the long-range model does not converge to the correct values of the spin and energy scaling dimensions. Further, correlation functions between visible and hidden nodes exhibit key differences between the stacked RBM and RG flows. The stacked RBM flow appears to move toward low temperature, whereas the RG flow moves toward high temperature. This again differs from results obtained for the nearest-neighbor Ising model.
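To make the MCMC setup concrete, the sampling described above can be sketched with a Metropolis update on a one-dimensional spin chain with power-law couplings. This is an illustrative sketch only: the lattice size, temperature, and the coupling form J(r) = r^(-(1+sigma)) with sigma = 0.5 are assumed here for demonstration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy parameters (not from the paper): 1D chain, power-law couplings.
N, sigma, T = 64, 0.5, 2.0

# Coupling matrix J_ij = |i - j|^{-(1+sigma)}, zero self-coupling.
idx = np.arange(N)
dist = np.abs(idx[:, None] - idx[None, :]).astype(float)
np.fill_diagonal(dist, np.inf)   # infinite distance => J_ii = 0
J = dist ** (-(1.0 + sigma))

spins = rng.choice([-1, 1], size=N)

def metropolis_sweep(spins, J, T, rng):
    """One Metropolis sweep: propose N single-spin flips."""
    for _ in range(len(spins)):
        i = rng.integers(len(spins))
        # Energy change of flipping spin i: dE = 2 s_i * sum_j J_ij s_j
        dE = 2.0 * spins[i] * (J[i] @ spins)
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            spins[i] *= -1
    return spins

for _ in range(200):
    spins = metropolis_sweep(spins, J, T, rng)

magnetization = abs(spins.mean())
```

In an actual study, observables such as the magnetization and correlation functions measured over many such sweeps (with equilibration and autocorrelation handled) would be used to locate the critical temperature and extract scaling dimensions.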
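The RBM training step can likewise be sketched with one-step contrastive divergence (CD-1) on binary data, the standard way such networks are fit to spin configurations (mapped from {-1, 1} to {0, 1}). Layer sizes, learning rate, and the random toy data below are assumptions for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

n_visible, n_hidden = 16, 8          # assumed toy sizes, not from the paper
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b_v = np.zeros(n_visible)            # visible biases
b_h = np.zeros(n_hidden)             # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_v, b_h, lr=0.05):
    """One contrastive-divergence (CD-1) update on a batch of visible samples."""
    # Up pass: hidden probabilities and a binary sample
    ph0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Down pass: reconstruct visible units, then hidden probabilities again
    pv1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b_h)
    # Gradient estimate: <v h> under the data minus under the reconstruction
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b_v += lr * (v0 - v1).mean(axis=0)
    b_h += lr * (ph0 - ph1).mean(axis=0)
    return W, b_v, b_h

# Toy binary "spin configurations" standing in for MCMC samples.
data = (rng.random((200, n_visible)) < 0.5).astype(float)
for _ in range(20):
    W, b_v, b_h = cd1_step(data, W, b_v, b_h)
```

In the setting the abstract describes, the trained weight matrix W is then read as defining a map between lattice models, and iterating it (or stacking RBMs layer on layer) produces the "RBM flow" whose fixed points are compared against the RG flow.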