Learning efficiency of redundant neural networks in Bayesian estimation

IEEE Trans Neural Netw. 2001;12(6):1475-86. doi: 10.1109/72.963783.

Abstract

This paper proves that the Bayesian stochastic complexity of a layered neural network is asymptotically smaller than that of a regular statistical model when the network contains the true distribution. We consider the case in which a three-layer perceptron with M input units, H hidden units, and N output units is trained to estimate a true distribution represented by a model with H_0 hidden units, and we prove that the stochastic complexity is asymptotically smaller than (1/2){H_0(M+N) + R} log n, where n is the number of training samples and R is a function of H - H_0, M, and N that is far smaller than the number of redundant parameters. Since the generalization error of Bayesian estimation is equal to the increase of the stochastic complexity, it is smaller than (1/2n){H_0(M+N) + R} if it has an asymptotic expansion. Based on these results, the difference between layered neural networks and regular statistical models is discussed from a statistical point of view.
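
The bounds stated in the abstract can be restated in display form as a sketch; the symbols F(n) for the Bayesian stochastic complexity and G(n) for the generalization error, and the order terms, are notational assumptions not fixed by the abstract itself:

  F(n) \le \frac{1}{2}\,\bigl\{ H_0 (M+N) + R(H - H_0, M, N) \bigr\} \log n + O(1),

  G(n) = F(n+1) - F(n) \le \frac{1}{2n}\,\bigl\{ H_0 (M+N) + R(H - H_0, M, N) \bigr\} + o\!\left(\tfrac{1}{n}\right),

where the bound on G(n) holds under the abstract's assumption that the generalization error admits an asymptotic expansion. For comparison, a regular statistical model with d parameters would give a leading term (d/2) log n; if the perceptron's parameter count is taken as H(M+N), i.e., (M+N) weights per hidden unit as the coefficient H_0(M+N) suggests, the redundant H - H_0 hidden units contribute only through R, which is far smaller than (H - H_0)(M+N).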