New role for circuit expansion for learning in neural networks

Phys Rev E. 2021 Feb;103(2-1):022404. doi: 10.1103/PhysRevE.103.022404.

Abstract

Many sensory pathways in the brain include sparsely active populations of neurons downstream from the input stimuli. The biological purpose of this expanded structure is unclear, but it may be beneficial due to the increased expressive power of the network. In this work, we show that certain ways of expanding a neural network can improve its generalization performance even when the expanded structure is pruned after the learning period. To study this setting, we use a teacher-student framework in which a perceptron teacher network generates labels corrupted with small amounts of noise. We then train a student network that is structurally matched to the teacher and that can achieve optimal accuracy if given the teacher's synaptic weights. We find that sparse expansion of the input layer of a student perceptron network both increases its capacity and improves its generalization performance when learning a noisy rule from a teacher perceptron, even when the expansion is pruned after learning. We find similar behavior when the expanded units are stochastic and uncorrelated with the input, and we analyze this network in the mean-field limit. By solving the mean-field equations, we show that the generalization error of the stochastic expanded student network continues to drop as its size increases. This improvement in generalization performance occurs despite the increased complexity of the student network relative to the teacher it is trying to learn. We show that this effect is closely related to the addition of slack variables in artificial neural networks and suggest possible implications for artificial and biological neural networks.
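
Although the paper's analysis is analytic (mean-field), a minimal numerical sketch may help make the teacher-student setup concrete. The sketch below uses illustrative choices that are not taken from the paper: the dimensions N and M, a 5% label-flip noise, a sparse sign-nonlinear expansion of the student's input, and the classic perceptron learning rule; the paper's stochastic-expansion case and its mean-field solution are not reproduced here.

```python
# Hypothetical sketch of a teacher-student perceptron with a sparse, prunable
# input expansion. All sizes, the noise level, the expansion nonlinearity, and
# the training rule are illustrative assumptions, not the paper's protocol.
import numpy as np

rng = np.random.default_rng(0)

N = 100            # teacher / core student input dimension
M = 400            # number of sparsely connected expansion units
P_train = 300      # training examples
P_test = 2000      # test examples
label_noise = 0.05 # probability of flipping a teacher label
sparsity = 0.05    # fraction of inputs feeding each expansion unit

# Teacher perceptron: labels are sign(w_T . x), with a small fraction flipped.
w_teacher = rng.standard_normal(N)

def teacher_labels(X, noisy=True):
    y = np.sign(X @ w_teacher)
    if noisy:
        flips = rng.random(len(y)) < label_noise
        y[flips] *= -1
    return y

# Sparse random expansion: each expansion unit sees only a few inputs.
mask = rng.random((M, N)) < sparsity
W_exp = rng.standard_normal((M, N)) * mask

def expand(X):
    # Student sees the original inputs plus the expanded (nonlinear) units.
    H = np.sign(X @ W_exp.T)
    return np.hstack([X, H])

X_train = rng.standard_normal((P_train, N))
y_train = teacher_labels(X_train, noisy=True)
X_test = rng.standard_normal((P_test, N))
y_test = teacher_labels(X_test, noisy=False)  # generalization w.r.t. the clean rule

# Train the expanded student with the classic perceptron rule.
Z_train = expand(X_train)
w_student = np.zeros(Z_train.shape[1])
for _ in range(200):
    for z, y in zip(Z_train, y_train):
        if y * (w_student @ z) <= 0:
            w_student += y * z

# Prune the expansion after learning: keep only the weights on the core inputs.
w_pruned = w_student[:N]

err_expanded = np.mean(np.sign(expand(X_test) @ w_student) != y_test)
err_pruned = np.mean(np.sign(X_test @ w_pruned) != y_test)
print(f"test error with expansion kept: {err_expanded:.3f}")
print(f"test error after pruning:       {err_pruned:.3f}")
```

With these toy settings, comparing the pruned student's test error against a student trained without any expansion (set M = 0) gives a rough numerical analogue of the effect described above; the quantitative behavior in the paper follows from the mean-field analysis rather than from simulations of this kind.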

MeSH terms

  • Learning*
  • Models, Neurological*
  • Nerve Net / physiology*