Two-layer contractive encodings for learning stable nonlinear features

Hannes Schulz; Kyunghyun Cho; Tapani Raiko; Sven Behnke

doi:10.1016/j.neunet.2014.09.008

Neural Netw. 2015 Apr;64:4-11. doi: 10.1016/j.neunet.2014.09.008. Epub 2014 Sep 28.

Authors

Hannes Schulz¹, Kyunghyun Cho², Tapani Raiko², Sven Behnke³

Affiliations

¹ Autonomous Intelligent Systems, Computer Science Institute VI, University of Bonn, Germany. Electronic address: schulz@ais.uni-bonn.de.
² Department of Information and Computer Science, Aalto University School of Science, Finland.
³ Autonomous Intelligent Systems, Computer Science Institute VI, University of Bonn, Germany.

PMID: 25292461
DOI: 10.1016/j.neunet.2014.09.008

Abstract

Unsupervised learning of feature hierarchies is often a good strategy to initialize deep architectures for supervised learning. Most existing deep learning methods build these feature hierarchies layer by layer in a greedy fashion using either auto-encoders or restricted Boltzmann machines. Both yield encoders which compute a linear projection of the input followed by a smooth thresholding function. In this work, we demonstrate that these encoders fail to find stable features when the required computation is in the exclusive-or class. To overcome this limitation, we propose a two-layer encoder which is less restricted in the type of features it can learn. The proposed encoder is regularized by an extension of previous work on contractive regularization. This two-layer contractive encoder potentially poses a more difficult optimization problem, and we further propose to linearly transform the hidden neurons of the encoder to make learning easier. We demonstrate the advantages of the two-layer encoders qualitatively on artificially constructed datasets as well as on commonly used benchmark datasets. We also conduct experiments on a semi-supervised learning task and show the benefits of the proposed two-layer encoders trained with the linear transformation of perceptrons.
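The two ideas in the abstract, an encoder with one extra hidden layer and a contractive penalty on it, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes sigmoid units and uses the common form of the contractive penalty, the squared Frobenius norm of the encoder's Jacobian with respect to the input, extended to two layers by the chain rule; the regularizer actually used in the paper may differ in detail. The function names and the hand-picked exclusive-or weights (including the gain `k`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    """Smooth thresholding function applied after the linear projection."""
    return 1.0 / (1.0 + np.exp(-z))


def one_layer_jacobian(x, W, b):
    """Jacobian dh/dx of the one-layer encoder h = sigmoid(W x + b)."""
    h = sigmoid(W @ x + b)
    # d sigmoid(z)/dz = h * (1 - h), applied row-wise to W.
    return (h * (1.0 - h))[:, None] * W


def two_layer_jacobian(x, W1, b1, W2, b2):
    """Jacobian dh2/dx of h2 = sigmoid(W2 sigmoid(W1 x + b1) + b2)."""
    h1 = sigmoid(W1 @ x + b1)
    h2 = sigmoid(W2 @ h1 + b2)
    J1 = (h1 * (1.0 - h1))[:, None] * W1  # dh1/dx
    J2 = (h2 * (1.0 - h2))[:, None] * W2  # dh2/dh1
    return J2 @ J1                        # chain rule


def contractive_penalty(J):
    """Squared Frobenius norm of a Jacobian (assumed form of the penalty)."""
    return float(np.sum(J ** 2))


def xor_two_layer(x, k=20.0):
    """Hand-set two-layer encoder whose single output approximates XOR.

    The weights are illustrative assumptions, not learned values:
    the hidden units approximate OR and AND, the output OR minus AND.
    """
    W1 = k * np.array([[1.0, 1.0], [1.0, 1.0]])
    b1 = k * np.array([-0.5, -1.5])
    W2 = k * np.array([[1.0, -1.0]])
    b2 = k * np.array([-0.5])
    return sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)


if __name__ == "__main__":
    for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
        print(x, xor_two_layer(np.array(x, dtype=float)))
```

A single unit of the form sigmoid(w·x + b) cannot reproduce this input-output mapping, because exclusive-or is not linearly separable; the extra hidden layer is what makes the feature representable. In training, the penalty would be added, with some weight, to the reconstruction loss.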

Keywords: Deep learning; Linear transformation; Multi-layer perceptron; Pretraining; Semi-supervised learning; Two-layer contractive encoding.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Artificial Intelligence*
Neural Networks, Computer