STDNet: Rethinking Disentanglement Learning With Information Theory

IEEE Trans Neural Netw Learn Syst. 2023 Feb 9:PP. doi: 10.1109/TNNLS.2023.3241791. Online ahead of print.

Abstract

Disentangled representation learning is typically achieved with a generative model, the variational autoencoder (VAE). Existing VAE-based methods attempt to disentangle all attributes simultaneously in a single latent space, yet the difficulty of separating an attribute from irrelevant information varies from attribute to attribute, so the disentanglement should be conducted in different latent spaces. We therefore propose to disentangle the disentanglement itself by assigning the disentanglement of each attribute to a different layer. To achieve this, we present the stair disentanglement net (STDNet), a stair-like network in which each step corresponds to the disentanglement of one attribute. Within each step, an information separation principle is employed to peel off irrelevant information and form a compact representation of the targeted attribute. The compact representations thus obtained together form the final disentangled representation. To ensure that this final representation is both compressed and complete with respect to the input data, we propose a variant of the information bottleneck (IB) principle, the stair IB (SIB) principle, which optimizes a tradeoff between compression and expressiveness. For the assignment of attributes to network steps, we define an attribute complexity metric and assign attributes by the complexity-ascending rule (CAR), which sequences the attribute disentanglement in ascending order of complexity. Experimentally, STDNet achieves state-of-the-art results in representation learning and image generation on multiple benchmarks, including the Modified National Institute of Standards and Technology (MNIST) database, dSprites, and CelebA. Furthermore, we conduct thorough ablation experiments to show how each strategy employed here contributes to the performance, including the neuron block, CAR, the hierarchical structure, and the variational form of SIB.
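To make the stair idea concrete, below is a minimal sketch of what a stair-like network with a per-step variational bottleneck might look like, assuming a PyTorch implementation. All names here (StairStep, StairNet, code_dims, the 1e-3 tradeoff weight) are illustrative assumptions, not the authors' actual architecture or hyperparameters: each step extracts a compact stochastic code for one attribute, charges a KL compression cost for it, and passes a residual onward to the next step.

import torch
import torch.nn as nn

class StairStep(nn.Module):
    # One "step" of the stair: a compact variational code for a single
    # target attribute, plus residual features passed to the next step.
    def __init__(self, in_dim, code_dim, out_dim):
        super().__init__()
        self.to_stats = nn.Linear(in_dim, 2 * code_dim)   # mean and log-variance
        self.to_residual = nn.Linear(in_dim, out_dim)

    def forward(self, h):
        mu, logvar = self.to_stats(h).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization
        # KL to a standard normal prior: the per-step compression term
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return z, kl, torch.relu(self.to_residual(h))

class StairNet(nn.Module):
    # Steps are ordered by ascending attribute complexity (the CAR idea);
    # their codes are concatenated into the final disentangled representation.
    def __init__(self, x_dim, code_dims, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.steps = nn.ModuleList(StairStep(hidden, d, hidden) for d in code_dims)
        self.decoder = nn.Linear(sum(code_dims), x_dim)

    def forward(self, x):
        h = self.encoder(x)
        codes, kl_total = [], 0.0
        for step in self.steps:
            z, kl, h = step(h)
            codes.append(z)
            kl_total = kl_total + kl
        x_hat = self.decoder(torch.cat(codes, dim=-1))
        return x_hat, kl_total

# Usage: an IB-style tradeoff between reconstruction (expressiveness)
# and the summed per-step KL terms (compression).
model = StairNet(x_dim=784, code_dims=[2, 4, 8])
x = torch.rand(32, 784)
x_hat, kl = model(x)
loss = nn.functional.mse_loss(x_hat, x) + 1e-3 * kl

In this reading, the summed KL terms play the role of the compression side of the SIB tradeoff, while the reconstruction error measures completeness with respect to the input.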