Latent Dirichlet allocation based generative adversarial networks

Neural Netw. 2020 Dec;132:461-476. doi: 10.1016/j.neunet.2020.08.012. Epub 2020 Sep 21.

Abstract

Generative adversarial networks have been studied extensively in recent years and power a wide range of applications, from image generation and image-to-image translation to text-to-image generation and visual recognition. These methods typically model the mapping from latent space to image with a single generator or with multiple generators. However, they have two notable drawbacks: (i) they ignore the multi-modal structure of images, and (ii) they lack model interpretability. Importantly, existing methods mostly assume that one or more generators can cover all image modes even though the structure of the data is unknown. As a result, mode dropping and mode collapse often occur during GAN training. Despite its importance for generation, exploiting the structure of the data has remained largely unexplored. In this work, aiming to generate multi-modal images and to make the model explicitly interpretable, we explore the theory of how to integrate GANs with a data-structure prior, and propose latent Dirichlet allocation based generative adversarial networks (LDAGAN). The framework can be combined with a variety of state-of-the-art single-generator GANs and achieves improved performance. Extensive experiments on synthetic and real datasets demonstrate the efficacy of LDAGAN for multi-modal image generation. An implementation of LDAGAN is available at https://github.com/Sumching/LDAGAN.
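The core idea described in the abstract, multiple generators whose mode responsibilities follow a Dirichlet prior, can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example rather than the authors' implementation: the toy sizes NUM_MODES, LATENT_DIM and IMG_DIM and the tiny Generator and Discriminator modules are assumptions made for illustration only; the actual model is in the repository linked above.

    # Hypothetical sketch of a multi-generator GAN with a Dirichlet prior
    # over mode assignments (not the authors' LDAGAN implementation).
    import torch
    import torch.nn as nn

    NUM_MODES, LATENT_DIM, IMG_DIM = 4, 64, 784  # assumed toy sizes

    class Generator(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.ReLU(),
                                     nn.Linear(256, IMG_DIM), nn.Tanh())
        def forward(self, z):
            return self.net(z)

    class Discriminator(nn.Module):
        def __init__(self):
            super().__init__()
            # One real/fake score plus a posterior over which generator produced the sample.
            self.net = nn.Sequential(nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
                                     nn.Linear(256, 1 + NUM_MODES))
        def forward(self, x):
            out = self.net(x)
            return out[:, :1], out[:, 1:]

    generators = nn.ModuleList(Generator() for _ in range(NUM_MODES))
    disc = Discriminator()

    # Dirichlet prior over the mixing weights of the generators (image modes).
    pi = torch.distributions.Dirichlet(torch.ones(NUM_MODES)).sample()
    k = torch.distributions.Categorical(pi).sample((32,))   # mode assignment per sample
    z = torch.randn(32, LATENT_DIM)
    # Each sample is produced by the generator chosen by its mode assignment.
    fake = torch.stack([generators[int(k[i])](z[i]) for i in range(32)])
    real_fake_score, mode_posterior = disc(fake)

Because each sample is generated by the single generator selected for its mode, the mixture makes the multi-modal structure explicit and each generator can be inspected individually.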

Keywords: Generative adversarial networks (GANs); Latent Dirichlet allocation (LDA); Model interpretability; Multi-modal structure prior.

MeSH terms

  • Image Processing, Computer-Assisted / methods*
  • Neural Networks, Computer*