Controllable and Identity-Aware Facial Attribute Transformation

IEEE Trans Cybern. 2022 Jun;52(6):4825-4836. doi: 10.1109/TCYB.2021.3071172. Epub 2022 Jun 16.

Abstract

Modifying facial attributes without a paired dataset is a challenging task. Previous approaches either required supervision from a ground-truth transformed image or required training a separate model for every pair of attributes. These requirements limit the scalability of the models to larger attribute sets, since the number of models that must be trained grows quadratically with the number of attributes. Another major drawback of previous approaches is the unintended change of the person's identity as the facial attributes are transformed. We propose a method that allows controllable and identity-aware transformations across multiple facial attributes using only a single model. Our approach is to train a generative adversarial network (GAN) with a multitask conditional discriminator that recognizes the identity of the face, distinguishes real images from fake ones, and identifies the facial attributes present in an image. This guides the generator into producing output that is realistic while preserving the person's identity and facial attributes. Through this framework, our model also learns meaningful image representations in a lower dimensional latent space and semantically associates separate parts of the encoded vector with the person's identity and facial attributes. This opens up the possibility of generating new faces and of other transformations, such as making the face thinner or chubbier. Furthermore, our model encodes the image only once and allows multiple transformations using the encoded vector, which makes transformations faster because the entire image does not need to be reprocessed for each one. We show the effectiveness of the proposed method through both qualitative and quantitative evaluations, including ablation studies, visual inspection, and face verification. Our results are competitive with the main baseline (CycleGAN), while using a single model yields large gains in space and extensibility.
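
To make the multitask conditional discriminator concrete, the following is a minimal PyTorch sketch of a discriminator with a shared convolutional trunk and three output heads: real/fake, identity, and attribute presence. The class name, layer widths, input resolution, and the n_identities/n_attributes parameters are illustrative assumptions; the abstract does not specify the paper's exact architecture or loss weighting.

import torch
import torch.nn as nn

class MultiTaskDiscriminator(nn.Module):
    # Shared convolutional trunk with three heads: adversarial
    # (real vs. fake), identity recognition, and per-attribute presence.
    # Layer sizes and input resolution are assumptions for illustration.
    def __init__(self, n_identities, n_attributes):
        super().__init__()
        layers, in_ch = [], 3
        for out_ch in (64, 128, 256, 512):  # assumes 64x64 RGB input -> 4x4 feature map
            layers += [nn.Conv2d(in_ch, out_ch, 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
            in_ch = out_ch
        self.trunk = nn.Sequential(*layers, nn.Flatten())
        feat_dim = 512 * 4 * 4
        self.adv_head = nn.Linear(feat_dim, 1)              # real/fake logit
        self.id_head = nn.Linear(feat_dim, n_identities)    # identity logits
        self.attr_head = nn.Linear(feat_dim, n_attributes)  # multilabel attribute logits

    def forward(self, x):
        h = self.trunk(x)
        return self.adv_head(h), self.id_head(h), self.attr_head(h)

# Usage sketch with dummy data: on real images, all three heads receive
# supervision; the generator is then trained against the same three signals
# so its outputs look real while keeping identity and untargeted attributes.
D = MultiTaskDiscriminator(n_identities=1000, n_attributes=40)
images = torch.randn(8, 3, 64, 64)                        # dummy batch
identity_labels = torch.randint(0, 1000, (8,))            # dummy identity ids
attribute_labels = torch.randint(0, 2, (8, 40)).float()   # dummy binary attributes
adv_logit, id_logits, attr_logits = D(images)
d_loss = (nn.functional.binary_cross_entropy_with_logits(
              adv_logit, torch.ones_like(adv_logit))
          + nn.functional.cross_entropy(id_logits, identity_labels)
          + nn.functional.binary_cross_entropy_with_logits(
              attr_logits, attribute_labels))

Because the identity and attribute heads are trained only on real labeled images, they act as fixed semantic critics for the generator; no ground-truth transformed images are needed, which is the property the abstract emphasizes.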