HiFiSketch: High Fidelity Face Photo-Sketch Synthesis and Manipulation

Chunlei Peng; Congyu Zhang; Decheng Liu; Nannan Wang; Xinbo Gao

doi:10.1109/TIP.2023.3326680

IEEE Trans Image Process. 2023;32:5865-5876. doi: 10.1109/TIP.2023.3326680. Epub 2023 Nov 3.

PMID: 37889808

Abstract

With the rapid development of generative adversarial networks, face photo-sketch synthesis has achieved promising performance and is playing an increasingly important role in law enforcement as well as entertainment. However, most existing methods work only under interference-free conditions and lack generalization ability in wild scenes. The fidelity of the images they generate is insufficient, and manipulation according to a text description is unavailable. Directly applying existing text-based image manipulation methods to the face photo-sketch scenario may lead to severe distortions due to cross-domain challenges. Therefore, we propose a novel cross-domain face photo-sketch synthesis framework named HiFiSketch, a network that learns to adjust the weights of generators for high-fidelity synthesis and manipulation. It can translate images between the photo domain and the sketch domain and, at the same time, modify the results according to text input. We further propose a cross-domain loss function, which can effectively preserve facial details during face photo-sketch synthesis. Extensive experiments on four public face sketch datasets show the superiority of our method over existing methods. We further present text-based face photo-sketch manipulation and sequential face photo-sketch manipulation for the first time to demonstrate the effectiveness of our method for high-fidelity face photo-sketch synthesis and manipulation.