Zero-Shot Sketch-Based Image Retrieval Using StyleGen and Stacked Siamese Neural Networks

J Imaging. 2024 Mar 27;10(4):79. doi: 10.3390/jimaging10040079.

Abstract

Sketch-based image retrieval (SBIR) refers to a sub-class of content-based image retrieval problems where the input queries are ambiguous sketches and the retrieval repository is a database of natural images. In the zero-shot setup of SBIR, the query sketches are drawn from classes that do not match any of those that were used in model building. The SBIR task is extremely challenging as it is a cross-domain retrieval problem, unlike content-based image retrieval problems because sketches and images have a huge domain gap. In this work, we propose an elegant retrieval methodology, StyleGen, for generating fake candidate images that match the domain of the repository images, thus reducing the domain gap for retrieval tasks. The retrieval methodology makes use of a two-stage neural network architecture known as the stacked Siamese network, which is known to provide outstanding retrieval performance without losing the generalizability of the approach. Experimental studies on the image sketch datasets TU-Berlin Extended and Sketchy Extended, evaluated using the mean average precision (mAP) metric, demonstrate a marked performance improvement compared to the current state-of-the-art approaches in the domain.

Keywords: SSiNN; ZS-SBIR; domain gap; sketch-based image retrieval; stacked Siamese neural network.

Grants and funding

This research received no external funding.