Multi-Relational Deep Hashing for Cross-Modal Search

IEEE Trans Image Process. 2024;33:3009-3020. doi: 10.1109/TIP.2024.3385656. Epub 2024 Apr 25.

Abstract

Deep cross-modal hashing retrieval has recently made significant progress. However, existing methods generally learn hash functions under pairwise or triplet supervision, which pieces together partial similarities between data pairs; this captures data similarity only locally and incompletely, resulting in sub-optimal retrieval performance. In this paper, we propose a novel Multi-Relational Deep Hashing (MRDH) approach, which bridges the modality gap by comprehensively modeling the similarity relationships between data in different modalities. Specifically, to model inter-modal relationships, we constrain cross-modal pairwise similarities to be consistent, preserving semantic similarity across modalities. Moreover, to capture complete similarity information, we design a new similarity metric, termed cross-modal global similarity, which encourages hash codes of similar data pairs from different modalities to approach a common center and hash codes of dissimilar pairs to converge to different centers. This enables our model to generate more discriminative hash codes. Extensive experiments on three benchmark datasets demonstrate the superiority of our method for cross-modal hashing retrieval.
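
The two ideas named in the abstract, pairwise similarity consistency across modalities and attraction of hash codes to shared centers, can be illustrated with a minimal NumPy sketch. This is not the MRDH formulation: the abstract gives no loss functions, so the function names (`pairwise_consistency_loss`, `center_loss`), the squared-error forms, the randomly drawn binary centers, and the single-label setup are all assumptions made for illustration only.

```python
import numpy as np

def pairwise_consistency_loss(img_codes, txt_codes, sim):
    """Assumed form: squared error between scaled cross-modal inner
    products and a 0/1 similarity matrix (not taken from the paper)."""
    k = img_codes.shape[1]
    # The inner product of +/-1 codes lies in [-k, k]; map it to [0, 1].
    pred = 0.5 * (img_codes @ txt_codes.T / k + 1.0)
    return float(np.mean((pred - sim) ** 2))

def center_loss(codes, labels, centers):
    """Assumed form: mean squared distance between each code and the
    center assigned to its class, so same-class codes from either
    modality are pulled toward one common center."""
    return float(np.mean(np.sum((codes - centers[labels]) ** 2, axis=1)))

rng = np.random.default_rng(0)
n_classes, k = 3, 16

# Hypothetical centers: one random +/-1 code per class.
centers = np.sign(rng.standard_normal((n_classes, k)))

# Four paired image/text samples with single class labels (an assumption;
# real cross-modal benchmarks are often multi-label).
labels = np.array([0, 0, 1, 2])
sim = (labels[:, None] == labels[None, :]).astype(float)

# Idealized codes: both modalities sit exactly on their class centers.
img_codes = centers[labels]
txt_codes = centers[labels]

# Flipping one bit of one image code moves it away from its center.
noisy_img_codes = img_codes.copy()
noisy_img_codes[0, 0] *= -1

print("center loss (ideal):", center_loss(img_codes, labels, centers))
print("center loss (one bit flipped):", center_loss(noisy_img_codes, labels, centers))
print("pairwise loss (ideal):", pairwise_consistency_loss(img_codes, txt_codes, sim))
```

In this sketch the center term is zero only when every code coincides with its class center, while the pairwise term also depends on how far apart the centers of different classes are. That gives one plausible reading of why a global, center-based signal could complement purely pairwise supervision.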