Multi-Relational Deep Hashing for Cross-Modal Search

Xiao Liang; Erkun Yang; Yanhua Yang; Cheng Deng

doi:10.1109/TIP.2024.3385656

IEEE Trans Image Process. 2024;33:3009-3020. doi: 10.1109/TIP.2024.3385656. Epub 2024 Apr 25.

PMID: 38625760

Abstract

Deep cross-modal hashing retrieval has recently made significant progress. However, existing methods generally learn hash functions under pairwise or triplet supervision, which pieces together the relevant information from partial similarities between data pairs. This approach captures data similarity only locally and incompletely, resulting in sub-optimal retrieval performance. In this paper, we propose a novel Multi-Relational Deep Hashing (MRDH) approach, which bridges the modality gap by comprehensively modeling the similarity relationships between data in different modalities. Specifically, to model inter-modal relationships, we constrain cross-modal pairwise similarities to be consistent, which preserves semantic similarity across modalities. Moreover, to capture complete similarity information, we design a new similarity metric, termed cross-modal global similarity, which encourages hash codes of similar data pairs from different modalities to approach a common center and hash codes of dissimilar pairs to converge to different centers. This enables our model to generate more discriminative hash codes. Extensive experiments on three benchmark datasets demonstrate the superiority of our method for cross-modal hashing retrieval.
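
The idea of pulling hash codes from different modalities toward a shared center can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two 8-bit centers in `CENTERS`, the mean-squared `center_loss`, and the toy image and text outputs are hypothetical, and the actual MRDH objective may differ.

```python
# Illustrative sketch only: the centers, loss form, and toy features below
# are assumptions, not the formulation used in the MRDH paper.

def binarize(values):
    """Map relaxed (real-valued) network outputs to a {-1, +1} hash code."""
    return [1 if v >= 0 else -1 for v in values]


def hamming(a, b):
    """Count the positions where two codes differ."""
    return sum(x != y for x, y in zip(a, b))


def center_loss(relaxed_code, center):
    """Mean squared distance between a relaxed code and its assigned center."""
    return sum((r - c) ** 2 for r, c in zip(relaxed_code, center)) / len(center)


# Hypothetical 8-bit centers, one per semantic class. The two vectors are
# orthogonal, so they sit at Hamming distance 4 (half the code length).
CENTERS = {
    "cat": [1, 1, 1, 1, 1, 1, 1, 1],
    "car": [1, -1, 1, -1, 1, -1, 1, -1],
}

# Toy relaxed outputs of an image network and a text network.
image_cat = [0.9, 0.8, 0.7, 0.9, 0.6, 0.8, 0.9, 0.7]
text_cat = [0.7, 0.9, 0.8, 0.6, 0.9, 0.7, 0.8, 0.9]
text_car = [0.8, -0.9, 0.7, -0.8, 0.9, -0.7, 0.6, -0.9]

# Each sample is pulled toward the center of its own class, whatever its
# modality, so same-class image and text codes end up near the same point.
total_loss = (
    center_loss(image_cat, CENTERS["cat"])
    + center_loss(text_cat, CENTERS["cat"])
    + center_loss(text_car, CENTERS["car"])
)

same_class_distance = hamming(binarize(image_cat), binarize(text_cat))
diff_class_distance = hamming(binarize(image_cat), binarize(text_car))
```

In this toy setting the image and text codes of the same class binarize to identical codes, while codes from different classes stay as far apart as their centers. A pairwise or triplet loss, by contrast, would only compare the samples that happen to be paired in a batch.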