Asymmetric distances for binary embeddings

IEEE Trans Pattern Anal Mach Intell. 2014 Jan;36(1):33-47. doi: 10.1109/TPAMI.2013.101.

Abstract

In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes that binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances that are applicable to a wide variety of embedding techniques including locality sensitive hashing (LSH), locality sensitive binary codes (LSBC), spectral hashing (SH), PCA embedding (PCAE), PCAE with random rotations (PCAE-RR), and PCAE with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.