On the Use of Normalized Compression Distances for Image Similarity Detection

Entropy (Basel). 2018 Jan 31;20(2):99. doi: 10.3390/e20020099.

Abstract

This paper investigates the usefulness of the normalized compression distance (NCD) for image similarity detection. Instead of the direct NCD between images, the paper considers the correlation between NCD based feature vectors extracted for each image. The vectors are derived by computing the NCD between the original image and sequences of translated (rotated) versions. Feature vectors for simple transforms (circular translations on horizontal, vertical, diagonal directions and rotations around image center) and several standard compressors are generated and tested in a very simple experiment of similarity detection between the original image and two filtered versions (median and moving average). The promising vector configurations (geometric transform, lossless compressor) are further tested for similarity detection on the 24 images of the Kodak set subject to some common image processing. While the direct computation of NCD fails to detect image similarity even in the case of simple median and moving average filtering in 3 × 3 windows, for certain transforms and compressors, the proposed approach appears to provide robustness at similarity detection against smoothing, lossy compression, contrast enhancement, noise addition and some robustness against geometrical transforms (scaling, cropping and rotation).

Keywords: NCD feature vectors; image similarity; lossless compression; normalized compression distance; normalized information distance; robust similarity.