Abstract. We survey index structures for fast similarity search of objects represented by binary vectors (with components 0 or 1). Structures for both exact and approximate search by Hamming distance and other similarity measures are considered. Mainly, we present index structures based on hash tables, similarity-preserving hashing, as well as tree structures, neighborhood graphs, and neural distributed autoassociative memory. The ideas of specific algorithms, including the recently proposed ones, are outlined.
Keywords: similarity search, Hamming distance, nearest neighbor, near neighbor, index structures, multi-index hashing, locally-sensitive hashing, treelike structures, neighborhood graph, neural autoassociative memory.
International Scientific-Educational Center of Information Technologies and Systems, NAS of Ukraine and MES of
Ukraine, Kyiv, Ukraine,
e-mail: dar@infrm.kiev.ua.