Faster proximity searching with the distal SAT

作者:

Highlights:

摘要

Searching by proximity has been a source of puzzling behaviors and counter-intuitive findings for well established algorithmic design rules. One example is a linked list; it is the worst data structure for exact searching, and one of the most competitive for proximity searching. Common sense also dictates that an online data structure is less competitive than the full-knowledge, static version. A counter example in proximity searching is the static Spatial Approximation Tree (SAT), which is slower than its dynamic version (DSAT).In this paper we show that changing only the insertion policy of the SAT, leaving every other aspect of the data structure untouched, can produce a systematically faster index. We call the index Distal Spatial Approximation Tree (DiSAT). We found that even a random insertion policy produce a faster version of the SAT, which explains why the DSAT is faster than SAT. In brief, the SAT is improved by selecting distal, instead of proximal, nodes. This is the exact opposite of the insertion policy proposed in the original paper, and can be used in main or secondary memory versions of the index.We tested our approach with representatives of the state of the art in exact proximity searching. As it happens often in experimental setups, there are no absolute winners in all the aspects tested. Our data structure has no parameters to tune-up and a small memory footprint. In addition it can be constructed quickly. Our approach is among the most competitive, those outperforming DiSAT achieve this at the expense of larger memory usage or an impractical construction time.

论文关键词:Similarity search,Metric spaces,Metric access methods

论文评审过程:Available online 9 January 2016, Version of Record 30 May 2016.

论文官网地址:https://doi.org/10.1016/j.is.2015.10.014