Extreme pivots: a pivot selection strategy for faster metric search
作者:Guillermo Ruiz, Edgar Chavez, Ubaldo Ruiz, Eric S. Tellez
摘要
This manuscript presents the extreme pivots (EP) metric index, a data structure, to speed up exact proximity searching in the metric space model. For the EP, we designed an automatic rule to select the best pivots for a dataset working on limited memory resources. The net effect is that our approach solves queries efficiently with a small memory footprint, and without a prohibitive construction time. In contrast with other related structures, our performance is achieved automatically without dealing directly with the index’s parameters, using optimization techniques over a model of the index. The EP’s model is studied in-depth in this contribution. In practical terms, an interested user only needs to provide the available memory and a sample of the query distribution as parameters. The resulting index is quickly built, and has a good trade-off among memory usage, preprocessing, and search time. We provide an extensive experimental comparison with state-of-the-art searching methods. We also carefully compared the performance of metric indexes in several scenarios, firstly with synthetic data to characterize performance as a function of the intrinsic dimension and the size of the database, and also in different real-world datasets with excellent results.
论文关键词:Nearest neighbors search, Pivot-based metric indexes, Extreme pivots
论文评审过程:
论文官网地址:https://doi.org/10.1007/s10115-019-01423-5