Fast spectral analysis for approximate nearest neighbor search

作者:Jing Wang, Jie Shen

摘要

In large-scale machine learning, of central interest is the problem of approximate nearest neighbor (ANN) search, where the goal is to query particular points that are close to a given object under certain metric. In this paper, we develop a novel data-driven ANN search algorithm where the data structure is learned by fast spectral technique based on s landmarks selected by approximate ridge leverage scores. We show that with overwhelming probability, our algorithm returns the \((1+\epsilon /4)\)-ANN for any approximation parameter \(\epsilon \in (0,1)\). A remarkable feature of our algorithm is that it is computationally efficient. Specifically, learning k-length hash codes requires \(O((s^3+ns^2)\log n)\) running time and \(O(d^2)\) extra space, and returning the \((1+\epsilon /4)\)-ANN of the query needs \(O(k\log n)\) running time. The experimental results on computer vision and natural language understanding tasks demonstrate the significant advantage of our algorithm compared to state-of-the-art methods.

论文关键词:Approximate nearest neighbor search, Spectral analysis, Hashing, Noise, Subspace

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10994-021-06124-1