Part of Advances in Neural Information Processing Systems 37 (NeurIPS 2024) Main Conference Track
Ming Yang, Yuzheng Cai, Weiguo Zheng
The state-of-the-art approximate nearest neighbor search (ANNS) algorithm builds a large proximity graph on the dataset and performs a greedy beam search, which may bring many unnecessary explorations. We develop a novel framework, namely corssing sparse proximity graph (CSPG), based on random partitioning of the dataset. It produces a smaller sparse proximity graph for each partition and routing vectors that bind all the partitions. An efficient two-staged approach is designed for exploring CSPG, with fast approaching and cross-partition expansion. We theoretically prove that CSPG can accelerate the existing graph-based ANNS algorithms by reducing unnecessary explorations. In addition, we conduct extensive experiments on benchmark datasets. The experimental results confirm that the existing graph-based methods can be significantly outperformed by incorporating CSPG, achieving 1.5x to 2x speedups of QPS in almost all recalls.