Optimizing partitioning strategies for faster inverted index compression

作者:Xingshen Song, Yuexiang Yang, Yu Jiang, Kun Jiang

摘要

The inverted index is a key component for search engines to manage billions of documents and quickly respond to users’ queries.Whereas substantial effort has been devoted to reducing space occupancy and decoding speed, the encoding speed when constructing the index has been overlooked. Partitioning the index aligning to its clustered distribution can effectively minimize the compressed size while accelerating its construction procedure. In this study, we introduce compression speed as one criterion to evaluate compression techniques, and thoroughly analyze the performance of different partitioning strategies. Optimizations are also proposed to enhance state-of-the-art methods with faster compression speed and more flexibility to partition an index. Experiments show that our methods offer a much better compression speed, while retaining an excellent space occupancy and decompression speed. networks.

论文关键词:inverted index, index compression, optimal partition, approximation algorithm

论文评审过程:

论文官网地址:https://doi.org/10.1007/s11704-016-6252-5