To index or not to index: Time–space trade-offs for positional ranking functions in search engines

Authors:

Highlights:

• Positional ranking and text snippets can be generated efficiently from the text.

• Compression boosting + LZ4 compression offers the most attractive trade-off.

• Search process can be supported using 1.3 times the space of positional indexes.

Keywords: Positional indexing, Text compression, Index compression, Wavelet trees, Snippet generation

Review history: Received 27 April 2018, Revised 25 March 2019, Accepted 4 November 2019, Available online 14 November 2019, Version of Record 19 December 2019.

Official paper URL: https://doi.org/10.1016/j.is.2019.101466