Models of bitmap generation: A systematic approach to bitmap compression

Authors:

Highlights:

Abstract

In large IR systems, information about word occurrence may be stored in the form of a bit matrix, with rows corresponding to different words and columns to documents. Such a matrix is generally very large and very sparse. New methods for compressing such matrices are presented, which exploit possible correlations between rows and between columns. The methods are based on partitioning the matrix into small blocks and predicting the 1-bit distribution within a block by means of various bit generation models. Each block is then encoded using Huffman or arithmetic coding. The methods also use a new way of enumerating subsets of fixed size from a given superset. Preliminary experimental results indicate improvements over previous methods.
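The abstract does not describe the new subset enumeration itself, so the sketch below does not reproduce it. As an illustration of the general idea only, it uses the standard combinatorial number system (colexicographic ranking), a swapped-in textbook technique: a block with k one-bits among b positions is mapped to a single integer in the range 0 to C(b, k) - 1, which could then be written in a fixed number of bits. The function names `rank_subset`, `unrank_subset` and `index_bits` are hypothetical and are not taken from the paper.

```python
from math import comb  # requires Python 3.8+


def rank_subset(positions):
    """Colexicographic rank of a set of distinct 1-bit positions (0-based)."""
    # The i-th smallest position p (i counted from 1) contributes C(p, i).
    return sum(comb(p, i + 1) for i, p in enumerate(sorted(positions)))


def unrank_subset(rank, k):
    """Inverse of rank_subset: recover the k positions from their rank."""
    positions = []
    for i in range(k, 0, -1):
        p = i - 1  # C(i - 1, i) == 0, so this starting point never exceeds rank
        # Find the largest p such that C(p, i) <= rank.
        while comb(p + 1, i) <= rank:
            p += 1
        rank -= comb(p, i)
        positions.append(p)
    return sorted(positions)


def index_bits(block_size, ones):
    """Bits needed to index one subset among all C(block_size, ones) subsets."""
    return (comb(block_size, ones) - 1).bit_length()


# Illustrative block of 8 bits with 1-bits at positions 1, 4 and 6.
block_positions = [1, 4, 6]
r = rank_subset(block_positions)
restored = unrank_subset(r, len(block_positions))
```

Under this assumed scheme, an 8-bit block containing three 1-bits would need only enough bits to distinguish C(8, 3) = 56 subsets, plus whatever the chosen model spends on encoding the count of 1-bits. How the paper's bit generation models assign probabilities to blocks, and how its enumeration differs from this one, is not stated in the abstract.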

Keywords:

Review history: Available online 19 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(92)90065-8