A mathematical model for estimating the effectiveness of bigram coding
Authors:
Abstract
This paper discusses bigram coding as a technique for compacting data. A mathematical model is developed that estimates the effectiveness of such a code as a function of the fraction of bigram tokens that are encodeable; the model accounts for the degree of overlap of encodeable tokens by assuming that bigram token occurrences have a Markov property. The model requires that a single parameter be fit to the data. The results of an experiment testing this model on a file of library catalog data are given, and excellent agreement is found. This model provides substantial improvement over an earlier model in which bigrams are assumed to occur independently of each other.
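To make the overlap issue concrete, the following is a minimal sketch of bigram coding, not the paper's model. It assumes a greedy left-to-right encoder over characters and a table of the most frequent bigrams; the function names (`top_bigrams`, `greedy_bigram_encode`, `compaction_ratio`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def top_bigrams(text, k):
    """Return the k most frequent adjacent character pairs (assumed code table)."""
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    return {bigram for bigram, _ in counts.most_common(k)}


def greedy_bigram_encode(text, encodable):
    """Scan left to right, emitting one code per encodable bigram,
    otherwise one code per single character."""
    codes = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in encodable:
            codes.append(pair)  # two characters replaced by one code
            i += 2
        else:
            codes.append(text[i])  # character passes through unchanged
            i += 1
    return codes


def compaction_ratio(text, encodable):
    """Encoded length divided by original length (1.0 means no saving)."""
    if not text:
        return 1.0
    return len(greedy_bigram_encode(text, encodable)) / len(text)
```

In the string "aaa" both bigram tokens are encodeable when "aa" is in the table, yet they overlap, so the greedy encoder can use only one of them and emits two codes rather than one and a half. This is the kind of effect an estimate based on independently occurring bigrams would miss.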
Keywords:
Review history: Available online 13 July 2002.
Official paper URL: https://doi.org/10.1016/0306-4573(76)90041-8