Fingerprints in compressed strings

作者:

Highlights:

摘要

In this paper we show how to construct a data structure for a string S of size N compressed into a context-free grammar of size n that supports efficient Karp–Rabin fingerprint queries to any substring of S. That is, given indices i and j, the answer to a query is the fingerprint of the substring S[i,j]. We present the first O(n) space data structures that answer fingerprint queries without decompressing any characters. For Straight Line Programs (SLP) we get O(log⁡N) query time, and for Linear SLPs (an SLP derivative that captures LZ78 compression and its variations) we get O(log⁡log⁡N) query time. We extend the result to solve the longest common extension problem in query time O(log⁡Nlog⁡ℓ) and O(log⁡ℓlog⁡log⁡ℓ+log⁡log⁡N) for SLPs and Linear SLPs, respectively. Here, ℓ denotes the length of the LCE.

论文关键词:Karp–Rabin fingerprints,Grammar compressed string,Longest common extensions

论文评审过程:Received 28 October 2013, Revised 11 January 2017, Accepted 13 January 2017, Available online 18 January 2017, Version of Record 27 February 2017.

论文官网地址:https://doi.org/10.1016/j.jcss.2017.01.002