Structural properties of the string statistics problem
作者:
Highlights:
•
摘要
A suitably weighted Index Tree such as a B-tree or a Suffix Tree can be easily adapted to store, for a given string x and for all substrings w of x, the number of distinct instances of w along x. The storage needed is seen to be linear in the length of x: moreover, the whole statistics can itself be derived in linear time, off-line of a RAM. If the substring w has nontrivial periods, however, the number of distinct instances might differ from that of distinct non-overlapping occurrences along x. It is shown here that O(n log n) storage units—n standing for the length of x—are sufficient to organize this second kind of statistics, in such a way that the maximum number of nonoverlapping instances for arbitrary w along x can be retrieved in a number of character comparisons not exceeding the length of w.
论文关键词:
论文评审过程:Received 9 September 1985, Revised 3 July 1986, Available online 2 December 2003.
论文官网地址:https://doi.org/10.1016/0022-0000(85)90060-1