Linear time algorithms for finding and representing all the tandem repeats in a string

作者:

Highlights:

摘要

A tandem repeat (or square) is a string αα, where α is a non-empty string. We present an O(|S|)-time algorithm that operates on the suffix tree T(S) for a string S, finding and marking the endpoint in T(S) of every tandem repeat that occurs in S. This decorated suffix tree implicitly represents all occurrences of tandem repeats in S, and can be used to efficiently solve many questions concerning tandem repeats and tandem arrays in S. This improves and generalizes several prior efforts to efficiently capture large subsets of tandem repeats.

论文关键词:

论文评审过程:Received 21 February 2003, Revised 19 December 2003, Available online 6 May 2004.

论文官网地址:https://doi.org/10.1016/j.jcss.2004.03.004