Fast algorithms for finding disjoint subsequences with extremal densities

Abstract

We derive fast algorithms for the following problem: given a set of n points on the real line and two parameters s and p, find s disjoint intervals of maximum total length that contain at most p of the given points. Our main contribution consists of algorithms whose time bounds improve upon a straightforward dynamic programming algorithm in the relevant case where the input size n is much larger than the parameters s and p. These results are achieved by selecting a small set of candidate intervals that is provably sufficient for building an optimal solution via dynamic programming. As a byproduct of this idea, we improve an algorithm of Chen et al. for a similar subsequence problem [Disjoint segments with maximum density, in: International Workshop on Bioinformatics Research and Applications IWBRA 2005, (within ICCS 2005), Lecture Notes in Computer Science, vol. 3515, Springer, Berlin, pp. 845–850]. The problems are motivated by the search for significant patterns in biological data. Finally, we propose several heuristics that further reduce the running time on typical instances. One of them leads to a subsequence sum problem of independent interest that, to our knowledge, is open.
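To make the problem statement concrete, the following is a minimal sketch of what a straightforward dynamic program for it could look like. It is not taken from the paper, and the abstract does not spell out its baseline, so the modelling choices are assumptions: intervals are open and have their endpoints at input points, two intervals may share an endpoint, the bound p applies to the total number of points inside all chosen intervals, and "s intervals" is read as "at most s". The function name `max_total_length` is hypothetical. Under these assumptions the sketch runs in O(n·s·p²) time.

```python
def max_total_length(points, s, p):
    """Maximum total length of at most s disjoint open intervals that
    contain at most p of the points in total.

    Assumptions (not from the paper): every interval is (xs[j], xs[i])
    for two input points, and intervals may share an endpoint.
    """
    xs = sorted(points)
    n = len(xs)
    if n < 2 or s <= 0:
        return 0.0

    # prev[q][i]: best total length with the intervals chosen so far,
    # at most q points inside them, everything within [xs[0], xs[i]].
    prev = [[0.0] * n for _ in range(p + 1)]

    for _ in range(s):  # allow one more interval per round
        cur = [[0.0] * n for _ in range(p + 1)]
        for q in range(p + 1):
            for i in range(1, n):
                # Option 1: no interval ends exactly at xs[i].
                best = cur[q][i - 1]
                # Option 2: the last interval is (xs[j], xs[i]) and
                # contains c = i - j - 1 points, with c <= q.
                for c in range(min(q, i - 1) + 1):
                    j = i - 1 - c
                    cand = prev[q - c][j] + (xs[i] - xs[j])
                    if cand > best:
                        best = cand
                cur[q][i] = best
        prev = cur

    return prev[p][n - 1]


if __name__ == "__main__":
    pts = [0, 1, 5, 6, 10]
    print(max_total_length(pts, s=2, p=1))
```

Because the inner loop only tries intervals containing at most p points, the cost per table entry is proportional to p rather than n, which is why n appears only linearly in the bound for this sketch.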

Keywords: Holes in data, Range prediction, Protein torsion angle, Protein structure prediction, Dynamic programming, Selection algorithms, Time complexity

Review history: Received 6 July 2005, Revised 3 January 2006, Accepted 19 January 2006, Available online 3 March 2006.

Official paper URL: https://doi.org/10.1016/j.patcog.2006.01.008