Pattern matching with don't cares and few errors

作者:

Highlights:

摘要

We present solutions for the k-mismatch pattern matching problem with don't cares. Given a text t of length n and a pattern p of length m with don't care symbols and a bound k, our algorithms find all the places that the pattern matches the text with at most k mismatches. We first give a Θ(n(k+logmlogk)logn) time randomised algorithm which finds the correct answer with high probability. We then present a new deterministic Θ(nk2log2m) time solution that uses tools originally developed for group testing. Taking our derandomisation approach further we develop an approach based on k-selectors that runs in Θ(nkpolylogm) time. Further, in each case the location of the mismatches at each alignment is also given at no extra cost.

论文关键词:Pattern matching,String algorithms,Randomised algorithms,Group testing

论文评审过程:Received 14 August 2008, Revised 19 May 2009, Available online 11 June 2009.

论文官网地址:https://doi.org/10.1016/j.jcss.2009.06.002