Optimizing the cost matrix for approximate string matching using genetic algorithms

Abstract

This paper describes a method for optimizing the cost matrix of any approximate string matching algorithm based on the Levenshtein distance. The method, which uses genetic algorithms, defines the problem formally as a discrimination between a set of classes. It is tested and evaluated using both synthetically generated strings of symbols and chain code data extracted from the international Unipen database of on-line handwritten scripts. Experimental results show that this approach can effectively discover the hidden costs of elementary operations in a set of string classes. © 1998 Pattern Recognition Society.
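
To make the object being optimized concrete, the following is a minimal sketch of a Levenshtein-style distance whose elementary operations (substitution, insertion, deletion) each take their cost from a table. The abstract does not give the paper's exact formulation, so this is the standard dynamic program rather than the authors' implementation. The function name `weighted_edit_distance`, the dictionary layout of `cost`, and the `default` fallback are assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical layout: cost[(x, y)] is the cost of substituting x with y,
# cost[(x, "")] the cost of deleting x, cost[("", y)] the cost of inserting y.
CostMatrix = Dict[Tuple[str, str], float]


def weighted_edit_distance(a: str, b: str, cost: CostMatrix,
                           default: float = 1.0) -> float:
    """Edit distance in which every elementary operation has its own cost."""

    def op_cost(x: str, y: str) -> float:
        if x == y:
            return 0.0  # matching symbols are free
        return cost.get((x, y), default)  # missing entries use the default

    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [0.0]
    for y in b:
        prev.append(prev[-1] + op_cost("", y))  # build b[:j] by insertions only

    for x in a:
        curr = [prev[0] + op_cost(x, "")]  # reach the empty string by deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + op_cost(x, ""),     # delete x
                curr[j - 1] + op_cost("", y),  # insert y
                prev[j - 1] + op_cost(x, y),   # substitute x with y (or match)
            ))
        prev = curr
    return prev[-1]
```

With an empty `cost` table every operation costs `default`, which reduces to the ordinary Levenshtein distance. Under this sketch, a genetic algorithm of the kind the abstract describes would treat the entries of `cost` as the values to be evolved, scoring each candidate table by how well the resulting distances separate the string classes.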

Keywords: Approximate string matching, Cost matrix optimization, Genetic algorithms, Levenshtein distance, On-line handwriting recognition

Article history: Received 20 November 1996, Revised 27 May 1997, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(97)00058-7