Visual text recognition through contextual processing

Authors:

Abstract

Most work on dictionary-based contextual processing assumes that every word of the document lies within the dictionary and that word boundaries are clear. In practice, neither assumption holds. In this work we present a two-pass contextual processing algorithm, limited to the word level, that uses a partial dictionary with an augmented-dictionary approach, a modified Viterbi algorithm, and some heuristics based on pragmatic features. A character confusion matrix, obtained through training and weighted with respect to the dictionary words within the document, is used to generate aliases for the input word. The system has been tested with an omnifont character recogniser on documents of varying types. Its overall performance exceeds 98% correct character recognition (97% word recognition), which is better than that of other reported works.
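
The alias-generation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the details of the modified Viterbi procedure, the document-specific weighting, or the pragmatic heuristics, so a plain beam search over a character confusion table stands in for them. The names `CONFUSION`, `DICTIONARY`, `generate_aliases`, and `correct_word`, and all probability values, are hypothetical.

```python
import heapq
import math

# Hypothetical confusion table: for each recognised character, the assumed
# probability of each true character. Real values would come from training.
CONFUSION = {
    "1": {"l": 0.6, "1": 0.3, "i": 0.1},
    "0": {"o": 0.7, "0": 0.3},
}

# Hypothetical partial dictionary: it need not contain every document word.
DICTIONARY = {"hello", "world"}


def generate_aliases(word, confusion, beam_width=5):
    """Return up to beam_width (alias, log_prob) pairs, most likely first."""
    beam = [("", 0.0)]
    for ch in word:
        # Characters absent from the table are assumed to be read correctly.
        options = confusion.get(ch, {ch: 1.0})
        expanded = [
            (prefix + true_ch, score + math.log(p))
            for prefix, score in beam
            for true_ch, p in options.items()
            if p > 0
        ]
        # Keep only the best partial aliases (beam search, not exact Viterbi).
        beam = heapq.nlargest(beam_width, expanded, key=lambda item: item[1])
    return beam


def correct_word(word, dictionary, confusion, beam_width=5):
    """Return the most likely alias found in the dictionary, else the input."""
    for alias, _ in generate_aliases(word, confusion, beam_width):
        if alias in dictionary:
            return alias
    # With a partial dictionary the word may simply be unlisted, so the
    # recogniser's output is kept unchanged.
    return word
```

Under these assumed values, `correct_word("he1l0", DICTIONARY, CONFUSION)` selects the dictionary word `hello`, while a word with no alias in the dictionary is returned unchanged, which reflects the partial-dictionary setting the abstract describes.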

Keywords: Text recognition, Character recognition, Contextual processing

Review history: Received 5 June 1987, Revised 19 November 1987, Available online 19 May 2003.

Paper URL: https://doi.org/10.1016/0031-3203(88)90006-4