A language model using variable length tokens for open-vocabulary Hangul text recognition

Abstract

We propose a novel language model for Hangul text recognition. Without relying on prior linguistic knowledge during training, the proposed model learns variable-length Hangul character sequences, which constitute the elementary tokens of the Korean language, together with their probabilities, from the statistics of a raw text corpus. Experiments in handwritten Hangul recognition show that the proposed language model is effective for postprocessing recognition results.

Keywords: Language model, Character recognition, Hangul recognition, Open-vocabulary, Word recognition

Review history: Received 21 November 2003, Accepted 1 December 2003, Available online 25 February 2004.

Official paper URL: https://doi.org/10.1016/j.patcog.2003.12.004