An Arabic optical character recognition system using recognition-based segmentation

Authors:

Abstract

Optical character recognition (OCR) systems improve human–machine interaction and are widely used in many areas. Recognizing cursive scripts is difficult because segmenting them into characters is error-prone. This paper proposes an Arabic OCR system that uses a recognition-based segmentation technique to overcome the classical segmentation problems. A newly developed Arabic word segmentation algorithm is also introduced to separate horizontally overlapping Arabic words and subwords. A feedback loop controls how character fragments are combined for recognition. The system was implemented, and the results show 90% recognition accuracy at a recognition rate of 20 chars/s.
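The idea of recognition-based segmentation with a feedback loop can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that a word has already been over-segmented into an ordered list of fragments and that a classifier returns a label and a confidence for any group of adjacent fragments. The names `recognize_fragments`, `toy_classify`, `max_group`, and `threshold` are hypothetical, and the fragments are toy strings rather than image pieces.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical classifier signature: a group of adjacent fragments goes in,
# a (label, confidence) pair comes out; label is None when nothing matches.
Classifier = Callable[[Tuple[str, ...]], Tuple[Optional[str], float]]


def recognize_fragments(
    fragments: Sequence[str],
    classify: Classifier,
    max_group: int = 3,       # assumed upper bound on fragments per character
    threshold: float = 0.5,   # assumed minimum confidence to accept a label
) -> List[str]:
    """Greedy recognition-based segmentation (illustrative sketch only).

    At each position, groups of 1..max_group adjacent fragments are sent to
    the classifier. The classifier's confidence is the feedback that decides
    which combination is kept as one character.
    """
    labels: List[str] = []
    i = 0
    while i < len(fragments):
        best_label: Optional[str] = None
        best_score = 0.0
        best_size = 1
        for size in range(1, max_group + 1):
            if i + size > len(fragments):
                break
            label, score = classify(tuple(fragments[i:i + size]))
            if label is not None and score > best_score:
                best_label, best_score, best_size = label, score, size
        if best_label is not None and best_score >= threshold:
            labels.append(best_label)
        else:
            # No combination was accepted: mark as rejected, skip one fragment.
            labels.append("?")
            best_size = 1
        i += best_size
    return labels


# Toy stand-in for a trained character classifier (entirely made up).
TOY_SCORES: Dict[Tuple[str, ...], Tuple[Optional[str], float]] = {
    ("a",): ("A", 0.6),
    ("a", "b"): ("AB", 0.9),
    ("c",): ("C", 0.8),
}


def toy_classify(group: Tuple[str, ...]) -> Tuple[Optional[str], float]:
    return TOY_SCORES.get(group, (None, 0.0))


result = recognize_fragments(["a", "b", "c", "x"], toy_classify)
print(result)
```

In this toy run, fragments `a` and `b` are merged because the pair scores higher than `a` alone, `c` is accepted on its own, and `x` is rejected. A greedy left-to-right pass is only one possible design; a real system might instead search over all groupings of fragments.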

Keywords: Cursive script, Word segmentation, Character fragmentation, Recognition, OCR

Review history: Received 15 December 1998, Revised 16 November 1999, Accepted 16 November 1999, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(99)00227-7