Segmentation of touching and fused Devanagari characters
Authors:
Abstract
Devanagari script is a two-dimensional composition of symbols. Treating each composite character as a separate atomic symbol is cumbersome because the number of such combinations is very large. This paper presents a two-pass algorithm for segmenting and decomposing Devanagari composite characters/symbols into their constituent symbols. The proposed algorithm makes extensive use of the structural properties of the script. In the first pass, words are segmented into easily separable characters/composite characters. Statistical information about the height and width of each separated box is then used to hypothesize whether a character box is composite. In the second pass, the hypothesized composite characters are segmented further. A recognition rate of 85 percent has been achieved on the segmented conjuncts. The algorithm is designed to segment a pair of touching characters.
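The first-pass idea described above (separate the easy boxes, then use box-size statistics to flag likely composites) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the empty-column splitting rule, the median-based width test, and the `width_factor` value are all assumptions made for illustration, and the sketch assumes a binary word image in which characters are already separated by empty columns (for example, after the header line has been removed). It uses only box width, whereas the abstract mentions both height and width.

```python
from statistics import median

def column_profile(image):
    """Count ink pixels in each column of a binary image (list of rows of 0/1)."""
    return [sum(col) for col in zip(*image)]

def first_pass_boxes(image):
    """Split a word image at empty columns.

    Returns a list of (start, end) column spans, with end exclusive.
    Hypothetical stand-in for the paper's first pass.
    """
    profile = column_profile(image)
    boxes, start = [], None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x                      # a box opens at the first inked column
        elif not ink and start is not None:
            boxes.append((start, x))       # a box closes at the next empty column
            start = None
    if start is not None:
        boxes.append((start, len(profile)))
    return boxes

def hypothesize_composites(boxes, width_factor=1.5):
    """Flag boxes much wider than the typical box as possible composites.

    The median-width rule and width_factor are assumptions, not values
    taken from the paper.
    """
    if not boxes:
        return []
    widths = [end - start for start, end in boxes]
    typical = median(widths)
    return [w > width_factor * typical for w in widths]

# Example: two narrow boxes followed by one wide box.
word = [
    [1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1],
]
spans = first_pass_boxes(word)
flags = hypothesize_composites(spans)
```

In a fuller pipeline, only the boxes flagged here would be handed to a second pass that attempts a further split of the touching or fused pair.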
Keywords: Devanagari script, Character/text recognition, Prototype construction, Character fusion, Character fragmentation, Character segmentation/decomposition
Review history: Received 9 December 1997, Accepted 10 April 2001, Available online 17 December 2001.
Official paper URL: https://doi.org/10.1016/S0031-3203(01)00081-4