Out of vocabulary word detection and recovery in Arabic handwritten text recognition
Authors:
Highlights:
• A novel two-step method for detecting and recovering out-of-vocabulary (OOV) words is proposed.
• The proposed method is generic and independent of the recognition engine.
• The proposed method uses several sub-lexical models to improve the detection step.
• The recovery process relies on dynamic lexicons built from large text corpora.
• The proposed method significantly improves the recognition results.
Keywords: Arabic handwriting recognition, Out-of-vocabulary detection and recovery, Static lexicon, Dynamic lexicon, Statistical language model, Deep learning, Multi-dimensional long short-term memory network
Review history: Received 10 July 2018, Revised 18 April 2019, Accepted 1 May 2019, Available online 1 May 2019, Version of Record 10 May 2019.
Official paper URL: https://doi.org/10.1016/j.patcog.2019.05.003