MuLTReNets: Multilingual text recognition networks for simultaneous script identification and handwriting recognition

Authors:

Highlights:

• A novel multi-task system, named MuLTReNets, jointly optimizes script identification and handwriting recognition for multilingual handwritten text recognition.

• The MuLTReNets are extended into two versions: MuLTReNetV1, for multilingual text recognition with a merged alphabet, and MuLTReNetV2, for cascaded script identification and unilingual text recognition with joint training.

• An auto-weighter maintains the balance among datasets of different scripts.

• Performance is superior to that of cascade systems and unilingual recognition systems.

• Experimental analysis provides a better understanding of the system.
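
The two versions named in the highlights differ in how the label space is organized. The sketch below is only an illustration of that distinction, not the paper's networks: `merge_alphabets` shows what a merged alphabet (the V1 idea) could look like as a plain union of per-script character sets, and `route_by_script` shows the cascade idea (V2) of picking a unilingual recognizer from script-identification scores. The function names, the `<blank>` symbol, and the toy recognizers are all assumptions made for this example.

```python
from typing import Callable, Dict, List

BLANK = "<blank>"  # assumed reserved symbol; not taken from the paper


def merge_alphabets(alphabets: Dict[str, List[str]]) -> List[str]:
    """Build one label list from several per-script alphabets.

    Scripts are visited in sorted name order so the result is deterministic;
    characters shared by several scripts appear only once.
    """
    merged = [BLANK]
    seen = {BLANK}
    for script in sorted(alphabets):
        for ch in alphabets[script]:
            if ch not in seen:
                seen.add(ch)
                merged.append(ch)
    return merged


def route_by_script(
    script_scores: Dict[str, float],
    recognizers: Dict[str, Callable[[str], str]],
    line_image: str,
) -> str:
    """Pick the highest-scoring script and run its unilingual recognizer."""
    script = max(script_scores, key=script_scores.get)
    return recognizers[script](line_image)


if __name__ == "__main__":
    # Toy alphabets; real alphabets would be read from the training labels.
    labels = merge_alphabets({"latin": ["a", "b"], "digits": ["1", "a"]})
    print(labels)

    # Hypothetical stand-ins for trained unilingual recognizers.
    toy = {
        "latin": lambda x: "latin:" + x,
        "bangla": lambda x: "bangla:" + x,
    }
    print(route_by_script({"latin": 0.2, "bangla": 0.8}, toy, "line-01"))
```

In a hard cascade like `route_by_script`, a wrong script decision sends the line to the wrong recognizer, which is the kind of error propagation that joint training of script identification and recognition is meant to reduce.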

Keywords: MuLTReNets, Auto-weighter, Separable MDLSTM, Multilingual handwritten text recognition, Multi-task learning

Review history: Received 21 October 2019, Revised 23 May 2020, Accepted 20 July 2020, Available online 23 July 2020, Version of Record 30 July 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107555