Tree-based data augmentation and mutual learning for offline handwritten mathematical expression recognition

Authors:

Highlights:

• We propose a tree-based multi-level data augmentation strategy that alleviates the shortage of original annotated data; it is one of the critical technologies behind our champion system for the OffRaSHME20 competition.

• We introduce a novel tree-based mutual learning method that deeply integrates the string decoder and the tree decoder in both the training and inference stages, so that the advantages of the two decoder types complement each other.

• Our system significantly outperforms other state-of-the-art results on both the OffRaSHME20 dataset and the CROHME14/16/19 datasets.

Keywords: Tree-based data augmentation, Tree-based mutual learning, Encoder-decoder, Offline handwritten mathematical expression recognition

Review history: Received 20 August 2021, Revised 13 May 2022, Accepted 16 July 2022, Available online 19 July 2022, Version of Record 1 August 2022.

Official paper URL: https://doi.org/10.1016/j.patcog.2022.108910