Joint architecture and knowledge distillation in CNN for Chinese text recognition

Authors:

Highlights:

• We propose a guideline to distill the architecture and knowledge of pre-trained standard CNNs simultaneously for fast compression and acceleration (a generic distillation-loss sketch follows these highlights).

• The effectiveness is first verified on offline handwritten Chinese text recognition (HCTR). Compared with the baseline CNN, the corresponding compact network reduces the computational cost by >10× and the model size by >8× with negligible accuracy loss.

• Furthermore, the proposed method is successfully used to reduce the resource consumption of mainstream backbone networks on CTW and MNIST.
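
The highlights above do not spell out the training objective. Purely as an illustrative sketch of the knowledge-distillation component (not the paper's exact joint architecture-and-knowledge distillation procedure), a standard temperature-softened distillation loss in PyTorch might look as follows; the function name, temperature, and weighting factor are assumptions made here for illustration.

```python
# Generic Hinton-style knowledge distillation sketch -- illustrative only,
# not the paper's joint architecture-and-knowledge distillation method.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    """Blend a soft-target KL term (teacher -> student) with the usual
    hard-label cross-entropy, as in standard knowledge distillation."""
    # Soft targets: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: ordinary cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In such a setup the compact (student) network would be trained against the softened outputs of the pre-trained standard CNN (teacher); how the architecture itself is distilled is described in the full paper, not in this summary.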

Keywords: Convolutional neural network, Acceleration and compression, Architecture and knowledge distillation, Offline handwritten Chinese text recognition

Article history: Received 16 June 2020, Revised 10 October 2020, Accepted 23 October 2020, Available online 25 October 2020, Version of Record 30 October 2020.

Paper link: https://doi.org/10.1016/j.patcog.2020.107722