SCUT-HCCDoc: A new benchmark dataset of handwritten Chinese text in unconstrained camera-captured documents

Authors:

Highlights:

• We propose a new large-scale SCUT-HCCDoc dataset for handwritten Chinese text detection, recognition and spotting in natural images.

• We statistically analyze the samples and annotations of SCUT-HCCDoc at the image, text, and character levels.

• We use state-of-the-art methods for baseline evaluation of text line detection, text line recognition, and end-to-end text spotting.

• We provide a comprehensive analysis of the dataset's challenges and of the benchmark tests.

Keywords: Document analysis and recognition, Handwritten Chinese text recognition, Handwritten Chinese text detection, Benchmark dataset

Review history: Received 2 November 2019, Revised 24 March 2020, Accepted 20 July 2020, Available online 28 July 2020, Version of Record 1 August 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107559