Natural scene text detection and recognition based on saturation-incorporated multi-channel MSER

Authors:

Highlights:

Abstract

Detecting and recognizing text in natural scenes is a key computer vision task for automatically identifying and classifying industrial resources that carry words or labels, and it has recently attracted considerable attention because of its potential to boost production efficiency. Maximally Stable Extremal Regions (MSER) is an effective method for detecting natural scene text. However, MSER and its variants do not adequately exploit color information such as the Red, Green, and Blue (RGB) channels. To address this problem, we present a detection method that combines the Canny edge detector, which focuses on edges, with an improved multi-channel MSER, which focuses on regions. The channels fed to MSER are the R, G, and B channels of the RGB color space and the S channel of the Hue, Saturation, and Intensity (HSI) color space. This multi-channel MSER brings color information fully into text detection without a significant increase in computational cost. In addition, a text recognition system is built on the proposed text detection method. The proposed method for text detection and recognition is evaluated on three datasets: ICDAR2013, ICDAR2015, and ICDAR2017-MLT. Experimental results show that our method outperforms state-of-the-art methods, achieving the highest Precision of 89.4%, Recall of 93.7%, and F-score of 91.5%, with nearly the shortest runtime of 0.76 s on an image of 512 × 512 pixels.
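The channel set described in the abstract (R, G, B plus the HSI saturation channel S) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `hsi_saturation` and `split_channels` are hypothetical, the image is assumed to be a nested list of 8-bit RGB tuples, and S is computed with the standard HSI definition S = 1 - 3·min(R, G, B)/(R + G + B), which the abstract does not spell out.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255 (assumed input format)


def hsi_saturation(r: int, g: int, b: int) -> float:
    """Standard HSI saturation, S = 1 - 3*min(R,G,B)/(R+G+B), in [0, 1]."""
    total = r + g + b
    if total == 0:
        # Pure black: saturation is undefined; 0 is used here by convention.
        return 0.0
    return 1.0 - 3.0 * min(r, g, b) / total


def split_channels(image: List[List[Pixel]]) -> Dict[str, List[List[int]]]:
    """Build the four single-channel 8-bit planes R, G, B, and S."""
    planes: Dict[str, List[List[int]]] = {"R": [], "G": [], "B": [], "S": []}
    for row in image:
        planes["R"].append([p[0] for p in row])
        planes["G"].append([p[1] for p in row])
        planes["B"].append([p[2] for p in row])
        # Scale saturation from [0, 1] to 0..255 so it matches the RGB planes.
        planes["S"].append([round(255 * hsi_saturation(*p)) for p in row])
    return planes


if __name__ == "__main__":
    demo = [[(255, 0, 0), (10, 10, 10)]]  # one saturated red pixel, one gray pixel
    for name, plane in split_channels(demo).items():
        print(name, plane)
```

Each of the four planes could then be passed separately to an MSER detector (for example OpenCV's `cv2.MSER_create().detectRegions`), with the resulting candidate regions merged and combined with Canny edges. How the paper merges and filters those candidates is not stated in the abstract, so that step is left out here.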

Keywords: Text detection and recognition, Canny edge detector, Multi-channel MSER

Review history: Received 26 November 2021, Revised 10 May 2022, Accepted 10 May 2022, Available online 18 May 2022, Version of Record 30 May 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.109040