Text information extraction in images and video: a survey

Authors:

Abstract

Text data present in images and video contain useful information for automatic annotation, indexing, and structuring of images. Extraction of this information involves detection, localization, tracking, extraction, enhancement, and recognition of the text from a given image. However, variations of text due to differences in size, style, orientation, and alignment, as well as low image contrast and complex backgrounds, make the problem of automatic text extraction extremely challenging. While comprehensive surveys of related problems such as face detection, document analysis, and image and video indexing can be found, the problem of text information extraction is not well surveyed. A large number of techniques have been proposed to address this problem, and the purpose of this paper is to classify and review these algorithms, discuss benchmark data and performance evaluation, and point out promising directions for future research.

Keywords: Text information extraction, Text detection, Text localization, Text tracking, Text enhancement, OCR

Article history: Received 3 April 2003, Revised 28 October 2003, Accepted 28 October 2003, Available online 21 January 2004.

Paper URL: https://doi.org/10.1016/j.patcog.2003.10.012