Multi-scale Multi-attention Network for Moiré Document Image Binarization

Abstract

In this paper, we propose a Multi-scale Multi-attention Network (MsMa-Net) to binarize document images contaminated by moiré patterns from camera-captured screens. Given a contaminated image, MsMa-Net first learns to distinguish clean features from contaminated ones at different spatial scales via a Multi-scale feature extraction submodule (Ms-sub). This preserves detailed text information as far as possible while giving a preliminary suppression of the moiré patterns. The resulting multi-scale features are then adaptively interwoven by the proposed Multi-attention submodule (Ma-sub) at the channel level, the spatial level, and the correlation level. By modelling these relationships among multi-scale features, Ma-sub further highlights text content and suppresses moiré patterns, yielding clean demoiréd document images. The demoiréd images are then passed to the proposed Binarization submodule (Bi-sub), which produces the final high-quality binarized document images. In addition, because little data is available for the moiré document image binarization task, we create a new Moiré Document Image (MoDI) dataset for training and evaluating the proposed model. Extensive experiments demonstrate that MsMa-Net achieves state-of-the-art performance on several available datasets and on the MoDI dataset.
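
The data flow described above (multi-scale features, then attention reweighting, then binarization) can be illustrated with a toy NumPy sketch. This is not the authors' network: the learned convolutions are replaced by fixed average pooling, the attention weights are parameter-free sigmoids, the correlation-level attention is omitted, and Bi-sub is replaced by a global mean threshold. All function names and the scale set `(1, 2, 4)` are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def avg_pool(img, k):
    """Downsample an (H, W) image by averaging k x k blocks (H and W divisible by k)."""
    h, w = img.shape
    return img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def upsample(img, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    return np.repeat(np.repeat(img, k, axis=0), k, axis=1)

def multi_scale_features(img, scales=(1, 2, 4)):
    """Stand-in for Ms-sub: one smoothed view per scale, stacked as (C, H, W)."""
    return np.stack([upsample(avg_pool(img, k), k) for k in scales])

def channel_attention(feats):
    """Reweight each channel by the sigmoid of its global average (no learned weights)."""
    w = sigmoid(feats.mean(axis=(1, 2)))
    return feats * w[:, None, None]

def spatial_attention(feats):
    """Reweight each pixel by the sigmoid of the cross-channel mean."""
    m = sigmoid(feats.mean(axis=0))
    return feats * m[None, :, :]

def binarize(feats):
    """Stand-in for Bi-sub: fuse channels by averaging, then threshold at the global mean."""
    fused = feats.mean(axis=0)
    return (fused > fused.mean()).astype(np.uint8)

def toy_pipeline(img):
    """Multi-scale features -> channel and spatial attention -> binary map."""
    feats = multi_scale_features(img)
    feats = spatial_attention(channel_attention(feats))
    return binarize(feats)

# Synthetic 8x8 input: left half 0, right half 1.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
out = toy_pipeline(img)
```

On this synthetic input the output is an 8 x 8 binary map that separates the two halves. A real model would learn the feature extractors and attention weights from data such as the MoDI dataset rather than using the fixed operations shown here.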

Keywords: Moiré patterns, Document Image Binarization, Multi-scale Multi-attention Network

Article history: Received 14 January 2020, Revised 15 June 2020, Accepted 21 October 2020, Available online 4 November 2020, Version of Record 12 November 2020.

Official paper URL: https://doi.org/10.1016/j.image.2020.116046