EML-NET: An Expandable Multi-Layer NETwork for saliency prediction

Authors:

Abstract

Advanced saliency systems are built on Convolutional Neural Networks (CNNs), and prior knowledge of objectness is considered crucial for saliency detection. However, we show that the use of objectness may also limit the power of CNNs on images that do not contain a salient object. In addition, one previous study has shown that applying a deeper CNN model may not improve performance; the reason could be the limited training data. In this work, we investigate the effect of prior knowledge from other domains and propose a multi-modality system. Our work shows that the depth of the model still plays an important role in saliency detection when the training data is large enough. Doing this in a sophisticated manner can be complex, and can also result in unwieldy networks or produce competing objectives that are hard to balance. For scalability, our multi-modality system is trained in an almost end-to-end, piece-wise fashion. The encoder and decoder components are trained separately to deal with the complexity tied to the computational paradigm and the required space. Therefore, our system can easily be extended to include a variety of prior knowledge for saliency detection. We also propose a combined saliency loss based on modifications of Pearson correlation and normalized scanpath saliency. Our experiments show that the combined loss can train a CNN model more comprehensively for saliency detection. We denote our expandable multi-layer network as EML-NET; our method achieves state-of-the-art results on the public saliency benchmarks SALICON, MIT300 and CAT2000.
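
To make the idea of a combined saliency loss concrete, the following is a minimal sketch in plain Python. It is not the authors' formulation: the abstract does not state how Pearson correlation (CC) and normalized scanpath saliency (NSS) are modified or weighted, so the `1 - CC` term, the negated NSS term, the weights `w_cc` and `w_nss`, and all function names are assumptions made for illustration. Saliency maps are passed as flattened lists.

```python
import math

EPS = 1e-8  # small constant to avoid division by zero (assumed value)


def _mean(xs):
    return sum(xs) / len(xs)


def pearson_cc(pred, target):
    """Pearson correlation between two flattened saliency maps."""
    mp, mt = _mean(pred), _mean(target)
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, target))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_t = sum((t - mt) ** 2 for t in target)
    return cov / (math.sqrt(var_p * var_t) + EPS)


def nss(pred, fixations):
    """Mean of the standardized prediction at fixated pixels.

    `fixations` is a flattened binary map (1 = fixated pixel).
    """
    m = _mean(pred)
    std = math.sqrt(sum((p - m) ** 2 for p in pred) / len(pred))
    scores = [(p - m) / (std + EPS) for p, f in zip(pred, fixations) if f]
    return _mean(scores) if scores else 0.0


def combined_loss(pred, density, fixations, w_cc=1.0, w_nss=1.0):
    """Hypothetical combination: lower is better for both terms.

    CC is turned into a loss as (1 - CC); NSS is negated because a
    higher NSS means a better prediction.
    """
    cc_term = 1.0 - pearson_cc(pred, density)
    nss_term = -nss(pred, fixations)
    return w_cc * cc_term + w_nss * nss_term
```

In this sketch a prediction that matches the ground-truth density and scores highly at fixated pixels receives a lower loss than one that does not, which is the behavior a combined objective of this kind is meant to reward.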

Keywords: Saliency detection, Scalability, Loss function

Review history: Received 19 January 2020, Accepted 26 January 2020, Available online 3 February 2020, Version of Record 6 March 2020.

Paper URL: https://doi.org/10.1016/j.imavis.2020.103887