RGB-T salient object detection via CNN feature and result saliency map fusion

Authors: Chang Xu, Qingwu Li, Mingyu Zhou, Qingkai Zhou, Yaqin Zhou, Yunpeng Ma

Abstract

Thermal infrared sensors have unique advantages under the conditions of insufficient illumination, complex scenarios, or occluded appearances. RGB-T salient object detection methods integrate the complementary advantages of visual and thermal modalities to capture salient objects more accurately. Considering the characteristics of visual and thermal images, we combine CNN feature and result saliency map fusion methods to achieve RGB-T salient object detection. First, a two-stream encoder-decoder network is proposed to handle the different saliency cues within RGB-T images. Specifically, the global attention module introduces the complementary saliency cues within thermal images to visual images, thereby ensuring the consistency of salient object locations. Subsequently, the two-stream decoder module gradually fuses the high-level salient object location cues with low-level detail saliency cues to obtain single-modality saliency maps. Then, saliency maps are fused and refined by the proposed result saliency map fusion method to achieve the final saliency map with high precision. In this way, the salient object is segmented with the fine boundary, and the noise inside the salient object is effectively suppressed. Experimental results demonstrate the effectiveness of each component within the CNN feature and result saliency map fusion methods. The proposed method facilitates desirable complementation for RGB-T images and performs favorably against state-of-the-art methods, especially in the challenges of low illumination, cluttered background, and low contrast.
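The final stage described above, where two single-modality saliency maps are merged into one, can be illustrated with a very small sketch. The abstract does not specify the fusion rule, so the code below swaps in a plain pixel-wise weighted average purely for illustration; it is not the authors' result saliency map fusion method. The function names `normalize` and `fuse_saliency` and the weight `w_rgb` are hypothetical.

```python
def normalize(sal):
    """Min-max scale a 2-D saliency map (a list of rows) to the range [0, 1]."""
    flat = [v for row in sal for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant map carries no saliency contrast; return all zeros.
        return [[0.0 for _ in row] for row in sal]
    return [[(v - lo) / (hi - lo) for v in row] for row in sal]


def fuse_saliency(rgb_sal, thermal_sal, w_rgb=0.5):
    """Pixel-wise weighted average of two normalized saliency maps.

    Assumption: a fixed scalar weight stands in for the paper's
    (unspecified here) fusion and refinement procedure.
    """
    a = normalize(rgb_sal)
    b = normalize(thermal_sal)
    return [
        [w_rgb * x + (1.0 - w_rgb) * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


# Toy 2x2 maps standing in for the outputs of the two decoder streams.
rgb_map = [[0, 2], [4, 8]]
thermal_map = [[8, 8], [0, 8]]
fused = fuse_saliency(rgb_map, thermal_map)
```

Normalizing each map before averaging keeps one modality from dominating only because its raw values span a wider range.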

Keywords: RGB-T salient object detection, CNN attention module, Encoder-decoder network, Result fusion, Bezier interpolation

Paper URL: https://doi.org/10.1007/s10489-021-02984-1