CNN-Based RGB-D Salient Object Detection: Learn, Select, and Fuse

Authors: Hao Chen, Youfu Li, Yongjian Deng, Guosheng Lin

Abstract

The goal of this work is to present a systematic solution for RGB-D salient object detection, which addresses the following three aspects with a unified framework: modal-specific representation learning, complementary cue selection, and cross-modal complement fusion. To learn discriminative modal-specific features, we propose a hierarchical cross-modal distillation scheme, in which we use the progressive predictions from the well-learned source modality to supervise learning feature hierarchies and inference in the new modality. To better select complementary cues, we formulate a residual function to incorporate complements from the paired modality adaptively. Furthermore, a top-down fusion structure is constructed for sufficient cross-modal cross-level interactions. The experimental results demonstrate the effectiveness of the proposed cross-modal distillation scheme in learning from a new modality, the advantages of the proposed multi-modal fusion pattern in selecting and fusing cross-modal complements, and the generalization of the proposed designs in different tasks.
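The residual formulation for complementary cue selection can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `residual_fuse`, the choice of a 1x1 convolution followed by ReLU as the complement function, and the tensor shapes are all assumptions made for illustration. The only idea taken from the abstract is that features from the paired modality enter as an additive residual on top of the features of the primary modality.

```python
import numpy as np

def residual_fuse(rgb_feat, depth_feat, weight):
    """Return rgb_feat + g(depth_feat), where g is a hypothetical complement function.

    rgb_feat, depth_feat: arrays of shape (channels, height, width).
    weight: array of shape (channels, channels), acting as a 1x1 convolution.
    """
    # Assumed complement function g: per-pixel linear map over channels, then ReLU.
    complement = np.maximum(np.einsum("oc,chw->ohw", weight, depth_feat), 0.0)
    # Residual form: the RGB features pass through unchanged when the complement is zero.
    return rgb_feat + complement

rng = np.random.default_rng(0)
rgb = rng.standard_normal((4, 8, 8))
depth = rng.standard_normal((4, 8, 8))

# With zero weights the depth branch contributes nothing and the output equals the RGB input.
fused = residual_fuse(rgb, depth, np.zeros((4, 4)))
```

One property of the residual form, visible in the sketch, is that the fused output falls back to the single-modality features whenever the learned complement is zero, so the depth branch only has to model what the RGB branch is missing.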

Keywords: RGB-D, Salient object detection, Convolutional neural network, Cross-modal distillation

Paper URL: https://doi.org/10.1007/s11263-021-01452-0