CNN-Based RGB-D Salient Object Detection: Learn, Select, and Fuse
Authors: Hao Chen, Youfu Li, Yongjian Deng, Guosheng Lin
Abstract
The goal of this work is to present a systematic solution for RGB-D salient object detection, which addresses the following three aspects with a unified framework: modal-specific representation learning, complementary cue selection, and cross-modal complement fusion. To learn discriminative modal-specific features, we propose a hierarchical cross-modal distillation scheme, in which we use the progressive predictions from the well-learned source modality to supervise learning feature hierarchies and inference in the new modality. To better select complementary cues, we formulate a residual function to incorporate complements from the paired modality adaptively. Furthermore, a top-down fusion structure is constructed for sufficient cross-modal cross-level interactions. The experimental results demonstrate the effectiveness of the proposed cross-modal distillation scheme in learning from a new modality, the advantages of the proposed multi-modal fusion pattern in selecting and fusing cross-modal complements, and the generalization of the proposed designs in different tasks.
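Two of the ideas in the abstract can be illustrated with a toy sketch: supervising each level of a new-modality network with the source modality's prediction at that level, and adding cues from the paired modality through a residual term. The abstract gives no formulas, so everything below is an assumption made for illustration and not the authors' implementation. That includes the function names, the use of binary cross-entropy as the per-level loss, and the elementwise weights standing in for a learned residual function.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two flat lists of probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def hierarchical_distillation_loss(teacher_preds, student_preds):
    """Sum of per-level losses (assumed form).

    teacher_preds[i] is the source-modality prediction at level i and
    serves as the soft target for student_preds[i] in the new modality.
    """
    return sum(bce(s, t) for s, t in zip(student_preds, teacher_preds))


def residual_fuse(own_feat, paired_feat, weights):
    """fused = own + R(paired), with R as a hypothetical elementwise scaling.

    The identity path keeps the modality's own features intact; the
    residual term adds only what the paired modality contributes.
    """
    return [o + w * p for o, p, w in zip(own_feat, paired_feat, weights)]


# Toy usage: two levels of two-pixel saliency predictions.
teacher = [[0.9, 0.1], [0.8, 0.2]]
student = [[0.7, 0.3], [0.6, 0.4]]
loss = hierarchical_distillation_loss(teacher, student)

# With zero weights the residual path is switched off and the
# modality's own features pass through unchanged.
fused = residual_fuse([1.0, 2.0], [0.5, 0.5], [0.0, 0.0])
```

In a real network the weights would be produced by learned layers, and the fusion would be applied at each level of a top-down decoder. The sketch only shows the additive form that lets the paired modality be ignored when it carries no useful complement.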
Keywords: RGB-D, Salient object detection, Convolutional neural network, Cross-modal distillation
Paper URL: https://doi.org/10.1007/s11263-021-01452-0