Complementarity-aware cross-modal feature fusion network for RGB-T semantic segmentation

Authors:

Highlights:

• We propose a Complementarity-aware Encoder (CAE), which captures discriminative cross-modal features by modeling and fusing complementary information from RGB and thermal features.

• We design a Three-Path Fusion and Supervision module to further fuse features from both modalities and to supervise the training of our model.

• Our fully-equipped model, CCFFNet, achieves new state-of-the-art performance in RGB-T semantic segmentation, and its CAE is plug-and-play.
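The highlights describe fusing RGB and thermal features by weighting their complementary information. The paper's exact CAE design is not given in this listing, so the sketch below is only a generic, illustrative form of complementarity-weighted fusion: a per-channel sigmoid gate, derived from both modalities, decides how much each modality contributes (the function name and gating rule are assumptions, not the authors' method).

```python
import numpy as np

def gated_fusion(rgb_feat, thermal_feat):
    """Illustrative complementarity-weighted fusion (hypothetical sketch,
    not the paper's CAE). Inputs are feature maps of shape (C, H, W)."""
    # Global average pooling over spatial dims -> per-channel descriptors.
    rgb_desc = rgb_feat.mean(axis=(1, 2))
    th_desc = thermal_feat.mean(axis=(1, 2))
    # Sigmoid gate favours the modality with the stronger channel response.
    gate = 1.0 / (1.0 + np.exp(-(rgb_desc - th_desc)))
    gate = gate[:, None, None]  # broadcast gate over H and W
    # Convex combination of the two modalities, per channel.
    return gate * rgb_feat + (1.0 - gate) * thermal_feat
```

In a real encoder the gate would typically be produced by learned layers rather than a fixed difference of pooled descriptors; this sketch only conveys the idea that fusion weights adapt to the relative informativeness of each modality.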


Keywords: RGB-T, Cross-modal fusion, Multi-supervision, Semantic segmentation

Article history: Received 22 March 2022, Revised 11 June 2022, Accepted 29 June 2022, Available online 7 July 2022, Version of Record 12 July 2022.

DOI: https://doi.org/10.1016/j.patcog.2022.108881