Deformable attention-oriented feature pyramid network for semantic segmentation
Authors:
Abstract
In computer vision, pyramid features can significantly improve network performance. However, misaligned semantic information and the limited scale of small-scale features lead to an imbalance in feature contributions, which severely limits the performance of the feature pyramid network. To address the decline in model efficiency caused by this imbalance, we propose a deformable attention-oriented feature pyramid network (DAFPN). Unlike previous models, which focus solely on the semantic information between features, DAFPN uses a deformable attention mechanism to model the relationships among multiple features and then merges them during pyramid feature fusion. Building on DAFPN, we further propose a fully transformer-based semantic segmentation head that achieves high performance and good scalability. Comparisons on multiple backbones show that the proposed model outperforms the baseline model. Under the same conditions, our method improves mIoU by 14% over the baseline semantic segmentation model.
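The abstract reports its results in mIoU (mean intersection over union). As a reference for how that metric is commonly computed, here is a minimal sketch in plain Python. The function name `mean_iou`, the flattened per-pixel label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the paper's own evaluation code is not shown in the source and may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in the prediction or the ground truth.

    `pred` and `target` are flattened per-pixel class indices in [0, num_classes).
    Illustrative helper, not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Four pixels, two classes: IoU(0) = 1/2, IoU(1) = 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Averaging per-class IoU rather than pooling all pixels gives small classes the same weight as large ones, which is why the metric is sensitive to how well fine-scale structures are segmented.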
Keywords: Semantic segmentation, Feature fusion, Self-attention
Article history: Received 14 April 2022, Revised 21 July 2022, Accepted 4 August 2022, Available online 19 August 2022, Version of Record 26 August 2022.
Paper URL: https://doi.org/10.1016/j.knosys.2022.109623