mSODANet: A network for multi-scale object detection in aerial images using hierarchical dilated convolutions

Authors:

Highlights:

• A network for multi-scale object detection in aerial images using hierarchical dilated convolutions (mSODANet) is proposed to detect objects of various scales in the visual scene and to capture effective scene contextual information.

• A bi-directional feature aggregation module (BFAM) is leveraged to incorporate dense multi-scale contextual features.

• The proposed approach is demonstrated on three challenging aerial imagery datasets, namely VisDrone2019, DOTA (OBB & HBB), and NWPU-VHR10.
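
The highlights refer to hierarchical dilated convolutions as the means of capturing context at several scales. The minimal sketch below is not the mSODANet implementation. It only illustrates the general idea that stacking dilated convolutions with growing dilation rates widens the receptive field without adding kernel weights. The function names `dilated_conv1d` and `receptive_field`, the 1-D setting, the kernel size of 3, and the dilation rates 1, 2, 4 are all assumptions chosen for illustration; the source does not state the rates the network uses.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid (unpadded) 1-D dilated convolution in cross-correlation form."""
    # Distance between the first and last kernel tap on the input.
    span = (len(kernel) - 1) * dilation
    out = []
    for i in range(len(signal) - span):
        # Each kernel weight reads an input sample `dilation` steps apart.
        out.append(sum(w * signal[i + k * dilation] for k, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolution layers."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical dilation rates (an assumption, not taken from the paper).
rates = [1, 2, 4]

# Receptive field after stacking the first 1, 2, and 3 layers.
fields = [receptive_field(3, rates[:n]) for n in range(1, len(rates) + 1)]
print(fields)  # [3, 7, 15]

# A single dilated layer: taps at offsets 0, 2, 4 of the input.
print(dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 1, 1], 2))  # [9, 12, 15]
```

With three 3-tap layers the receptive field grows from 3 to 15 samples, which is the property that makes dilation attractive when small and large objects appear in the same aerial scene.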

Keywords: Multi-scale object detection, Contextual features, Dilated convolutions, Aerial images

Review history: Received 4 October 2021, Revised 22 December 2021, Accepted 21 January 2022, Available online 23 January 2022, Version of Record 29 January 2022.

Official paper URL: https://doi.org/10.1016/j.patcog.2022.108548