Deep structural information fusion for 3D object detection on LiDAR–camera system

Abstract

3D object detection on a LiDAR–camera system is a challenging task because 3D LiDAR points and 2D RGB images have different data representations. In this paper, we argue that geometrical consistency between local 3D and 2D regions is helpful for the regression task in 3D object detection, and we propose a 3D–2D consistent feature. It is based on hand-crafted 3D and 2D descriptors, generates a primary structure feature, and performs stably in outdoor scenes. Because material features can be used to distinguish different objects, we propose the material coefficients ratio (MCR), which is based on the Lambertian model, to generate a primary semantic feature that benefits the classification task in 3D object detection. To take advantage of both the 3D–2D consistent feature and MCR, we propose deep 3D–2D structural information fusion (SIF) for 3D object detection. SIF provides an attentional structural voxel feature that serves as the input to LiDAR voxel-based 3D object detectors. SIF is a lightweight, effective, and explainable module. Extensive experiments on an outdoor 3D object detection dataset demonstrate that SIF improves the performance of both single-stage and multi-stage LiDAR voxel-based 3D detectors.

Review history: Received 25 February 2021, Revised 20 August 2021, Accepted 4 October 2021, Available online 9 October 2021, Version of Record 19 November 2021.

Paper URL: https://doi.org/10.1016/j.cviu.2021.103295