3D semantic segmentation based on spatial-aware convolution and shape completion for augmented reality applications

Authors:

Highlights:

Abstract

3D semantic segmentation of indoor scenes is a popular research topic in computer vision, and many applications require knowing exactly which category each point in a scene belongs to. Benefiting from the development of deep learning, many voxel-based and point-based neural networks have been proposed for this segmentation problem. However, most of them do not fully exploit the spatial structure of the scene. Current voxel-based sparse convolutional neural networks extract 3D features efficiently, but they assume that features in empty space are zero, which discards information about the spatial structure. In this paper, we propose a system that semantically segments an entire indoor scene from its colored point cloud. Exploiting the sparsity of spatial data, we design a novel spatial-aware sparse convolution operation: we encode the occupancy of space by objects as an additional feature and use a self-attention mechanism to aggregate features effectively. In addition, we introduce a completion network that refines the results of the segmentation network, so that each object in the scene is fitted to a more reasonable and complete shape. Combining these two components, we build an accurate semantic segmentation network that recovers the semantic information of the entire scene. In the experiments, we perform quantitative and qualitative analyses on two public datasets and compare our results against state-of-the-art methods to demonstrate the superiority of our approach. We also examine our models under different configurations to verify the effectiveness of the proposed components. Finally, the semantic segmentation model is integrated into a real-world augmented reality application to demonstrate its usefulness. We expect the proposed 3D scene semantic segmentation system to provide accurate and fast results for practical applications.
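To make the occupancy-as-feature idea concrete, the following is a minimal sketch, not the authors' implementation (which operates on sparse voxel structures). It appends a binary occupancy channel to a dense voxel grid so that a 3D convolution can distinguish genuinely empty space from occupied voxels whose features happen to be zero. The class name, tensor shapes, and the dense-grid simplification are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class OccupancyAwareConv3d(nn.Module):
    """Sketch of a convolution that sees occupancy as an extra feature.

    Hypothetical illustration only: the paper's spatial-aware sparse
    convolution works on sparse voxels; here a dense grid is used for
    simplicity.
    """

    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
        super().__init__()
        # +1 input channel for the binary occupancy mask.
        self.conv = nn.Conv3d(in_channels + 1, out_channels,
                              kernel_size, padding=kernel_size // 2)

    def forward(self, features: torch.Tensor, occupancy: torch.Tensor) -> torch.Tensor:
        # features:  (B, C, D, H, W) voxel features, zero in empty voxels
        # occupancy: (B, 1, D, H, W) binary mask, 1 where a point exists
        x = torch.cat([features, occupancy], dim=1)
        return self.conv(x)

# Toy usage on a random 16^3 grid with 4 feature channels.
feats = torch.randn(1, 4, 16, 16, 16)
occ = (feats.abs().sum(dim=1, keepdim=True) > 0).float()
layer = OccupancyAwareConv3d(4, 8)
out = layer(feats, occ)
print(out.shape)  # torch.Size([1, 8, 16, 16, 16])
```

With the extra channel, a zero feature vector inside an object is no longer indistinguishable from empty space, which is the information loss the abstract attributes to standard sparse convolutions.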

Keywords:

Article history: Received 7 April 2021; Revised 24 August 2022; Accepted 29 August 2022; Available online 5 September 2022; Version of Record 20 September 2022.

Paper URL: https://doi.org/10.1016/j.cviu.2022.103550