3D Point Convolutional Network for Dense Scene Flow Estimation

Authors: Xuezhi Xiang, Rokia Abdein, Mingliang Zhai, Ning Lv

Abstract

Scene flow estimation, which represents the complete 3D motion of objects in a dynamic scene, is a crucial component of many scene understanding tasks. Most existing scene flow estimation approaches are based on a joint learning framework that casts scene flow estimation as dense prediction of optical flow and stereo matching. However, these approaches must reconstruct 3D motion from optical flow and disparity computed on 2D stereo image pairs, which makes the estimation process indirect. Recently, FlowNet3D attempted to learn scene flow directly from 3D point clouds, adopting element-wise max-pooling to aggregate features from different points. Nevertheless, the max-pooling operation can only capture the strongest activation across a local or global region, which may discard useful detailed and contextual information. In particular, for dense estimation tasks, the ability to transfer information gradually from coarse to fine layers is important. To address this problem, we investigate a new deep architecture, a 3D point convolutional network, to learn scene flow from 3D point clouds. This architecture uses a multi-layer perceptron (MLP) to approximate the weight function and applies a density scale to re-weight the learned weight functions for each convolutional filter, which makes the network permutation-invariant and translation-invariant in 3D space and is beneficial for feature aggregation. Extensive experiments are conducted on the FlyingThings3D and KITTI scene flow datasets and demonstrate the effectiveness of our proposed approach. Our algorithm achieves state-of-the-art performance on the FlyingThings3D and KITTI datasets.
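The following is a minimal sketch, not the authors' code, of the point convolution idea described in the abstract: an MLP approximates the continuous weight function over relative 3D offsets, and an inverse-density scale re-weights each neighbor before aggregation, yielding an operator that is permutation-invariant over neighbors and translation-invariant in 3D space. All names, shapes, and the summation-based aggregation are assumptions for illustration.

```python
# Hypothetical sketch of a density-re-weighted point convolution layer (PyTorch).
import torch
import torch.nn as nn

class PointConv(nn.Module):
    def __init__(self, in_channels, out_channels, hidden=32):
        super().__init__()
        # MLP approximating the weight function: maps a relative 3D offset
        # to a per-channel weight vector.
        self.weight_net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, in_channels),
        )
        self.linear = nn.Linear(in_channels, out_channels)

    def forward(self, xyz, feats, neighbor_idx, density):
        # xyz:          (B, N, 3)  point coordinates
        # feats:        (B, N, C)  per-point features
        # neighbor_idx: (B, N, K)  indices of the K nearest neighbors of each point
        # density:      (B, N)     estimated local point density (e.g. via KDE)
        B, N, K = neighbor_idx.shape
        batch = torch.arange(B, device=xyz.device).view(B, 1, 1)

        neigh_xyz = xyz[batch, neighbor_idx]       # (B, N, K, 3)
        neigh_feat = feats[batch, neighbor_idx]    # (B, N, K, C)
        rel_xyz = neigh_xyz - xyz.unsqueeze(2)     # relative offsets -> translation invariance

        w = self.weight_net(rel_xyz)               # (B, N, K, C) learned convolution weights
        scale = (1.0 / density[batch, neighbor_idx].clamp(min=1e-8)).unsqueeze(-1)
        agg = (w * scale * neigh_feat).sum(dim=2)  # density-re-weighted sum over neighbors
        return self.linear(agg)                    # (B, N, out_channels)
```

Summing over neighbors (rather than max-pooling, as in FlowNet3D) lets every neighbor contribute to the aggregated feature, which matches the abstract's motivation of preserving detailed and contextual information for dense prediction.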

Keywords: Scene flow estimation, 3D point clouds, Point convolutional network, Supervised deep learning


Paper URL: https://doi.org/10.1007/s11063-021-10673-w