Asymmetric 3D Convolutional Neural Networks for action recognition

Authors:

Highlights:

• We propose asymmetric one-directional 3D convolutions to approximate the traditional 3D convolution. The asymmetric 3D convolutions significantly reduce the number of parameters and the computational cost.

• To improve the feature learning capacity of asymmetric 3D convolutional layers, we propose local 3D convolutional networks, MicroNets, which incorporate multi-scale 3D convolutional branches to handle convolutional features at different scales in videos.

• Based on the MicroNets, we design an asymmetric 3D convolutional deep model that outperforms traditional 3D-CNN models in both effectiveness and efficiency.

• We propose a multi-source enhanced input that further reduces the computational cost by avoiding the need to train two deep networks separately.

Based on the above technical innovations, our model outperforms all the traditional 3D-CNN models in both effectiveness and efficiency, and is comparable with recent state-of-the-art action recognition methods on two of the most challenging benchmarks, the UCF-101 and HMDB-51 datasets.
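The parameter saving claimed in the first highlight can be illustrated with a rough count. The sketch below is not the authors' implementation: it assumes that a k x k x k kernel is replaced by a cascade of three one-directional layers with kernels (k, 1, 1), (1, k, 1) and (1, 1, k), and the channel widths (64 in, 64 out) and the helper names `conv3d_params` and `asymmetric_params` are hypothetical choices made only for illustration.

```python
def conv3d_params(c_in, c_out, kernel, bias=True):
    """Count the learnable parameters of one 3D convolution layer."""
    kt, kh, kw = kernel
    weights = c_in * c_out * kt * kh * kw
    return weights + (c_out if bias else 0)


def asymmetric_params(c_in, c_out, k, bias=True):
    """Count the parameters of three stacked one-directional convolutions.

    Assumption (not taken from the source): the layers are applied in
    sequence, so only the first one sees c_in input channels.
    """
    shapes = [(k, 1, 1), (1, k, 1), (1, 1, k)]
    total = 0
    channels = c_in
    for shape in shapes:
        total += conv3d_params(channels, c_out, shape, bias)
        channels = c_out
    return total


# Hypothetical layer sizes, chosen only to make the comparison concrete.
full = conv3d_params(64, 64, (3, 3, 3))
asym = asymmetric_params(64, 64, 3)
print(f"full 3x3x3: {full} parameters")
print(f"asymmetric cascade: {asym} parameters")
print(f"reduction factor: {full / asym:.2f}")
```

Under these assumptions the weight count grows with 3k rather than k^3, so the saving increases with kernel size. A different arrangement of the one-directional layers (for example parallel branches that are summed) would change the exact numbers.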

Keywords: Asymmetric 3D convolution, MicroNets, 3D-CNN, Action recognition

Review history: Received 1 October 2017, Revised 2 June 2018, Accepted 22 July 2018, Available online 24 July 2018, Version of Record 6 August 2018.

Paper URL: https://doi.org/10.1016/j.patcog.2018.07.028