A motion-aware ConvLSTM network for action recognition

Authors: Mahshid Majd, Reza Safabakhsh

Abstract

Human action recognition is an emerging goal of computer vision, with applications such as video surveillance and human-computer interaction. Despite many attempts to develop deep architectures that learn the spatio-temporal features of video, hand-crafted optical flow is still an important part of the recognition process. To bring motion features deep inside the learning process, we propose a spatio-temporal video recognition network in which a motion-aware long short-term memory module estimates the motion flow while extracting spatio-temporal features. The module incorporates an optical flow estimator based on kernelized cross correlation. The proposed network can be used without any extra learning process, and there is no need to pre-compute and store the optical flow. Extensive experiments on two action recognition benchmarks verify the effectiveness of the proposed approach.

Keywords: Human action recognition, Deep learning, Convolutional networks, LSTM, ConvLSTM

Paper URL: https://doi.org/10.1007/s10489-018-1395-8