Region-sequence based six-stream CNN features for general and fine-grained human action recognition in videos

Authors:

Highlights:

• We propose a new method for fine-grained human action recognition in videos.

• Our method uses coarse pose estimation to crop video frames and obtain a sequence of human-body foreground patches.

• Our method focuses on the human lower-arm area to enhance the effective pixels for fine-grained actions.

• We propose an encoding method to process the features from the last pooling layer of the CNN.
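
The highlights above describe cropping region sequences from frames and encoding per-frame CNN features. The following is only an illustrative sketch of that general idea, not the authors' implementation: the function names, the square patch shape, the `half_size` parameter, and the mean-plus-max temporal pooling are all assumptions made for this example, and the region centers are assumed to come from some external pose estimator.

```python
import numpy as np


def crop_patch(frame, center_xy, half_size):
    """Crop a square patch around center_xy, clamped to the frame bounds.

    Hypothetical helper: center_xy would come from a pose estimator
    (for example a body center or a lower-arm keypoint).
    """
    h, w = frame.shape[:2]
    cx, cy = int(center_xy[0]), int(center_xy[1])
    x0, x1 = max(0, cx - half_size), min(w, cx + half_size)
    y0, y1 = max(0, cy - half_size), min(h, cy + half_size)
    return frame[y0:y1, x0:x1]


def region_sequence(frames, centers, half_size):
    """Build a patch sequence: one cropped region per video frame."""
    return [crop_patch(f, c, half_size) for f, c in zip(frames, centers)]


def encode_sequence(features):
    """Pool per-frame feature vectors of shape (T, D) into one (2*D,) vector.

    Assumed encoding for illustration: temporal mean concatenated with
    temporal max. The paper's actual encoding is not given in this listing.
    """
    features = np.asarray(features, dtype=np.float64)
    return np.concatenate([features.mean(axis=0), features.max(axis=0)])
```

In a full pipeline, each patch would be resized and passed through a CNN, and the last-pooling-layer activations of every frame would be stacked into the `(T, D)` array given to `encode_sequence`.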

Keywords: Human pose, Action recognition, Video understanding

Review history: Received 4 May 2017, Revised 31 October 2017, Accepted 19 November 2017, Available online 21 November 2017, Version of Record 21 December 2017.

Official paper URL: https://doi.org/10.1016/j.patcog.2017.11.026