Pose-Guided Inflated 3D ConvNet for action recognition in videos

Authors:

Highlights:

• Human action recognition in video is easily affected by complex backgrounds.

• Human pose is a high-level representation of human motion.

• Human pose features are used to capture the subtle cues of motion.

• Fusion of human pose, RGB and optical flow improves the performance of action recognition.
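The last highlight can be illustrated with a generic late-fusion sketch. This listing does not describe how the paper combines its streams, so everything below is an assumption for illustration only: the stream names, the equal default weights, the example logits, and the choice of averaging per-stream softmax probabilities are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits, weights=None):
    """Weighted average of per-stream class probabilities (late fusion).

    stream_logits maps a stream name to its list of class logits.
    weights optionally maps the same names to non-negative weights;
    equal weighting is assumed when it is omitted.
    """
    names = list(stream_logits)
    if weights is None:
        weights = {name: 1.0 for name in names}
    weight_sum = sum(weights[name] for name in names)
    num_classes = len(stream_logits[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(stream_logits[name])
        for i, p in enumerate(probs):
            fused[i] += weights[name] / weight_sum * p
    return fused


# Hypothetical logits for three action classes from each stream.
example = {
    "rgb": [2.0, 1.0, 0.1],
    "flow": [0.5, 2.5, 0.3],
    "pose": [0.2, 3.0, 0.1],
}
fused = fuse_streams(example)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this made-up example the RGB stream alone favors class 0, while the flow and pose streams favor class 1, so the fused prediction follows the two motion-oriented streams. Feature-level fusion (concatenating intermediate features before a shared classifier) is another common option and may be closer to what the keyword "Feature fusion" refers to.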

Abstract

Human action recognition in video is easily affected by complex backgrounds. Human pose is a high-level representation of human motion. Human pose features are used to capture the subtle cues of motion. Fusion of human pose, RGB and optical flow improves the performance of action recognition.

Keywords: Action recognition, Pose estimation, Spatial–temporal information, Feature fusion

Review history: Received 18 January 2020, Revised 15 August 2020, Accepted 6 December 2020, Available online 9 December 2020, Version of Record 13 December 2020.

Official paper URL: https://doi.org/10.1016/j.image.2020.116098