Pose-Guided Inflated 3D ConvNet for action recognition in videos

Authors:

Highlights:

• Human action recognition in video is easily affected by complex backgrounds.

• Human pose is a high-level representation of human motion.

• Human pose features are used to capture the subtle cues of motion.

• Fusion of human pose, RGB and optical flow improves the performance of action recognition.

Abstract

• Human action recognition in video is easily affected by complex backgrounds.

• Human pose is a high-level representation of human motion.

• Human pose features are used to capture the subtle cues of motion.

• Fusion of human pose, RGB and optical flow improves the performance of action recognition.

Keywords: Action recognition, Pose estimation, Spatial–temporal information, Feature fusion

Review history: Received 18 January 2020, Revised 15 August 2020, Accepted 6 December 2020, Available online 9 December 2020, Version of Record 13 December 2020.

Official paper URL: https://doi.org/10.1016/j.image.2020.116098