Pose-Guided Inflated 3D ConvNet for action recognition in videos
Authors:
Highlights:
• Human action recognition in video is easily affected by complex backgrounds.
• Human pose is a high-level representation of human motion.
• Human pose features are used to capture the subtle cues of motion.
• Fusion of human pose, RGB and optical flow improves the performance of action recognition.
Keywords: Action recognition, Pose estimation, Spatial–temporal information, Feature fusion
Review history: Received 18 January 2020, Revised 15 August 2020, Accepted 6 December 2020, Available online 9 December 2020, Version of Record 13 December 2020.
Official paper URL: https://doi.org/10.1016/j.image.2020.116098