Multi-stream CNN: Learning representations based on human-related regions for action recognition
Authors:
Highlights:
• Presenting a multi-stream CNN architecture to incorporate multiple complementary features trained in appearance and motion networks.
• Demonstrating that combining full-frame, human body, and motion-salient body part regions improves recognition performance.
• Proposing methods to detect the actor and motion-salient body part precisely.
• Verifying that high-quality optical flow is critical for learning accurate video representations for action recognition.
Keywords: Convolutional Neural Network, Action recognition, Multi-Stream, Motion salient region
Review history: Received 13 May 2017, Revised 10 January 2018, Accepted 24 January 2018, Available online 10 February 2018, Version of Record 10 February 2018.
Official paper URL: https://doi.org/10.1016/j.patcog.2018.01.020