Learning motion representation for real-time spatio-temporal action localization
Authors:
Highlights:
• We propose a novel method that localizes human actions in videos spatio-temporally by integrating an optical flow subnet. The new architecture performs action localization and optical flow estimation jointly in an end-to-end manner.
• The interaction between the action detector and the flow subnet enables the detector to learn parameters from appearance and motion simultaneously, and guides the flow subnet to compute task-specific optical flow.
• We exploit an effective fusion method that fuses appearance and optical flow deep features in a multi-scale fashion. The multi-scale temporal and spatial features are combined interactively to model a more discriminative spatio-temporal action representation.
• The presented method achieves real-time computation for the first time while using both RGB appearance and optical flow. It outperforms the state-of-the-art method [1] by 1.3% in accuracy.
Keywords: Spatio-Temporal Action Localization, Real-time Computation, Optical Flow Sub-network, Pyramid Hierarchical Fusion
Review history: Received 20 July 2019, Revised 17 February 2020, Accepted 24 February 2020, Available online 27 February 2020, Version of Record 11 March 2020.
Official paper URL: https://doi.org/10.1016/j.patcog.2020.107312