EAR: Efficient action recognition with local-global temporal aggregation

Authors:

Highlights:

• We propose an efficient action recognition network called EAR.

• EAR takes a joint view of both local and global aggregation for temporal modeling.

• An efficient motion cue called Persistence of Appearance (PA) is designed to capture local motion boundaries.

• A modulation module called VA is proposed to capture global semantic hints.

• Experiments on six benchmarks demonstrate that EAR achieves competitive results.

Keywords: Efficient action recognition, Local-global temporal aggregation, Motion representation, Persistence of appearance

Review history: Received 3 August 2021, Revised 9 October 2021, Accepted 14 October 2021, Available online 25 October 2021, Version of Record 10 November 2021.

Official paper URL: https://doi.org/10.1016/j.imavis.2021.104329