Attention with structure regularization for action recognition

Authors:

Highlights:

Abstract

Recognizing human actions in video is an important task with a wide range of applications. Recently, motivated by findings in human visual perception, there have been numerous attempts to introduce attention mechanisms into action recognition systems. However, it has been empirically observed that an attention mechanism implemented with a free-form attention mask often generates ineffective, distracted attention regions caused by overfitting, which limits the benefit of attention mechanisms for action recognition. By exploiting a block-structured sparsity prior on attention regions, this paper proposes an ℓ2,1-norm group sparsity regularization for learning structured attention masks. Built upon this regularized attention module, an attention-based recurrent network is developed for action recognition. Experimental results on two benchmark datasets show that the proposed method noticeably improves the accuracy of attention masks, which in turn yields a performance gain in action recognition.
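
To illustrate why an ℓ2,1-norm penalty favors block-structured attention, the following is a minimal sketch, not the paper's implementation. It assumes a 2-D attention mask partitioned into non-overlapping square blocks; the function name `l21_block_norm`, the block size, and the toy masks are hypothetical choices made for illustration, since the abstract does not specify how groups are defined.

```python
import numpy as np

def l21_block_norm(mask, block):
    """Sum of the l2 norms of non-overlapping block x block groups of a 2-D mask.

    Assumption: groups are square, non-overlapping spatial blocks.
    """
    h, w = mask.shape
    if h % block or w % block:
        raise ValueError("mask dimensions must be divisible by the block size")
    # Reshape to (h/block, block, w/block, block); axes 1 and 3 index within a block.
    groups = mask.reshape(h // block, block, w // block, block)
    # l2 norm inside each group, then l1 (plain sum) across groups.
    group_norms = np.sqrt((groups ** 2).sum(axis=(1, 3)))
    return float(group_norms.sum())

# Two toy masks with the same total attention weight (4.0) on a 4x4 grid.
concentrated = np.zeros((4, 4))
concentrated[:2, :2] = 1.0  # all weight inside a single 2x2 block

scattered = np.zeros((4, 4))
for r, c in [(0, 0), (0, 2), (2, 0), (2, 2)]:
    scattered[r, c] = 1.0  # one unit of weight in each of the four blocks

penalty_concentrated = l21_block_norm(concentrated, 2)
penalty_scattered = l21_block_norm(scattered, 2)
```

In this toy case the concentrated mask incurs a smaller penalty than the scattered one even though both carry the same total weight, which is the sense in which adding such a term to a training loss discourages attention that is spread thinly over many regions.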

Keywords:

Review history: Received 29 August 2018, Revised 7 May 2019, Accepted 6 August 2019, Available online 12 August 2019, Version of Record 4 September 2019.

Paper URL: https://doi.org/10.1016/j.cviu.2019.102794