Spatio-temporal attention on manifold space for 3D human action recognition

Authors: Chongyang Ding, Kai Liu, Fei Cheng, Evgeny Belyaev

Abstract

Skeleton-based action recognition has become increasingly prevalent in computer vision because of its wide range of applications, and many approaches have been proposed to address the task. Among these approaches, manifold space is widely used to model the relative geometric relationships between different body parts in human skeletons. Existing studies treat all geometric relationships as equally important, so they cannot focus on the most significant information. In addition, the traditional attention mechanism is designed mostly for Euclidean space and is not applicable in manifold space. To address these issues, we propose a spatial and temporal attention mechanism on Lie groups for 3D human action recognition. We build our network architecture with a generalized attention mechanism that extends the scope of attention from traditional Euclidean space to manifold space. Our model learns to identify significant spatial features and temporal stages through attention modules that focus on discriminative transformation relationships between different rigid bodies within each frame and allocate different levels of attention to different frames. Extensive experiments on standard datasets demonstrate the effectiveness of the proposed network architecture.
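
As a rough illustration of attention applied to Lie group features, the sketch below weights frames of rotation data with a softmax. It is not the authors' architecture, which the abstract does not specify. It assumes each frame is summarized by a single relative rotation in SO(3), maps that rotation to the tangent space with the standard logarithm map, and scores frames with a hypothetical fixed vector `w` standing in for learned parameters. The helper names (`rot_z`, `so3_log`, `temporal_attention`) are likewise invented for this example.

```python
import math

def rot_z(angle):
    """Rotation about the z-axis, an element of SO(3), as a 3x3 nested list."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def so3_log(R):
    """Logarithm map SO(3) -> so(3), returned as an axis-angle 3-vector.

    Valid for rotation angles in [0, pi); the angle-pi case is not handled.
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    if theta < 1e-8:
        return [0.0, 0.0, 0.0]
    k = theta / (2.0 * math.sin(theta))
    return [k * (R[2][1] - R[1][2]),
            k * (R[0][2] - R[2][0]),
            k * (R[1][0] - R[0][1])]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention(rotations, w):
    """Pool per-frame tangent vectors using softmax attention weights.

    `w` is a hypothetical scoring vector (an assumption of this sketch);
    a trained model would learn its scoring function instead.
    """
    tangents = [so3_log(R) for R in rotations]
    scores = [sum(wi * vi for wi, vi in zip(w, v)) for v in tangents]
    weights = softmax(scores)
    pooled = [sum(a * v[i] for a, v in zip(weights, tangents))
              for i in range(3)]
    return weights, pooled

# Three frames with increasing rotation about the z-axis.
frames = [rot_z(0.1), rot_z(0.5), rot_z(1.0)]
weights, pooled = temporal_attention(frames, w=[0.0, 0.0, 2.0])
```

With this choice of `w`, frames with larger rotation about the z-axis receive larger weights, and `pooled` is the attention-weighted average of the tangent vectors. Working in the tangent space is one common way to apply Euclidean operations such as softmax-weighted averaging to manifold-valued data; whether the paper takes this route is not stated in the abstract.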

Keywords: Skeleton-based, Action recognition, Spatial attention, Temporal attention, Manifold space

Paper URL: https://doi.org/10.1007/s10489-020-01803-3