Action Transformer: A self-attention model for short-time pose-based human action recognition
Authors:
Highlights:
• We study the application of the Transformer encoder to 2D pose-based HAR and propose the novel AcT model.
• We introduce MPOSE2021, a dataset for real-time, short-time HAR. Unlike other publicly available datasets, its constrained number of time steps encourages the development of genuinely real-time methodologies that perform HAR with low latency and high throughput.
• We conduct extensive experimentation on model performance and latency to verify the suitability of AcT for real-time applications.
Keywords: Human action recognition, Deep learning, Computer vision, Transformer
Article history: Received 2 August 2021; Revised 26 November 2021; Accepted 4 December 2021; Available online 15 December 2021; Version of Record 20 December 2021.
DOI: https://doi.org/10.1016/j.patcog.2021.108487