GAN for vision, KG for relation: A two-stage network for zero-shot action recognition

Authors:

Highlights:

• We propose FGGA, a two-stage network for zero-shot action recognition that combines a feature generation network with a graph attention network and analyzes the problem at both the sample level and the classifier level.

• FGGA trains its generator with a conditional Wasserstein GAN augmented by additional loss terms. The generator maps action representations from the word-vector space to the visual feature space, so the synthesized features of unseen classes can be fed directly to typical classifiers.

• FGGA integrates an attention mechanism with a GCN to dynamically express the relationships between action classes and their related objects.
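
The attention-weighted aggregation over related objects mentioned in the last highlight can be illustrated with a minimal sketch. It is not the FGGA implementation: the node names, the toy 2-D embeddings, and the function `attention_aggregate` are hypothetical, and scaled dot-product similarity stands in for whatever learned attention scoring the paper uses.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_aggregate(features, neighbors, node):
    """Return (attention weights, aggregated vector) for `node`.

    Assumption: scores are scaled dot products between the node and each
    neighbor; a trained model would use learned parameters instead.
    """
    nbrs = neighbors[node]
    dim = len(features[node])
    scale = math.sqrt(dim)
    scores = [dot(features[node], features[n]) / scale for n in nbrs]
    weights = softmax(scores)
    # Weighted sum of neighbor features, one coordinate at a time.
    agg = [sum(w * features[n][d] for w, n in zip(weights, nbrs))
           for d in range(dim)]
    return dict(zip(nbrs, weights)), agg

# Hypothetical toy graph: one action class linked to two object nodes.
features = {
    "playing_guitar": [1.0, 0.0],
    "guitar": [0.9, 0.1],
    "ball": [0.0, 1.0],
}
neighbors = {"playing_guitar": ["guitar", "ball"]}

weights, agg = attention_aggregate(features, neighbors, "playing_guitar")
print(weights, agg)
```

In this toy graph the object whose embedding is closer to the action embedding receives the larger weight, which is the sense in which attention lets the action-object relationship vary per class rather than being fixed by the graph structure alone.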

Keywords: Action recognition, Zero-shot learning, Generative adversarial networks, Graph convolution network

Review history: Received 11 February 2020, Revised 22 January 2022, Accepted 29 January 2022, Available online 3 February 2022, Version of Record 6 February 2022.

Paper URL: https://doi.org/10.1016/j.patcog.2022.108563