Multi-scale visual attention for attribute disambiguation in zero-shot learning

Authors:

Highlights:

• We propose an embedding-based attentional model for zero-shot image recognition.

• Our method combines multi-scale visual attention (VA) and attribute selection (AS).

• The two parts, VA and AS, are optimized in a unified supervised framework.

• Our method can discover the relationships between visual and semantic spaces.

• Experimental results show our method performs better than other related methods.

Keywords: Zero-shot image recognition, Visual attention, Attribute disambiguation

Review history: Received 27 March 2021, Revised 6 November 2021, Accepted 20 December 2021, Available online 4 January 2022, Version of Record 19 January 2022.

Official paper URL: https://doi.org/10.1016/j.image.2021.116614