Exploring region relationships implicitly: Image captioning with visual relationship attention
Authors:
Highlights:
• A novel visual relationship attention that explores visual relationships between regions in an implicit way.
• A multi-head parallel attention model can implicitly explore visual relationships under spatial constraints.
• Visual relationship attention concentrates on relationship-level alignment between caption words and visual regions.
Keywords: Image captioning, Visual relationship attention, Relationship-level attention parallel attention mechanism, Learned spatial constraint
Review history: Received 12 June 2020, Revised 24 November 2020, Accepted 16 February 2021, Available online 5 March 2021, Version of Record 16 March 2021.
Official paper page: https://doi.org/10.1016/j.imavis.2021.104146