Exploring region relationships implicitly: Image captioning with visual relationship attention

Authors:

Highlights:

• A novel visual relationship attention that explores visual relationships between regions in an implicit way.

• A multi-head parallel attention model can implicitly explore visual relationships under spatial constraints.

• Visual relationship attention concentrates on relationship-level alignment between caption words and visual regions.

Keywords: Image captioning, Visual relationship attention, Relationship-level attention parallel attention mechanism, Learned spatial constraint

Review history: Received 12 June 2020, Revised 24 November 2020, Accepted 16 February 2021, Available online 5 March 2021, Version of Record 16 March 2021.

Official paper URL: https://doi.org/10.1016/j.imavis.2021.104146