Divergent-convergent attention for image captioning

Authors:

Highlights:

• A novel divergent-convergent attention model (DCA) is proposed for image captioning.

• Fine-grained semantic and visual components are explored in divergent observation.

• Multimodal components are gradually converged with visual-semantic interaction.

• Divergent observation and convergent attention facilitate descriptive captions.

• DCA achieves state-of-the-art performance for image captioning on MS COCO.

Keywords: Image Captioning, Divergent Observation, Convergent Attention

Review history: Received 3 April 2020, Revised 14 September 2020, Accepted 1 March 2021, Available online 9 March 2021, Version of Record 19 March 2021.

Official paper URL: https://doi.org/10.1016/j.patcog.2021.107928