Divergent-convergent attention for image captioning
Authors:
Highlights:
• A novel divergent-convergent attention model (DCA) is proposed for image captioning.
• Fine-grained semantic and visual components are explored in divergent observation.
• Multimodal components are gradually converged with visual-semantic interaction.
• Divergent observation and convergent attention facilitate descriptive captions.
• DCA achieves state-of-the-art performance for image captioning on MS COCO.
Keywords: Image Captioning, Divergent Observation, Convergent Attention
Review history: Received 3 April 2020, Revised 14 September 2020, Accepted 1 March 2021, Available online 9 March 2021, Version of Record 19 March 2021.
Official paper URL: https://doi.org/10.1016/j.patcog.2021.107928