Visual question answering model based on visual relationship detection

Authors:

Highlights:

• Judgment of the interrelations between objects is added to the traditional VQA visual task.

• The principle of word vector similarity is introduced into the judgment of these interrelations.

• An attention mechanism guided by question words is added to direct attention to specific image regions.

Keywords: Visual question answering, Appearance features, Relationship predicate, Word vector similarity

Article history: Received 22 May 2019, Revised 26 August 2019, Accepted 17 September 2019, Available online 27 September 2019, Version of Record 3 October 2019.

Official paper URL: https://doi.org/10.1016/j.image.2019.115648