Multi-type decision fusion network for visual Q&A

Authors:

Highlights:

• We divide questions into object-level and relation-level types, which depend on different information for answering.

• We present the Multi-Type Decision Fusion Network, which uses two networks to answer object-level and relation-level questions.

• We propose a Balance Gated Network to balance the contributions of the two networks and output the final prediction.

• We achieve competitive performance against state-of-the-art methods on VQA-CP v2, VQA-v2 and VG.

Abstract

We divide questions into object-level and relation-level types, which depend on different information for answering. We present the Multi-Type Decision Fusion Network, which uses two networks to answer object-level and relation-level questions. We propose a Balance Gated Network to balance the contributions of the two networks and output the final prediction. We achieve competitive performance against state-of-the-art methods on VQA-CP v2, VQA-v2 and VG.

Keywords: Visual question answering, Multi-type question, Scene graph

Review history: Received 30 October 2020, Revised 28 May 2021, Accepted 10 August 2021, Available online 24 August 2021, Version of Record 30 August 2021.

Paper URL: https://doi.org/10.1016/j.imavis.2021.104281