Multi-type decision fusion network for visual Q&A
Authors:
Highlights:
• We divide questions into object-level and relation-level types, which depend on different information for answering
• We present the Multi-Type Decision Fusion Network, which involves two networks to answer object- and relation-level questions
• We propose a Balance Gated Network to balance the contributions of the two networks and output the final prediction
• We achieve competitive performance against state-of-the-art methods on VQA-CP v2, VQA-v2 and VG
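The gating idea in the highlights can be illustrated with a minimal sketch. It is not the authors' implementation: the listing does not specify how the Balance Gated Network computes its weight, so the scalar sigmoid gate, the convex mixing of softmax outputs, and the names `gated_fusion`, `object_logits`, `relation_logits` and `gate_score` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn raw answer scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(object_logits, relation_logits, gate_score):
    """Mix two answer distributions with a scalar gate (hypothetical form).

    gate_score stands in for whatever a gating network would output;
    here it is simply a number supplied by the caller.
    """
    if len(object_logits) != len(relation_logits):
        raise ValueError("both branches must score the same answer set")
    g = sigmoid(gate_score)  # weight given to the object-level branch
    p_obj = softmax(object_logits)
    p_rel = softmax(relation_logits)
    # Convex combination, so the result is still a distribution.
    return [g * a + (1.0 - g) * b for a, b in zip(p_obj, p_rel)]


if __name__ == "__main__":
    fused = gated_fusion([2.0, 0.0, -1.0], [0.0, 3.0, 0.0], gate_score=0.0)
    print(fused)
```

With `gate_score=0.0` the two branches are weighted equally; a large positive score lets the object-level branch dominate, and a large negative score favors the relation-level branch.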
Keywords: Visual question answering, Multi-type question, Scene graph
Review history: Received 30 October 2020, Revised 28 May 2021, Accepted 10 August 2021, Available online 24 August 2021, Version of Record 30 August 2021.
Official paper URL: https://doi.org/10.1016/j.imavis.2021.104281