Cross-modal knowledge reasoning for knowledge-based visual question answering
Authors:
Highlights:
• Using multiple knowledge graphs from the visual, semantic and factual views to depict the multimodal knowledge.
• A memory-based recurrent model for multi-step knowledge reasoning over graph-structured multimodal knowledge.
• Good interpretability to reveal the knowledge selection mode from different modalities.
• Significant improvement over state-of-the-art approaches on three benchmark datasets.
Keywords: Cross-modal knowledge reasoning, Multimodal knowledge graphs, Compositional reasoning module, Knowledge-based visual question answering, Explainable reasoning
Review history: Received 12 March 2020, Revised 13 May 2020, Accepted 21 July 2020, Available online 22 July 2020, Version of Record 27 July 2020.
Official paper URL: https://doi.org/10.1016/j.patcog.2020.107563