GAR-Net: A Graph Attention Reasoning Network for conversation understanding

作者:

Highlights:

摘要

Conversation understanding, as a necessary step for many applications, including social media, education, and argumentation mining, has been gaining increasing attention from the research community. Reasoning over long-term dependent contextual information is the key to utterance-level conversation understanding. Aiming to emphasize the importance of contextual reasoning, an end-to-end graph attention reasoning network which takes both word-level and utterance-level context into concern is proposed. To be specific, a word-level encoder with a novel convolution gate is first built to filter out irrelevant contextual information. Based on the representation extracted by word-level encoder, a graph reasoning network is designed to utilize the context among utterance-level, where the entire conversation is treated as a fully connected graph, utterances as nodes, and attention scores between utterances as edges. The proposed model is a general framework for conversation understanding tasks, which can be flexibly applied on most conversation datasets without changing the network architecture. Furthermore, we revise the tensor fusion network by adding a residual connection to explore cross-modal conversation understanding. For uni-modal scene (text modality), experiments show that the proposed method surpasses current state-of-the-art methods on emotion recognition, intent classification, and dialogue act identification tasks. For cross-modal scenes (text modality and audio modality), experiments on IEMOCAP and MELD datasets show that the proposed method can use cross-modal information to improve model performance.2

论文关键词:Conversation understanding,Convolution gate,Graph reasoning,Cross-modal

论文评审过程:Received 20 February 2021, Revised 29 September 2021, Accepted 24 December 2021, Available online 6 January 2022, Version of Record 29 January 2022.

论文官网地址:https://doi.org/10.1016/j.knosys.2021.108055