Emotional conversation generation with heterogeneous graph neural network

Authors:

Abstract

A successful emotional conversation system depends on sufficient perception and appropriate expression of emotions. In a real-life conversation, humans first instinctively perceive emotions from multi-source information, including the emotion flow hidden in the dialogue history, facial expressions, audio, and the personalities of the speakers. They then convey suitable emotions according to their own personalities. However, these multiple types of information are insufficiently exploited in the field of emotional conversation. To address this issue, we propose a heterogeneous graph-based model for emotional conversation generation. First, we design a Heterogeneous Graph-Based Encoder that represents the conversation content (i.e., the dialogue history, its emotion flow, facial expressions, audio, and speakers' personalities) with a heterogeneous graph neural network and then predicts suitable emotions for feedback. Second, we employ an Emotion-Personality-Aware Decoder to generate a response that is relevant to the conversation context and expresses appropriate emotions, taking as inputs the encoded graph representations, the emotions predicted by the encoder, and the personality of the current speaker. Both automatic and human evaluations show that our method can effectively perceive emotions from multi-source knowledge and generate a satisfactory response. Furthermore, when built on the up-to-date text generator BART, our model still achieves consistent improvements and significantly outperforms some existing state-of-the-art models.

Keywords: Heterogeneous graph neural network, Emotional conversation generation, Multi-source knowledge

Review history: Received 2 April 2021, Revised 14 February 2022, Accepted 28 March 2022, Available online 1 April 2022, Version of Record 6 April 2022.

Official paper URL: https://doi.org/10.1016/j.artint.2022.103714