Self attention mechanism of bidirectional information enhancement

Authors: Qibin Li, Nianmin Yao, Jian Zhao, Yanan Zhang

Abstract

Self-attention mechanisms are widely used in relation extraction, emotion classification, and other tasks, because they can capture relevance information across a wide span of the text. Existing self-attention mechanisms use soft attention: a softmax function generates a dense attention matrix. This has two drawbacks. First, when the sentence is long, the weight assigned to important information becomes too small. Second, softmax assumes by default that every element has a positive impact on the result, so the model cannot extract information that has a negative effect. We use a hard attention mechanism, namely a sparse attention matrix, to improve the existing self-attention model and fully extract both the positive and the negative information in the text. Our model not only enhances the extraction of positive information but also fills the gap left by traditional attention matrices, whose entries cannot be negative. We evaluated our model on three tasks and seven datasets. The experimental results show that it outperforms the traditional self-attention model and, on some tasks, state-of-the-art models.
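
The two drawbacks of soft attention named in the abstract can be illustrated with a small sketch. This is not the authors' model: the function names, the toy scores, and the use of top-k selection as the sparsification rule are all assumptions chosen for illustration, since the abstract does not specify how the sparse attention matrix is built. The sketch only shows that a dense softmax weight shrinks as the sequence grows, that every softmax weight is strictly positive, and that restricting attention to a few positions restores weight on the important token.

```python
import math


def softmax(scores):
    """Dense (soft) attention weights: every entry is strictly positive."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_sparse_softmax(scores, k):
    """Hypothetical hard-attention variant: softmax over the k largest
    scores only; all other positions get weight exactly 0."""
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    kept_weights = softmax([scores[i] for i in keep])
    weights = [0.0] * len(scores)
    for i, w in zip(keep, kept_weights):
        weights[i] = w
    return weights


def toy_scores(n):
    """One important token (score 2.0) followed by n - 1 neutral tokens."""
    return [2.0] + [0.0] * (n - 1)


short_dense = softmax(toy_scores(10))
long_dense = softmax(toy_scores(200))
long_sparse = top_k_sparse_softmax(toy_scores(200), k=5)

print("important-token weight, 10 tokens, dense :", short_dense[0])
print("important-token weight, 200 tokens, dense:", long_dense[0])
print("important-token weight, 200 tokens, top-5:", long_sparse[0])
print("smallest dense weight (never <= 0)       :", min(long_dense))
```

Note that top-k masking addresses only the dilution problem. Its weights are still non-negative, so it does not by itself reproduce the negative-information extraction that the abstract claims for the proposed model.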

Keywords: Bi-directional information, Hard attention, Self attention

Paper URL: https://doi.org/10.1007/s10489-021-02492-2