Deep model with neighborhood-awareness for text tagging
Authors:
Abstract
In recent years, many deep learning approaches have been proposed for text tagging. However, these works generally neglect the neighborhood effect, which may help improve prediction accuracy. To address this, we present a neighborhood-aware deep model for text tagging (NATT). A neural component that combines a bi-directional recurrent neural network with a self-attention mechanism is first used as the text encoder to encode the target document into a single feature vector. Then, the k-nearest-neighbor documents of the target document are identified and encoded into feature vectors one by one with the same text encoder. An independent attention module aggregates these neighboring documents into a separate feature vector that represents the features of the neighborhood. Finally, the two feature vectors are fused and matched against the embedding vectors of tags. To optimize the NATT model, we build the objective function with a pairwise hinge loss and develop a neighborhood-aware negative sampling strategy to form the training data. Experimental results on four datasets demonstrate that NATT outperforms several state-of-the-art neural models. Additionally, NATT is economical: it achieves its best results with fewer training epochs and a smaller number of nearest neighbors.
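The pipeline described in the abstract (aggregate the neighbors with attention, fuse with the target vector, score against tag embeddings with a pairwise hinge loss) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not specify the attention scoring function, the fusion operator, or the margin, so dot-product attention, element-wise averaging, and a margin of 1.0 are assumptions made here for illustration; all function names are hypothetical. The text encoder and the neighborhood-aware negative sampling strategy are left out because the abstract gives no details on them, so the feature vectors are taken as given.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_neighbors(target_vec, neighbor_vecs):
    """Attention-weighted sum of the neighbor vectors.

    Assumption: attention scores are dot products with the target vector.
    """
    weights = softmax([dot(target_vec, n) for n in neighbor_vecs])
    dim = len(target_vec)
    return [sum(w * n[i] for w, n in zip(weights, neighbor_vecs))
            for i in range(dim)]


def fuse(target_vec, neighborhood_vec):
    """Fuse the two feature vectors.

    Assumption: element-wise averaging stands in for the unspecified fusion.
    """
    return [(a + b) / 2 for a, b in zip(target_vec, neighborhood_vec)]


def pairwise_hinge_loss(doc_vec, pos_tag_vec, neg_tag_vec, margin=1.0):
    """Zero once the positive tag outscores the negative tag by at least `margin`."""
    pos_score = dot(doc_vec, pos_tag_vec)
    neg_score = dot(doc_vec, neg_tag_vec)
    return max(0.0, margin - pos_score + neg_score)


# Toy example with 2-dimensional vectors (values chosen arbitrarily).
target = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
neighborhood = aggregate_neighbors(target, neighbors)
fused = fuse(target, neighborhood)
loss = pairwise_hinge_loss(fused, pos_tag_vec=[1.0, 0.0], neg_tag_vec=[0.0, 1.0])
```

In this toy case the neighbor that resembles the target receives the larger attention weight, so the fused vector stays close to the target while still carrying some neighborhood signal. The hinge loss remains positive until the positive tag's score exceeds the negative tag's score by the margin.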
Keywords: Neighborhood-aware, Negative sampling, Deep neural networks, Text tagging
Review history: Received 12 December 2019, Revised 5 March 2020, Accepted 7 March 2020, Available online 10 March 2020, Version of Record 16 April 2020.
Paper URL: https://doi.org/10.1016/j.knosys.2020.105750