TFM: A Triple Fusion Module for Integrating Lexicon Information in Chinese Named Entity Recognition
Authors: Haitao Liu, Jihua Song, Weiming Peng, Jingbo Sun, Xianwei Xin
Abstract
Because of the characteristics of the Chinese writing system, character-based Chinese named entity recognition models ignore the word information in sentences, which harms their performance. Many recent works try to alleviate this problem by integrating lexicon information into character-based models. These models, however, either simply concatenate word embeddings or rely on complex structures that lead to low efficiency. Furthermore, they treat word information as the only resource a lexicon provides, so the value of the lexicon is not fully explored. In this work, we observe another neglected kind of information, namely the position of a character within a word, which helps identify character meanings. To fuse character, word, and character position information, we modify the key-value memory network and propose a triple fusion module, termed TFM. TFM is neither limited to simple concatenation nor burdened by complicated computation, and it is compatible with general sequence labeling models. Experimental evaluations show that our model achieves superior performance, with F1-scores on Resume, Weibo, and MSRA of 96.19%, 71.12%, and 95.63%, respectively.
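To make the idea of fusing character, word, and character position information more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that position is encoded with B/M/E/S tags (begin, middle, end, single), that matched words act as memory keys and position tags as memory values, and that the character vector is the query of a softmax-weighted read. The function names `match_lexicon` and `memory_read`, the `max_len` parameter, and the toy lexicon are all hypothetical.

```python
import math


def match_lexicon(sentence, lexicon, max_len=4):
    """For each character, collect (word, tag) pairs for every lexicon word
    that covers it. Tags (an assumed scheme): B=begin, M=middle, E=end,
    S=single-character word."""
    matches = [[] for _ in sentence]
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            for i in range(start, end):
                if end - start == 1:
                    tag = "S"
                elif i == start:
                    tag = "B"
                elif i == end - 1:
                    tag = "E"
                else:
                    tag = "M"
                matches[i].append((word, tag))
    return matches


def memory_read(query, keys, values):
    """Key-value memory read: softmax over query-key dot products, then a
    weighted sum of the values. Assumes all vectors share one dimension."""
    if not keys:
        return [0.0] * len(query)
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


# Toy example: a hypothetical lexicon and an ambiguous sentence.
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
matches = match_lexicon("南京市长江大桥", lexicon)
```

In this sketch the character 长 is matched by three words and sits at a different position in each, which is the kind of signal the abstract describes as useful for identifying character meanings. In a full model the keys and values would be learned embeddings of the matched words and position tags, and the memory output would be combined with the character representation before sequence labeling. How exactly TFM combines them is not specified in the abstract.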
Keywords: Chinese named entity recognition, Lexicon information, Information fusion, Natural language processing
Paper URL: https://doi.org/10.1007/s11063-022-10768-y