A multi-label text classification method via dynamic semantic representation model and deep neural network

Authors: Tianshi Wang, Li Liu, Naiwen Liu, Huaxiang Zhang, Long Zhang, Shanshan Feng

Abstract

The growing number of new words and text categories calls for more accurate and robust classification methods. In this paper, we propose a novel multi-label text classification method that combines a dynamic semantic representation model with a deep neural network (DSRM-DNN). DSRM-DNN first uses a word embedding model and a clustering algorithm to select semantic words. The selected words are then designated as the elements of DSRM-DNN and quantified by a weighted combination of word attributes. Finally, we construct a text classifier by combining a deep belief network with a back-propagation neural network. During classification, low-frequency words and new words are re-expressed in terms of the existing semantic words under a sparse constraint. We evaluate DSRM-DNN on RCV1-v2, Reuters-21578, EUR-Lex, and Bookmarks. Experimental results show that our method outperforms state-of-the-art methods.
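
The abstract does not specify how new words are re-expressed under the sparse constraint, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each semantic word has a unit-length embedding and uses plain matching pursuit (a swapped-in, generic sparse coding technique) to approximate a new word's embedding with at most a few semantic words. The vocabulary, the toy 3-dimensional vectors, and the names `matching_pursuit` and `max_atoms` are all hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(target, atoms, max_atoms=2, tol=1e-8):
    """Greedy sparse code of `target` over unit-norm `atoms`.

    Returns (coeffs, residual), where coeffs maps atom index to weight
    and contains at most `max_atoms` entries.
    """
    residual = list(target)
    coeffs = {}
    for _ in range(max_atoms):
        # Pick the atom most correlated with the current residual.
        scores = [dot(residual, a) for a in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[best]) < tol:
            break  # nothing left to explain
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        # Remove the part of the residual explained by the chosen atom.
        residual = [r - scores[best] * a
                    for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Hypothetical semantic words with toy embeddings (assumed, not from the paper).
semantic_words = ["finance", "sports", "politics"]
atoms = [normalize(v) for v in ([1.0, 0.0, 0.0],
                                [0.0, 1.0, 0.0],
                                [0.0, 0.0, 1.0])]

# Toy embedding of an unseen word, to be re-expressed by the semantic words.
new_word_vec = [0.8, 0.6, 0.0]

coeffs, residual = matching_pursuit(new_word_vec, atoms, max_atoms=2)
for idx, weight in sorted(coeffs.items()):
    print(semantic_words[idx], round(weight, 3))
```

Here `max_atoms` plays the role of the sparsity limit: the new word is described by at most that many semantic words, and the returned residual shows how much of its embedding is left unexplained.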

Keywords: Text classification, Word embedding, Clustering, Sparse representation, Neural network

Paper URL: https://doi.org/10.1007/s10489-020-01680-w