Information fusion for text classification — an experimental comparison
Authors:
Abstract
This article reports experiments on how effectively different feature sets, and information fusion over some combinations of them, classify free-text documents into a given number of categories. We use several feature sets and integrate neural network learning into the method. The feature sets are based on the “latent semantics” of a reference library, a collection of documents that adequately represents the desired concepts. We found that a larger reference library is not necessarily better. Information fusion almost always gives better results than the individual constituent feature sets, with certain combinations performing better than others.
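As a rough illustration of the idea summarized above, the following sketch derives latent semantic features from two small reference libraries and fuses them into one vector. It is a minimal sketch under stated assumptions, not the authors' method: the toy libraries, the helper names (`term_doc_matrix`, `lsi_projector`, `lsi_features`), the choice of `k = 2`, and fusion by simple concatenation are all hypothetical, since the abstract does not specify these details. The neural network stage is omitted; the fused vector would serve as its input.

```python
import numpy as np

def term_doc_matrix(docs, vocab):
    """Rows are terms, columns are documents; entries are raw term counts."""
    index = {t: i for i, t in enumerate(vocab)}
    m = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in doc.lower().split():
            if tok in index:
                m[index[tok], j] += 1.0
    return m

def lsi_projector(reference_docs, k):
    """Fit a rank-k latent space on a reference library via truncated SVD."""
    vocab = sorted({t for d in reference_docs for t in d.lower().split()})
    a = term_doc_matrix(reference_docs, vocab)
    u, s, _ = np.linalg.svd(a, full_matrices=False)
    return vocab, u[:, :k], s[:k]

def lsi_features(doc, projector):
    """Fold a new document into the latent space: S_k^{-1} U_k^T q."""
    vocab, u_k, s_k = projector
    q = term_doc_matrix([doc], vocab)[:, 0]
    return (u_k.T @ q) / s_k

# Two hypothetical reference libraries (toy data, for illustration only).
library_a = [
    "stocks fell as markets closed",
    "the team won the match",
    "markets rallied after the match report",
]
library_b = [
    "investors sold shares in the bank",
    "the striker scored a late goal",
    "the bank sponsored the team",
]

proj_a = lsi_projector(library_a, k=2)
proj_b = lsi_projector(library_b, k=2)

doc = "the team celebrated as markets closed"
feats_a = lsi_features(doc, proj_a)
feats_b = lsi_features(doc, proj_b)

# Assumed fusion scheme: concatenate the per-library feature vectors.
fused = np.concatenate([feats_a, feats_b])
```

Concatenation is only one possible fusion scheme; it is used here because it keeps each constituent feature set intact, so a downstream classifier can be trained on either set alone or on the fused vector for comparison.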
Keywords: Text classification, Features, Latent semantic indexing, Reference library, Neural networks, Information fusion
Review history: Received 26 March 1998, Revised 8 November 2000, Accepted 25 September 2001, Available online 30 August 2001.
Official URL: https://doi.org/10.1016/S0031-3203(00)00171-0