A high-performing comprehensive learning algorithm for text classification without pre-labeled training set

Authors: Lizhen Liu, Qianhui Liang

Abstract

In this paper, we investigate a comprehensive learning algorithm, based on incremental learning, for text classification without a pre-labeled training set. To overcome the high cost of obtaining labeled training examples, the approach adapts fuzzy partition clustering to produce a small quantity of labeled training data. Incremental learning of a Bayesian classifier is then applied. The proposed classifier thus combines a Naïve-Bayes-based incremental learning algorithm with a modified fuzzy partition clustering method. To improve efficiency, a feature reduction step is designed based on the quadratic entropy in mutual information. Experiments demonstrate the performance of the approach, and the results show that it is feasible and effective.
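To make the incremental-learning idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that a small seed set of labeled documents has already been produced by some clustering step, and it uses a plain multinomial Naive Bayes with Laplace smoothing. The class `IncrementalNaiveBayes`, the function `self_train`, and the `margin` confidence threshold are hypothetical names introduced only for illustration. The fuzzy partition clustering and the feature reduction step are omitted.

```python
import math
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated one document at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.class_docs = Counter()              # number of documents per class
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.vocab = set()

    def update(self, tokens, label):
        """Fold one labeled document into the model without retraining."""
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def log_scores(self, tokens):
        """Return the unnormalized log posterior of every known class."""
        total_docs = sum(self.class_docs.values())
        scores = {}
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


def self_train(model, seed, unlabeled, margin=1.0):
    """Train on the seed set, then absorb confidently classified documents.

    `margin` is an assumed threshold: a document is added only if the best
    class beats the runner-up by at least this much in log score.
    """
    for tokens, label in seed:
        model.update(tokens, label)
    for tokens in unlabeled:
        scores = model.log_scores(tokens)
        ranked = sorted(scores.values(), reverse=True)
        if len(ranked) < 2 or ranked[0] - ranked[1] >= margin:
            model.update(tokens, max(scores, key=scores.get))
    return model


# Toy data: the seed labels stand in for the output of the clustering step.
seed = [
    (["goal", "match", "team"], "sport"),
    (["stock", "market", "bank"], "finance"),
]
unlabeled = [["team", "goal", "league"], ["bank", "stock", "shares"]]
model = self_train(IncrementalNaiveBayes(), seed, unlabeled)
```

Because the model stores only counts, each newly labeled document is absorbed in time proportional to its length, which is what makes this style of classifier convenient for incremental learning.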

Keywords: Text classification, Clustering, Dimension reduction, Fuzzy clustering, Incremental learning

Official paper URL: https://doi.org/10.1007/s10115-011-0387-3