Fuzzy unordered rule induction algorithm in text categorization on top of geometric particle swarm optimization term selection
Authors:
Abstract
The rapid growth of digital information requires automated handling and organization of documents. Automated document categorization has two main stages: (i) term reduction and (ii) classification. In this paper, we present a novel two-stage term reduction strategy based on Information Gain (IG) theory and Geometric Particle Swarm Optimization (GPSO) search. We evaluate the proposed term reduction approach using a new classifier, the fuzzy unordered rule induction algorithm (FURIA), to categorize multi-label texts. To evaluate the performance of FURIA quantitatively, we compare it against two widely used algorithms, Naïve Bayes and Support Vector Machine (SVM). The Text Categorization (TC) performance of the proposed term reduction strategy is validated on the Reuters-21578 and OHSUMED text collections. The experimental results show that the proposed term reduction method performs efficiently on document organization tasks.
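As a rough illustration of the first term reduction stage, the sketch below ranks terms by Information Gain on a toy corpus. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `top_k_terms`), the toy documents, and the simplification to one label per document are all assumptions made for this example, whereas the paper targets multi-label texts. The second stage, GPSO search over the IG-filtered terms, is not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(docs, labels, term):
    """IG of a term's presence/absence with respect to the class labels.

    docs is a list of token sets; labels holds one class per document
    (a single-label simplification assumed for this sketch).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional

def top_k_terms(docs, labels, k):
    """Keep the k highest-IG terms; ties are broken alphabetically."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]

# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"wheat", "price"},
    {"wheat", "export"},
    {"oil", "price"},
    {"oil", "barrel"},
]
labels = ["grain", "grain", "crude", "crude"]

print(top_k_terms(docs, labels, 2))
```

In this toy corpus, "wheat" and "oil" separate the two classes perfectly, while "price" appears equally in both and carries no information, so an IG filter would discard it before any subsequent search stage.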
Keywords: Fuzzy rule, Geometric particle swarm optimization, Text categorization, Information gain, Multi-label text categorization, Feature selection
Review history: Received 8 January 2013, Revised 11 September 2013, Accepted 23 September 2013, Available online 6 October 2013.
Official paper URL: https://doi.org/10.1016/j.knosys.2013.09.020