On acquiring classification knowledge from noisy data based on rough set
Authors:
Abstract
Induction of classification rules based on rough set theory has been an active research area in machine learning. However, pure rough set theory is not well suited to analyzing noisy information systems. This paper adopts a generalization of the rough set model based on a fuzzy lower approximation with respect to information granules. Building on the fuzzy lower approximation, the concept of a tolerant approximation is introduced to address the problem of discovering effective rules from noisy data. An efficient rule induction algorithm based on the tolerant lower approximation is proposed, and two heuristics are investigated to study their inductive effectiveness. Experiments with these algorithms are conducted on five real-life data sets that are well known in the machine learning community. The Tree classification algorithm from IBM Intelligent Miner is also evaluated as a basis for comparison. Effectiveness is measured by prediction accuracy, cost ratio, and a rule validation rate based on randomization analysis. The empirical evidence shows that the proposed algorithm is effective for rule induction in noisy environments.
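The abstract does not give the formal definition of the tolerant lower approximation, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that an information granule is the set of objects sharing the same condition-attribute values, that the fuzzy inclusion degree of a granule in a concept is the fraction of its objects belonging to that concept, and that a granule is admitted when this degree is at least 1 minus a tolerance. The function names, the `tolerance` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def granules(rows, condition_attrs):
    """Group row indices into granules: rows that agree on all condition attributes."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in condition_attrs)
        groups[key].append(i)
    return dict(groups)


def tolerant_lower_approximation(rows, condition_attrs, decision_attr,
                                 target, tolerance=0.0):
    """Return {granule key: inclusion degree} for granules admitted to the concept.

    Assumption (not from the paper): a granule is admitted when the fraction
    of its rows whose decision equals `target` is at least 1 - tolerance.
    With tolerance=0.0 this reduces to the classical lower approximation.
    """
    admitted = {}
    for key, members in granules(rows, condition_attrs).items():
        hits = sum(1 for i in members if rows[i][decision_attr] == target)
        degree = hits / len(members)
        if degree >= 1.0 - tolerance:
            admitted[key] = degree
    return admitted


# Hypothetical toy data: the fourth row is a noisy label inside an
# otherwise consistent granule.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

strict = tolerant_lower_approximation(
    rows, ["outlook", "windy"], "play", "yes", tolerance=0.0)
tolerant = tolerant_lower_approximation(
    rows, ["outlook", "windy"], "play", "yes", tolerance=0.25)
```

In this toy case the strict approximation admits no granule for the concept `play = yes`, because a single noisy row makes the sunny granule inconsistent, whereas a tolerance of 0.25 admits it with degree 0.75. This illustrates why a tolerant approximation can still yield rules from noisy data where the classical one yields none.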
Keywords: Classification, Rough set, Noisy information system, Lower approximation, Information granule, Randomization analysis
Review history: Available online 15 February 2005.
Official paper URL: https://doi.org/10.1016/j.eswa.2005.01.005