Cost-sensitive rough set: A multi-granulation approach

Authors:

Highlights:

Abstract

Cost is an important issue in real-world data mining. In the rough set community, test cost and decision cost are two widely studied types of cost. In recent years, both have been discussed extensively from the standpoint of attribute reduction, yet few works address the construction of a cost-sensitive rough set model itself. In addition, it has become apparent that the multi-granulation approach plays a crucial role in dealing with complex information such as heterogeneous data and multi-scale data. This study presents a novel design of a cost-sensitive rough set model built on a multi-granulation strategy. First, the lower and upper approximations of the cost-sensitive multi-granulation model are constructed, and it can be verified that, in the multi-granulation framework, the information granules and approximations are sensitive to decision costs and test costs, respectively. Second, following the definitions of the approximations, a semantic interpretation of the proposed model is studied. According to this interpretation, the settings of decision cost and test cost are derived from information entropy. Information reduction is formulated as an optimization problem, and two pivotal reduction criteria are investigated. Theoretical analysis and experimental results show that: (a) the established model generalizes many existing models and is close to real applications; (b) the entropy-based cost setting is well suited to the model, since it can increase classification quality or decrease decision cost; (c) among the reduction approaches, decision-monotonicity-based reduction can increase the number of certain rules and decrease the number of uncertain rules, while cost-based reduction can obtain the minimal total cost. The study also reflects a familiar principle: the more you pay, the more you gain.
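The abstract does not give the cost-sensitive definitions themselves, so the following is only a minimal sketch of the classical (cost-free) multi-granulation lower and upper approximations that such a model would generalize. The optimistic and pessimistic variants are the standard ones from the multi-granulation rough set literature, not the paper's cost-sensitive construction. The function names (`partition`, `mgrs_approximations`) and the toy decision table are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def partition(universe, table, attrs):
    """Map each object to its equivalence class under one granulation (attribute subset)."""
    blocks = defaultdict(set)
    for x in universe:
        blocks[tuple(table[x][a] for a in attrs)].add(x)
    return {x: frozenset(block) for block in blocks.values() for x in block}


def mgrs_approximations(universe, table, granulations, target, optimistic=True):
    """Classical multi-granulation lower/upper approximations of `target`.

    Optimistic:  lower needs SOME granulation whose class is inside the target,
                 upper needs EVERY granulation's class to meet the target.
    Pessimistic: the two quantifiers are swapped.
    """
    classes = [partition(universe, table, attrs) for attrs in granulations]
    target = set(target)
    lower_q, upper_q = (any, all) if optimistic else (all, any)
    lower = {x for x in universe if lower_q(c[x] <= target for c in classes)}
    upper = {x for x in universe if upper_q(bool(c[x] & target) for c in classes)}
    return lower, upper


# Hypothetical toy data: six objects, two attributes, each attribute is one granulation.
universe = [1, 2, 3, 4, 5, 6]
table = {
    1: {"a": 0, "b": 0},
    2: {"a": 0, "b": 1},
    3: {"a": 1, "b": 1},
    4: {"a": 1, "b": 0},
    5: {"a": 2, "b": 2},
    6: {"a": 2, "b": 2},
}
granulations = [["a"], ["b"]]
target = {1, 2, 3}

opt_lower, opt_upper = mgrs_approximations(universe, table, granulations, target, optimistic=True)
pes_lower, pes_upper = mgrs_approximations(universe, table, granulations, target, optimistic=False)
```

On this toy table the optimistic lower approximation covers the whole target, while the pessimistic one shrinks to the single object whose classes under both granulations lie inside the target. A cost-sensitive model in the spirit of the abstract would additionally let test costs and decision costs influence the granules and approximations; that part is not sketched here because the abstract does not specify it.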

Keywords: Cost-sensitive learning, Decision cost, Multi-granulation, Rough set, Test cost

Article history: Received 30 October 2016, Revised 1 February 2017, Accepted 3 February 2017, Available online 14 February 2017, Version of Record 27 March 2017.

Official paper URL: https://doi.org/10.1016/j.knosys.2017.02.019