Optimal cost-sensitive granularization based on rough sets for variable costs

Abstract

In real application domains, acquiring fine-grained data costs more than acquiring coarse-grained data. To achieve the best results at the lowest cost, it is necessary to select an optimal granularization for the data precision at hand. However, existing work in cost-sensitive learning focuses mainly on a fixed granularization, with fixed test costs and misclassification costs. In this paper, we propose an optimal cost-sensitive granularization based on rough sets for variable costs. The major contributions of this paper are threefold. First, different granularizations are used with confidence levels correlated to the precision of the data; in this context, we build a confidence-level-based covering rough set model. Second, the test cost is represented as a function of each feature and a confidence level, and misclassification costs are computed from the test costs. These variable cost settings represent the relationship between granularizations and costs more reasonably than previous approaches. Third, a granularization approach is proposed to obtain a trade-off between the different data precisions and variable costs. The experimental results show that our approach satisfactorily handles data with different precisions under different cost settings. In addition, the optimal granularization adapts to the data involved rather than being fixed, and is thus more efficient and more versatile than existing fixed granularizations.
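The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `select_confidence_level`, the additive cost formula (test cost plus error rate times misclassification cost), and all numbers below are assumptions chosen only to show how a confidence level might be picked when both costs vary with it.

```python
from typing import Callable, Iterable, Tuple


def select_confidence_level(
    levels: Iterable[float],
    test_cost: Callable[[float], float],
    error_rate: Callable[[float], float],
    misclassification_cost: Callable[[float], float],
) -> Tuple[float, float]:
    """Return (best_level, best_average_cost) over the candidate levels.

    Assumed cost model (hypothetical, not taken from the paper):
        average_cost = test_cost + error_rate * misclassification_cost
    where the misclassification cost is derived from the test cost.
    """
    best_level, best_cost = None, float("inf")
    for level in levels:
        tc = test_cost(level)                # cost of acquiring data at this precision
        mc = misclassification_cost(tc)      # variable cost computed from the test cost
        avg = tc + error_rate(level) * mc    # expected cost per object
        if avg < best_cost:
            best_level, best_cost = level, avg
    if best_level is None:
        raise ValueError("levels must not be empty")
    return best_level, best_cost


# Toy inputs (made up for illustration): finer data costs more but errs less.
toy_test_costs = {0.5: 5.0, 0.7: 7.0, 0.9: 9.0}
toy_error_rates = {0.5: 0.5, 0.7: 0.3, 0.9: 0.1}

best_level, best_cost = select_confidence_level(
    levels=[0.5, 0.7, 0.9],
    test_cost=toy_test_costs.__getitem__,
    error_rate=toy_error_rates.__getitem__,
    misclassification_cost=lambda tc: 4.0 * tc,  # assumed proportional to the test cost
)
print(best_level, best_cost)
```

With these toy numbers the highest confidence level, 0.9, gives the lowest average cost, because the drop in error rate outweighs the extra test cost. Different cost functions would shift the optimum, which is the sense in which the chosen granularization depends on the data and cost setting rather than being fixed.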

Keywords: Granularization, Neighborhood systems, Rough sets, Variable costs, Cost-sensitive learning

Article history: Received 5 August 2013, Revised 31 March 2014, Accepted 5 April 2014, Available online 24 April 2014.

Official URL: https://doi.org/10.1016/j.knosys.2014.04.009