A unifying analysis for the supervised descriptive rule discovery via the weighted relative accuracy

Authors:

Abstract

Supervised descriptive rule discovery comprises a set of data mining techniques whose objective is to describe data with respect to a property of interest. It encompasses techniques such as subgroup discovery, emerging patterns and contrast sets, all of which use supervised learning to obtain rules for descriptive purposes, but with different quality measures. Although these techniques originate from different data mining tasks, our hypothesis is that subgroup discovery, emerging patterns and contrast sets are compatible thanks to their common use of the weighted relative accuracy quality measure. A complete analysis shows this relationship between the different tasks, and it is supported by an empirical study with the most representative algorithms for each technique. The paper shows how the use of the weighted relative accuracy allows experts to distinguish between interesting subgroups and emerging and/or contrasting rules, thanks to the relation between the quality measures employed in the search process of the different models. In addition, this relationship enables us to analyse the main differences and similarities between the techniques within supervised descriptive rule discovery. This opens up new challenges for supervised descriptive rule learning models in analysing and developing descriptive models from a new perspective.
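The abstract does not state the measure itself. As a minimal sketch, assuming the paper uses the definition common in the subgroup discovery literature, WRAcc(Cond → Class) = p(Cond) · (p(Class | Cond) − p(Class)), the measure can be computed from four counts. The function name `wracc` and its parameter names are hypothetical and chosen only for illustration.

```python
def wracc(n_cond_class: int, n_cond: int, n_class: int, n_total: int) -> float:
    """Weighted relative accuracy of a rule Cond -> Class, from raw counts.

    Assumed definition (standard in the literature, not quoted from the paper):
        WRAcc = p(Cond) * (p(Class | Cond) - p(Class))

    n_cond_class: examples covered by the rule that belong to the target class
    n_cond:       examples covered by the rule
    n_class:      examples of the target class in the whole data set
    n_total:      size of the data set
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_cond == 0:
        # A rule that covers nothing has zero coverage, so its WRAcc is zero.
        return 0.0
    coverage = n_cond / n_total          # p(Cond)
    precision = n_cond_class / n_cond    # p(Class | Cond)
    prior = n_class / n_total            # p(Class)
    return coverage * (precision - prior)


# Example: a rule covering 40 of 100 examples, 30 of them in the target
# class, where the class makes up half of the data set.
example_score = wracc(n_cond_class=30, n_cond=40, n_class=50, n_total=100)
```

Under this definition a positive value means the class is more frequent inside the rule's coverage than in the data as a whole, a negative value means it is less frequent, and the coverage factor penalises rules that describe only a handful of examples.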

Keywords: Supervised descriptive rule discovery, Subgroup discovery, Emerging patterns, Contrast sets, Weighted relative accuracy

Review history: Received 1 February 2017, Revised 11 October 2017, Accepted 12 October 2017, Available online 13 October 2017, Version of Record 13 November 2017.

Official paper URL: https://doi.org/10.1016/j.knosys.2017.10.015