Interestingness measures for association rules based on statistical validity

Abstract

Assessing rules with interestingness measures is central to the successful application of association rule discovery. However, the discovered association rules are usually large in number, and some of them are not interesting or significant for the application at hand. In this paper, we present a systematic approach to validating the discovered rules, together with a precise statistical basis supporting this framework. The proposed strategy combines data mining and statistical measurement techniques, including redundancy analysis, sampling and multivariate statistical analysis, to discard the non-significant rules. Moreover, we consider real-world datasets characterized by uniform and non-uniform data/item distributions, with a mixture of measurement levels across the data/items. The proposed unified framework is applied to these datasets to demonstrate its effectiveness in discarding many of the redundant or non-significant rules while still preserving the high accuracy of the rule set as a whole.
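To make the idea of discarding non-significant rules concrete, the following is a minimal illustrative sketch and not the framework proposed in the paper. It computes three common interestingness measures (support, confidence, lift) for a single rule A -> B and a Pearson chi-square statistic on the 2x2 contingency table of A and B. The function name `rule_measures`, the toy transactions, and the use of a plain chi-square test are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Set


def rule_measures(
    transactions: Iterable[Set[str]],
    antecedent: Set[str],
    consequent: Set[str],
) -> Dict[str, float]:
    """Support, confidence, lift and chi-square for the rule antecedent -> consequent.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rows: List[frozenset] = [frozenset(t) for t in transactions]
    a, b = frozenset(antecedent), frozenset(consequent)
    n = len(rows)

    n_a = sum(1 for t in rows if a <= t)              # transactions containing A
    n_b = sum(1 for t in rows if b <= t)              # transactions containing B
    n_ab = sum(1 for t in rows if a <= t and b <= t)  # transactions containing both

    support = n_ab / n
    confidence = n_ab / n_a if n_a else 0.0
    lift = confidence / (n_b / n) if n_b else 0.0

    # 2x2 contingency table: rows = A present/absent, columns = B present/absent.
    observed = [
        [n_ab, n_a - n_ab],
        [n_b - n_ab, n - n_a - n_b + n_ab],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_b, n - n_b]

    # Pearson chi-square statistic without continuity correction.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed[i][j] - expected) ** 2 / expected

    return {"support": support, "confidence": confidence, "lift": lift, "chi2": chi2}


# Toy data (made up for this example).
TRANSACTIONS = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"milk"},
    {"butter"},
    {"milk", "eggs"},
    {"eggs"},
]

measures = rule_measures(TRANSACTIONS, {"bread"}, {"butter"})

# 3.841 is the standard chi-square critical value for 1 degree of freedom at the 0.05 level.
is_significant = measures["chi2"] > 3.841
```

On this toy data the rule bread -> butter has support 0.4, confidence 0.8 and lift 1.6, yet its chi-square statistic is about 3.6, below the 3.841 threshold. A rule can therefore look interesting by lift alone while failing a simple significance check on a small sample, which is the kind of rule a statistically grounded filter would discard.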

Keywords: Data mining, Structured data, Interesting rules, Statistical analysis, Redundant rules, Interestingness measure

Review history: Received 14 June 2010, Revised 6 October 2010, Accepted 22 November 2010, Available online 29 November 2010.

Official paper URL: https://doi.org/10.1016/j.knosys.2010.11.005