Efficient mining of high utility pattern with considering of rarity and length

作者:Donggyu Kim, Unil Yun

摘要

Techniques for mining rare patterns have been researched in the association rule mining area because traditional frequent pattern mining methods have to generate a large amount of unnecessary patterns in order to find rare patterns from large databases. One such technique, the multiple minimum support threshold framework was devised to extract rare patterns by using a different minimum item support threshold for each item in a database. Nevertheless, this framework cannot sufficiently reflect environments of the real world. The reason is that it does not consider weights of items, such as market prices of products and fatality rates of diseases, in its mining process. Therefore, an algorithm has been proposed to mine rare patterns with utilities exceeding a user-specified minimum utility by considering rarity and utility information of items. However, since this algorithm employs the concept of traditional high utility pattern mining, patterns’ lengths are not considered for determining utilities of the patterns. If the length of a pattern is sufficiently long, the pattern is more likely to have an enough utility to become a high utility pattern regardless of item utilities within the pattern. Therefore, the algorithm cannot guarantee that all items in a mined pattern have high utilities. In this paper, we propose a novel algorithm that effectively reduces such dependency of patterns on their lengths by considering their lengths in the mining process in order to mine more meaningful rare patterns compared to patterns mined by previous algorithms. Experimental results demonstrate that our algorithm extracts a lesser number of more meaningful patterns and consumes less computational resources compared to state-of-the-art algorithms.

论文关键词:Data mining, Utility pattern mining, Multiple minimum support

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-015-0750-2