On mining approximate and exact fault-tolerant frequent itemsets

Authors: Shengxin Liu, Chung Keung Poon

Abstract

Robust frequent itemset mining has attracted much attention because many applications need to find frequent patterns in noisy data. In this paper, we focus on a variant of robust frequent itemsets in which a small number of “faults” is allowed in each item and each supporting transaction. This problem is challenging because computing the fault-tolerant support count is NP-hard and the anti-monotone property does not hold when the number of allowable faults is proportional to the size of the itemset. We develop heuristic methods to solve an approximate version of the problem and propose speedup techniques for the exact problem. Experimental results show that our heuristic algorithms are substantially faster than the state-of-the-art exact algorithms while the error is acceptable. In addition, the proposed speedup techniques substantially improve the efficiency of the exact algorithms.
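To make the notion of a fault-tolerant support count concrete, the following is a minimal brute-force sketch in Python. It is not the authors' algorithm, and the abstract does not spell out the formal definition, so the sketch assumes one common formulation: each supporting transaction may miss at most floor(eps_row * |X|) items of the itemset X, and each item of X may be missing from at most floor(eps_col * |T|) of the supporting transactions T. The function name `ft_support` and the parameters `eps_row` and `eps_col` are hypothetical names chosen for illustration.

```python
from itertools import combinations
from math import floor


def ft_support(transactions, itemset, eps_row, eps_col):
    """Brute-force fault-tolerant support count (illustrative only).

    Assumed definition, not taken from the abstract:
    - row constraint: each supporting transaction misses at most
      floor(eps_row * |itemset|) items of the itemset;
    - column constraint: each item is missing from at most
      floor(eps_col * |T|) transactions of the supporting set T.
    Returns the size of the largest supporting set T.
    """
    items = list(itemset)
    max_row_faults = floor(eps_row * len(items))

    # Row constraint: keep only transactions with few enough missing items.
    candidates = [
        t for t in transactions
        if sum(1 for x in items if x not in t) <= max_row_faults
    ]

    # Column constraint: try subsets from largest to smallest.
    # This enumeration is exponential in the number of candidates.
    for size in range(len(candidates), 0, -1):
        max_col_faults = floor(eps_col * size)
        for subset in combinations(candidates, size):
            if all(
                sum(1 for t in subset if x not in t) <= max_col_faults
                for x in items
            ):
                return size
    return 0


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a"}]
    print(ft_support(db, {"a", "b", "c"}, eps_row=0.5, eps_col=0.5))  # 4
    print(ft_support(db, {"a", "b", "c"}, eps_row=0.0, eps_col=0.0))  # 1
```

The subset enumeration takes exponential time, which is in line with the NP-hardness noted in the abstract and suggests why heuristic approximations and speedup techniques are of interest. Because the allowed number of faults per transaction grows with the size of the itemset under this assumed definition, a larger itemset can tolerate more missing items, which gives some intuition for why the anti-monotone property can fail.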

Keywords: Data mining, Mining methods and algorithms, Frequent itemsets, Fault tolerance, Approximate support count

Paper URL: https://doi.org/10.1007/s10115-017-1079-4