Mining frequent itemsets in large databases: The hierarchical partitioning approach

Abstract

Although many methods have been proposed to improve the efficiency of data mining, little research has addressed scalability, that is, the problem of mining frequent itemsets when the database is very large. This study proposes a methodology, hierarchical partitioning, for mining frequent itemsets in large databases, based on a novel data structure called the Frequent Pattern List (FPL). A key feature of the FPL is its ability to partition the database, transforming it into a set of sub-databases of manageable size. This makes it possible to develop a divide-and-conquer approach to the desired data-mining tasks. Experimental results show that hierarchical partitioning is capable of mining frequent itemsets and frequent closed itemsets in very large databases.
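
The divide-and-conquer idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the FPL is built, so the code below does not implement it. It substitutes a generic item-projection scheme: each frequent item defines a sub-database containing only the transactions that include it, and each sub-database is mined recursively. The function name `mine_frequent_itemsets`, the parameter `min_support` (an absolute count), and the sample data are assumptions made for illustration.

```python
from collections import Counter


def mine_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support_count}.

    Illustrative stand-in only: partitions by item projection,
    NOT the paper's Frequent Pattern List (FPL).
    """
    results = {}

    def mine(db, prefix):
        # Count how many transactions in this sub-database contain each item.
        counts = Counter(item for t in db for item in t)
        frequent = sorted(i for i, c in counts.items() if c >= min_support)
        for idx, item in enumerate(frequent):
            itemset = prefix | {item}
            results[frozenset(itemset)] = counts[item]
            # Build the sub-database for `item`: keep transactions that
            # contain it, restricted to frequent items ordered after it,
            # so that every itemset is generated exactly once.
            later = set(frequent[idx + 1:])
            sub_db = [t & later for t in db if item in t]
            sub_db = [t for t in sub_db if t]  # drop emptied transactions
            if sub_db:
                mine(sub_db, itemset)

    mine([frozenset(t) for t in transactions], frozenset())
    return results


if __name__ == "__main__":
    # Hypothetical toy database of five transactions.
    data = [
        {"a", "b", "c"},
        {"a", "b"},
        {"a", "c"},
        {"b", "c"},
        {"a", "b", "c"},
    ]
    for itemset, support in sorted(
        mine_frequent_itemsets(data, 3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), support)
```

Each recursive call works only on its own sub-database, so in principle the sub-databases could be processed one at a time when the full database does not fit in memory. Whether the FPL partitions the data in this way is not stated in the abstract.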

Keywords: Data mining, Frequent itemsets, Frequent closed itemsets, Frequent Pattern List (FPL), Hierarchical partitioning

Publication history: Available online 20 September 2012.

Paper URL: https://doi.org/10.1016/j.eswa.2012.09.005