Mining frequent tree-like patterns in large datasets
作者:
Highlights:
•
摘要
Sequential pattern mining is crucial to data mining domains. This paper proposes a novel data mining approach for exploring hierarchical tree structures, named tree-like patterns, representing the relationships for a pair of items in a sequence. Using tree-like patterns, the relationships for a pair of items can be identified in terms of the cause and effect. A novel technique that efficiently counts support values for tree-like patterns using a queue structure is proposed. In addition, this paper addresses an efficient scheme for determining the frequency of a tree-like pattern in a sequence using a dynamic programming approach. Each tree-like pattern embedded in a sequence is considered to have a certain valuable meaning or the degree of importance used in different applications. Two addressed formulas are applied to determine the degree of significance for a specific sequence, which denotes the degree of consecutive items in a tree-like pattern for a sequence. The larger the degree of significance a tree-like pattern has, the more the tree-like pattern is compacted in the sequence. The characteristics differentiating the explored patterns from those obtained with other schemes are discussed. A simulation analysis of the proposed data mining approach is utilized to demonstrate its efficacy. Finally, the proposed approach is designed and implemented in a data mining system integrated into a novel e-learning platform.
论文关键词:Data mining,Frequent patterns,Sequential patterns,Tree-like patterns,World wide web
论文评审过程:Received 4 September 2004, Accepted 6 July 2006, Available online 7 August 2006.
论文官网地址:https://doi.org/10.1016/j.datak.2006.07.003