Pattern discovery via constraint programming

Authors:

Highlights:

Abstract

Pattern discovery is one of the most fundamental problems in data mining. Many kinds of patterns, together with algorithms for discovering them, have been proposed for different applications and domains. Because every application has its own characteristics, there is still strong demand for defining new, meaningful patterns under new requirements. Existing studies propose new query languages to describe such ad hoc patterns, but most of them focus on small variations of frequent itemsets and association rules. Many meaningful patterns in other domains, such as temporal and spatial patterns, are not covered. This paper proposes a constraint-based view of pattern discovery that introduces no new language: patterns are described by a collection of constraints given at run time. In this view, a pattern discovery problem is treated as a constraint satisfaction problem, which provides a general framework for universal pattern discovery. Many previously known patterns can be regarded as variations derived from this framework under different constraints. Two generic algorithms are proposed for solving the constraint satisfaction problem. Empirical evaluation on two well-studied patterns shows that (1) the time cost of one generic algorithm is close to that of specialized mining algorithms, and (2) the space cost of the generic algorithm increases linearly with the input data volume. Two further case studies demonstrate the effectiveness of this constraint-based view for solving new problems in new scenarios.
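
To make the constraint-based view concrete, the sketch below treats frequent itemset mining as a search over candidates that must satisfy a list of constraints supplied at run time. It is an illustration only, not one of the two generic algorithms from the paper: the names `discover`, `min_support`, and `max_size` are hypothetical, and the pruning step assumes every constraint is anti-monotone (if a candidate fails, all of its supersets fail too).

```python
from typing import Callable, FrozenSet, List, Sequence

Itemset = FrozenSet[str]
# A constraint takes a candidate pattern and the data, and returns True if satisfied.
Constraint = Callable[[Itemset, Sequence[Itemset]], bool]


def support(candidate: Itemset, transactions: Sequence[Itemset]) -> int:
    """Count the transactions that contain every item of the candidate."""
    return sum(1 for t in transactions if candidate <= t)


def min_support(threshold: int) -> Constraint:
    """Hypothetical constraint: the candidate occurs in at least `threshold` transactions."""
    return lambda c, txs: support(c, txs) >= threshold


def max_size(limit: int) -> Constraint:
    """Hypothetical constraint: the candidate has at most `limit` items."""
    return lambda c, txs: len(c) <= limit


def discover(transactions: Sequence[Itemset],
             constraints: Sequence[Constraint]) -> List[Itemset]:
    """Depth-first enumeration of itemsets that satisfy all constraints.

    Assumption: all constraints are anti-monotone, so a failing candidate
    is never extended further.
    """
    items = sorted(set().union(*transactions))
    results: List[Itemset] = []

    def extend(candidate: Itemset, start: int) -> None:
        for i in range(start, len(items)):
            grown = candidate | {items[i]}
            if all(check(grown, transactions) for check in constraints):
                results.append(grown)
                extend(grown, i + 1)  # only add items that come later in sorted order

    extend(frozenset(), 0)
    return results


# Usage: the pattern definition is just the list of constraints passed in.
data = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]
patterns = discover(data, [min_support(2), max_size(2)])
```

Swapping the constraint list changes the kind of pattern that is mined without changing the search procedure, which is the point of the constraint-based view described in the abstract.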

Keywords: Temporal dependency, Lag interval, Event mining, H.2.8 [Database Management], Database Application-Data Mining

Review history: Received 17 April 2015, Revised 23 October 2015, Accepted 30 October 2015, Available online 2 December 2015, Version of Record 7 January 2016.

Official paper URL: https://doi.org/10.1016/j.knosys.2015.10.031