Discovering rare correlated periodic patterns in multiple sequences

作者:

Highlights:

摘要

Periodic-Frequent Pattern Mining (PFPM) is an emerging problem, which consists of identifying frequent patterns that periodically occur over time in a sequence of events. Though PFPM is useful in many domains, traditional algorithms have two important limitations. First, they are not designed to find rare patterns. But discovering rare patterns is useful in many domains (e.g. to study rare diseases). Second, traditional PFPM algorithms are generally designed to find patterns in a single sequence, but identifying periodic patterns that are common to a set of sequences is often desirable (e.g. to find patterns common to several hospital patients or customers). To address these limitations, this paper proposes to discover a novel type of patterns in multiple sequences called Rare Correlated Periodic Patterns. Properties of the problem are studied, and an efficient algorithm named MRCPPS (Mining Rare Correlated Periodic Patterns common to multiple Sequences) is presented to efficiently find these patterns. It relies on a novel RCPPS-list structure to avoid repeatedly scanning the database. Experiments have been done on several real datasets, and it was observed that the proposed MRCPPS algorithm can efficiently discover all rare correlated periodic patterns common to multiple sequences, and filter many non rare and correlated patterns.

论文关键词:Periodic pattern,Rare pattern,Correlated pattern,Sequences

论文评审过程:Available online 30 August 2019, Version of Record 9 April 2020.

论文官网地址:https://doi.org/10.1016/j.datak.2019.101733