Discovering forward sequences from temporal data

Authors:

Highlights:

Abstract

Traditionally, sequence pattern mining has been used to mine items that occur in time sequences, and the items have been treated as unrelated to one another. In real applications, however, the items in a sequence record may be related. For example, in mining students’ learning portfolios, the learning progressions must contain learning objects with forward-directed relations; that is, the learning objects (items) themselves are evidence of a pre-existing relationship. In addition, most sequence mining algorithms assume that all sequence records in a database are of the same age, meaning that every record is observed from the same starting point to the same ending point. In practice, depending on when events occur within a given period, some sequences are longer than others. Sequences with longer time spans may therefore contain longer patterns than those with shorter time spans. As a result, the frequency of patterns that can appear only in longer sequences may be underestimated because fewer such records exist. This research proposes two methods, FSP and FSP-LC, to analyze forward sequence data. A real-life database recording the progression histories of all employees in a large company was then used to verify and explain the proposed methods. The experimental results show that the proposed methods can mine specific and longer sequences, which can help the company improve and redesign its personnel system.
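The underestimation problem described in the abstract can be illustrated with a small sketch. This is not the FSP or FSP-LC algorithm, whose details the abstract does not give. It assumes a hypothetical ranking of job grades (`RANK`) as the forward relation, and it uses record length as a stand-in for time span when correcting the support denominator. All names and data below are invented for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical job grades: a higher rank means a later career stage.
RANK: Dict[str, int] = {"junior": 1, "senior": 2, "lead": 3, "manager": 4}


def is_forward(seq: Sequence[str]) -> bool:
    """True if every item has a strictly higher rank than the one before it."""
    return all(RANK[a] < RANK[b] for a, b in zip(seq, seq[1:]))


def contains(seq: Sequence[str], pattern: Sequence[str]) -> bool:
    """True if pattern occurs in seq as an order-preserving subsequence."""
    it = iter(seq)
    # Each `in` test consumes the iterator, so matches must appear in order.
    return all(item in it for item in pattern)


def support(db: List[List[str]], pattern: Sequence[str]) -> float:
    """Plain support: share of ALL records that contain the pattern."""
    return sum(contains(s, pattern) for s in db) / len(db)


def length_corrected_support(db: List[List[str]], pattern: Sequence[str]) -> float:
    """Assumed correction: share among records long enough to hold the pattern."""
    eligible = [s for s in db if len(s) >= len(pattern)]
    if not eligible:
        return 0.0
    return sum(contains(s, pattern) for s in eligible) / len(eligible)


# Toy database: short records outnumber long ones, as with recent hires.
db = [
    ["junior"],
    ["junior"],
    ["junior", "senior"],
    ["junior", "senior", "lead"],
    ["junior", "senior", "lead"],
    ["junior", "lead", "manager"],
]
pattern = ["junior", "senior", "lead"]

plain = support(db, pattern)
corrected = length_corrected_support(db, pattern)
```

In this toy database the three-step pattern appears in 2 of 6 records overall, but in 2 of the 3 records that are long enough to contain it, so a fixed minimum-support threshold applied to the plain ratio could discard a pattern that is common among long-tenured records.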

Keywords: Data mining, Frequent pattern, Sequence pattern, Temporal data, Forward sequence

Article history: Received 7 March 2012, Revised 9 October 2012, Accepted 10 October 2012, Available online 30 October 2012.

Official URL: https://doi.org/10.1016/j.knosys.2012.10.007