Information-preserving abstractions of event data in process mining
Authors: Sander J. J. Leemans, Dirk Fahland
Abstract
Process mining aims at obtaining information about processes by analysing their past executions in event logs, event streams, or databases. Discovering a process model from a finite amount of event data therefore requires correctly inferring infinitely many unseen behaviours. To this end, many process discovery techniques leverage abstractions of the finite event data to infer and preserve behavioural information of the underlying process. However, the fundamental information-preserving properties of these abstractions are not yet well understood. In this paper, we study the information-preserving properties of the “directly follows” abstraction and its limitations. We overcome these limitations by proposing and studying two new abstractions, which preserve even more information in the form of finite graphs. We then show how process behaviour can be unambiguously recovered through characteristic footprints in these abstractions, and characterize when this is possible. Our characterization defines large classes of practically relevant processes covering various complex process patterns. We prove that the information and the footprints preserved in the abstractions suffice to unambiguously rediscover the exact process model from a finite event log. Furthermore, we show that all three abstractions are relevant in practice for inferring process models from event logs, and we outline the implications for process mining techniques.
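As a rough illustration of the "directly follows" abstraction named in the abstract, the sketch below builds a directly-follows relation from a small event log. It is a minimal sketch, not the authors' implementation: the log format (a list of traces, each a tuple of activity names), the function names `directly_follows` and `start_end_activities`, and the example log are all assumptions made for illustration.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity b occurs immediately after activity a in a trace.

    `log` is assumed to be an iterable of traces, each a sequence of
    activity names (a hypothetical format chosen for this sketch).
    """
    pairs = Counter()
    for trace in log:
        # Pair each event with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            pairs[(a, b)] += 1
    return pairs


def start_end_activities(log):
    """Collect the activities that start and end at least one non-empty trace."""
    starts = {trace[0] for trace in log if trace}
    ends = {trace[-1] for trace in log if trace}
    return starts, ends


# Hypothetical log: b and c occur in both orders between a and d.
example_log = [("a", "b", "c", "d"), ("a", "c", "b", "d")]
dfg = directly_follows(example_log)
starts, ends = start_end_activities(example_log)
```

In this example both `("b", "c")` and `("c", "b")` appear in the relation, which is the kind of finite footprint from which a discovery technique might infer that b and c are concurrent rather than sequential.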
Keywords: Process mining, Information preservation, Language abstraction, Model abstraction, Rediscoverability, Directly follows, Minimum self-distance, Inclusive choice
Paper URL: https://doi.org/10.1007/s10115-019-01376-9