Stream mining on univariate uncertain data

Author: Ying-Ho Liu

Abstract

In this paper, we propose mining frequent patterns from univariate uncertain data streams, in which each attribute in a transaction has a quantitative interval and a probability density function describing how likely each value in the interval is to occur. Many data streams consist of univariate uncertain data, for example the records of atmospheric pollution sensors and network monitoring records. We propose two algorithms to address this problem: the ExactU2Stream algorithm and the ApproxiU2Stream algorithm. The former stores incoming transactions incrementally and delays the mining process until it is requested. The latter mines transactions as soon as they arrive and stores the derived frequent patterns. The former returns more accurate results than the latter, but it also takes longer to respond. Both algorithms use the sliding window scheme, which decomposes the continuous data stream into discrete, overlapping chunks. The proposed algorithms outperform the compared methods in terms of runtime and memory usage. We have applied the two proposed algorithms to data streams recording the air quality in Taiwan; the derived frequent patterns show not only the typical air quality in Taiwan but also the extremely bad air quality that occurs when a sandstorm affects Taiwan.
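
The sliding window scheme and the notion of an uncertain value can be illustrated with a minimal Python sketch. It is not the ExactU2Stream or ApproxiU2Stream algorithm. The function names, the uniform density, and the "expected support" measure are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import deque
from typing import Iterable, Iterator, Sequence, Tuple

# Hypothetical representation: one uncertain value is an interval (lo, hi).
# Assumption: the value is uniformly distributed over that interval.
Interval = Tuple[float, float]


def sliding_windows(stream: Iterable[Interval], size: int) -> Iterator[Tuple[Interval, ...]]:
    """Yield overlapping chunks of `size` consecutive items, advancing by one item."""
    window = deque(maxlen=size)  # the oldest item is dropped automatically
    for item in stream:
        window.append(item)
        if len(window) == size:
            yield tuple(window)


def overlap_probability(lo: float, hi: float, a: float, b: float) -> float:
    """Probability that a value uniform on [lo, hi] falls inside [a, b]."""
    if hi <= lo:
        raise ValueError("interval must have positive width")
    overlap = min(hi, b) - max(lo, a)
    return max(0.0, overlap) / (hi - lo)


def expected_support(window: Sequence[Interval], a: float, b: float) -> float:
    """Sum of per-item probabilities of falling in [a, b] (illustrative measure only)."""
    return sum(overlap_probability(lo, hi, a, b) for lo, hi in window)


if __name__ == "__main__":
    # Made-up readings, not data from the paper.
    readings = [(0.0, 10.0), (5.0, 15.0), (20.0, 30.0), (8.0, 12.0)]
    for w in sliding_windows(readings, size=3):
        print(w, expected_support(w, 5.0, 10.0))
```

Each window shares all but one item with its predecessor, which is what makes the chunks overlapping. A pattern could then be treated as frequent in a window when a measure such as this one reaches a user-chosen threshold, though the exact criterion is defined in the paper itself.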

Keywords: Data stream mining, Frequent U2 patterns, Univariate uncertain data, ExactU2Stream, ApproxiU2Stream

Paper URL: https://doi.org/10.1007/s10489-012-0415-3