Mining evolving data streams for frequent patterns

Abstract

A data stream is a potentially uninterrupted flow of data. Mining this flow makes it necessary to cope with uncertainty, as only a part of the stream can be stored. In this paper, we evaluate a statistical technique that biases the estimate of the support of patterns so as to maximize either the precision or the recall, as chosen by the user, while limiting the degradation of the other criterion. Theoretical results show that, from a statistical standpoint, the technique is not far from the optimum. Experiments tend to demonstrate its potential, as it remains robust even under significant distribution drifts.
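
The idea of biasing the support estimate can be illustrated with a minimal sketch. It is not the paper's exact procedure: it assumes a Hoeffding-style concentration bound, single items rather than general patterns, and hypothetical names (`hoeffding_margin`, `frequent_items`, `theta`, `delta`, `favor`). The observed support in a stored sample of size n is compared against the user threshold theta shifted by a margin eps: raising the threshold favors precision, lowering it favors recall.

```python
import math
from collections import Counter


def hoeffding_margin(n, delta):
    """Margin eps such that an observed frequency over n samples deviates
    from the true frequency by more than eps with probability at most delta
    (one-sided Hoeffding bound; an assumption, not the paper's bound)."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def frequent_items(sample, theta, delta, favor):
    """Return the items judged frequent in the stored part of the stream.

    favor="precision": require support >= theta + eps (fewer false positives).
    favor="recall":    require support >= theta - eps (fewer false negatives).
    """
    n = len(sample)
    eps = hoeffding_margin(n, delta)
    if favor == "precision":
        threshold = theta + eps
    elif favor == "recall":
        threshold = theta - eps
    else:
        raise ValueError("favor must be 'precision' or 'recall'")
    counts = Counter(sample)
    return {item for item, count in counts.items() if count / n >= threshold}


# Toy stored sample: observed supports are a=0.60, b=0.35, c=0.05.
sample = ["a"] * 60 + ["b"] * 35 + ["c"] * 5
strict = frequent_items(sample, theta=0.3, delta=0.05, favor="precision")
loose = frequent_items(sample, theta=0.3, delta=0.05, favor="recall")
```

In this toy sample the margin is about 0.12, so the precision-oriented setting keeps only `a`, while the recall-oriented setting also keeps `b`, whose observed support is close to the threshold.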

Keywords: Data streams, Concentration inequalities, Precision, Recall, Accuracy

Article history: Received 7 November 2005, Accepted 13 March 2006, Available online 5 June 2006.

Official article URL: https://doi.org/10.1016/j.patcog.2006.03.006