Mining decision rules on data streams in the presence of concept drifts

Authors:

Highlights:

Abstract

In a database, the concept underlying the examples may change over time, a phenomenon known as concept drift. When concept drift occurs, a classification model built from the old dataset is no longer suitable for predicting new data. The problem has therefore attracted considerable attention in recent years. Although many algorithms have been proposed to address it, they do not give users a fully satisfactory solution, because current research on concept drift focuses only on updating the classification model. Real-life decision makers, however, may be very interested in the rules of concept drift themselves. For example, doctors want to know the root causes behind changes in the causes and development of a disease. In this paper, we propose a concept drift rule mining tree, called CDR-Tree, to accurately discover the underlying rules governing concept drift. The main contributions of this paper are: (a) we address the problem of mining concept-drifting rules, which has not been considered in previously developed classification schemes; (b) we develop a method that can accurately mine the rules governing concept drift; (c) we develop a method that, should classification models be required, can efficiently and accurately generate them via a simple extraction procedure rather than constructing them anew; and (d) we propose two strategies to reduce the complexity of the concept-drifting rules mined by our CDR-Tree.
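
To make the notion of a concept-drifting rule concrete, the following is a minimal Python sketch. It is not the CDR-Tree algorithm, whose construction is not described in this abstract. It assumes a toy setting with a single categorical attribute and two time-stamped batches, learns a majority-class rule per attribute value on each batch, and reports the values whose predicted class changed. The function names (`learn_rules`, `accuracy`, `drift_rules`) and the symptom/diagnosis data are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def learn_rules(examples):
    """Map each attribute value to its majority class (a one-level rule set)."""
    counts = defaultdict(Counter)
    for value, label in examples:
        counts[value][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def accuracy(rules, examples):
    """Fraction of examples whose label matches the rule for their value."""
    hits = sum(1 for value, label in examples if rules.get(value) == label)
    return hits / len(examples)


def drift_rules(old_rules, new_rules):
    """Attribute values whose majority class differs between the two batches."""
    shared = old_rules.keys() & new_rules.keys()
    return {v: (old_rules[v], new_rules[v])
            for v in shared if old_rules[v] != new_rules[v]}


# Hypothetical data: (symptom, diagnosis) pairs from two time periods.
old_batch = ([("fever", "flu")] * 8 + [("fever", "cold")] * 2
             + [("cough", "cold")] * 10)
new_batch = ([("fever", "pneumonia")] * 7 + [("fever", "flu")] * 3
             + [("cough", "cold")] * 10)

old_rules = learn_rules(old_batch)
new_rules = learn_rules(new_batch)

acc_old_on_old = accuracy(old_rules, old_batch)  # model fits its own period
acc_old_on_new = accuracy(old_rules, new_batch)  # same model after the drift
changed = drift_rules(old_rules, new_rules)

print(f"old model on old data: {acc_old_on_old:.2f}")
print(f"old model on new data: {acc_old_on_new:.2f}")
for value, (before, after) in changed.items():
    print(f"IF symptom = {value}: class drifted from {before} to {after}")
```

In this toy setting the old model loses accuracy on the new batch, and the comparison step states which rule changed rather than only replacing the model. That distinction between updating a classifier and describing the drift is the one the abstract draws.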

Keywords: Data mining, Classification, Decision tree, Data stream, Concept drift

Review history: Available online 8 December 2007.

Paper URL: https://doi.org/10.1016/j.eswa.2007.11.034