Extending attribute-oriented induction algorithm for major values and numeric values

Authors:

Abstract

Attribute-oriented induction (AOI) uses concept hierarchies to discover hidden patterns in large amounts of data and presents them concisely as a general description of the original data. It is an effective technique for data analysis and data reduction. Researchers have recently proposed several extensions of the original method, but problems remain. When an attribute has major values, the traditional approach cannot preserve and present these major patterns. In addition, the construction of concept hierarchies for numeric attributes is sometimes subjective, and generalizing border values near the cut points of a discretization can easily be misleading. This paper proposes an extended AOI that generalizes the traditional approach by introducing an additional major-values threshold, thereby preserving and presenting major values. Moreover, we suggest an alternative for processing numeric attributes: computing and presenting the average and deviation of the aggregated tuples, which avoids both the subjective construction of a numeric concept hierarchy and the generalization of border values. Experiments on a synthetic data set and a real data set show that the proposed methods are feasible and can induce more precise rules from the raw data.
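The two ideas in the abstract can be illustrated with a small sketch. It is not the paper's algorithm, only one plausible reading of it under stated assumptions: a value is treated as "major" when its share of the tuples reaches a threshold, a one-level concept hierarchy stands in for a full one, and "deviation" is taken to be the population standard deviation. The names `HIERARCHY`, `generalize`, and `major_threshold`, and the sample data, are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev

# Hypothetical one-level concept hierarchy: leaf value -> more general concept.
HIERARCHY = {
    "Taipei": "Taiwan",
    "Tainan": "Taiwan",
    "Tokyo": "Japan",
    "Osaka": "Japan",
}


def generalize(rows, major_threshold):
    """Generalize (city, age) tuples one level up the hierarchy.

    Assumption: a city whose share of all tuples is at least
    `major_threshold` counts as a major value and is kept as-is.
    The numeric attribute is never discretized; each group reports
    (tuple count, average, population standard deviation) instead.
    """
    counts = Counter(city for city, _ in rows)
    total = len(rows)
    groups = defaultdict(list)
    for city, age in rows:
        if counts[city] / total >= major_threshold:
            key = city  # major value: preserved, not generalized
        else:
            key = HIERARCHY.get(city, "ANY")  # climb one level
        groups[key].append(age)
    return {
        key: (len(ages), mean(ages), pstdev(ages))
        for key, ages in groups.items()
    }


rows = [
    ("Taipei", 20), ("Taipei", 30), ("Taipei", 40),
    ("Tainan", 50), ("Tokyo", 25), ("Osaka", 35),
]
result = generalize(rows, major_threshold=0.5)
for concept, (count, avg, dev) in result.items():
    print(f"{concept}: count={count}, avg={avg}, dev={dev:.2f}")
```

With a threshold of 0.5, "Taipei" covers half of the tuples and stays as its own row, while the remaining cities are merged into "Taiwan" and "Japan". No age intervals are defined anywhere, so no value can fall on the wrong side of a cut point.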

Keywords: Attribute-oriented induction, Data mining, Pattern extraction, Class description, Concept hierarchy

Review history: Available online 10 February 2004.

Paper URL: https://doi.org/10.1016/j.eswa.2004.01.002