Mining globally interesting patterns from multiple databases using kernel estimation

Authors:

Abstract

Extracting knowledge (or patterns) from multiple databases raises three difficulties: the combined data may be too large to merge into one database for centralized mining on a single computer, the local information sources may be hidden from a global decision maker because of privacy concerns, and different local databases may contribute differently to the global pattern. Mining multiple databases is therefore essentially different from mining a single database: the global patterns must be obtained by carefully analyzing the local patterns from the individual databases. In this paper, we propose a nonlinear method, named KEMGP (kernel estimation for mining global patterns), which uses kernel estimation to synthesize local patterns into global patterns. We also adopt a method that divides the data in the different databases according to attribute dimensionality, which reduces the total space complexity. We test our algorithm on a customer management system, where the task is to obtain all globally interesting patterns by analyzing the individual databases. The experimental results show that our method is efficient.
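The abstract does not spell out how KEMGP applies kernel estimation, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each local database reports a support value in [0, 1] for the same pattern, that a Gaussian kernel with a fixed bandwidth is used, and that the global support is taken as the point where the estimated density peaks. The function names, the bandwidth, and the sample values are all hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density(x, samples, bandwidth):
    """Kernel density estimate at x: (1 / (n * h)) * sum of K((x - s) / h)."""
    n = len(samples)
    total = sum(gaussian_kernel((x - s) / bandwidth) for s in samples)
    return total / (n * bandwidth)


def synthesize_global_support(local_supports, bandwidth=0.1, grid_size=101):
    """Return the grid point in [0, 1] where the density estimate is highest.

    Assumption: the mode of the density is used as the synthesized value.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda x: kernel_density(x, local_supports, bandwidth))


# Hypothetical supports of one pattern reported by five local databases.
local_supports = [0.30, 0.32, 0.31, 0.29, 0.80]
global_support = synthesize_global_support(local_supports)
mean_support = sum(local_supports) / len(local_supports)
```

In this sketch the estimate stays close to the cluster of agreeing databases, whereas a plain average (`mean_support`) is pulled toward the single outlying value. That is one way a nonlinear synthesis can differ from a linear, weighted-sum one.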

Keywords: Multiple database mining, Global pattern, Multiple data source discovery

Publication history: Available online 29 January 2009.

Official paper URL: https://doi.org/10.1016/j.eswa.2009.01.030