Estimating null values in relational database systems using automatic clustering and multiple regression techniques

Authors:

Abstract

In this paper, we present a new method for estimating null values in relational database systems using automatic clustering and multiple regression techniques. First, we present a new automatic clustering algorithm for numerical data. The proposed algorithm does not require the number of clusters to be specified in advance, nor does it require the data in the database to be sorted beforehand. Then, based on the proposed automatic clustering algorithm and multiple regression techniques, we present a new method for estimating null values in relational database systems. To estimate a null value, the proposed method only needs to process the particular cluster that contains the tuple with the null value, rather than the whole database. It achieves a higher average estimation accuracy rate than the existing methods for estimating null values in relational database systems.
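
The abstract does not spell out the clustering algorithm or the regression model, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses a simple sequential threshold ("leader") clustering as a stand-in, because that approach also needs neither a preset cluster count nor sorted input; it is not the authors' algorithm. It then fits an ordinary least-squares multiple regression on the complete tuples of the one cluster that contains the null value. All function names, the `threshold` parameter, and the sample data are hypothetical.

```python
from math import dist

def leader_cluster(points, threshold):
    """Stand-in clustering (NOT the paper's algorithm): put each point in the
    first cluster whose center lies within `threshold`, else open a new one."""
    centers, members = [], []
    for i, p in enumerate(points):
        for c, center in enumerate(centers):
            if dist(p, center) <= threshold:
                members[c].append(i)
                n = len(members[c])
                # Running mean keeps the cluster center up to date.
                centers[c] = tuple(m + (v - m) / n for m, v in zip(center, p))
                break
        else:
            centers.append(tuple(p))
            members.append([i])
    return centers, members

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_regression(xs, ys):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    rows = [[1.0, *x] for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

def estimate_null(xs, ys, target, threshold):
    """Estimate ys[target] using only the cluster that contains that tuple."""
    _, members = leader_cluster(xs, threshold)
    cluster = next(m for m in members if target in m)
    known = [i for i in cluster if ys[i] is not None]
    coef = fit_regression([xs[i] for i in known], [ys[i] for i in known])
    return coef[0] + sum(b * v for b, v in zip(coef[1:], xs[target]))

# Hypothetical unsorted data: two well-separated groups, one null value (None).
xs = [(1, 1), (100, 100), (2, 1), (101, 100), (2, 2),
      (1, 3), (100, 102), (3, 2), (102, 101)]
ys = [15, -400, 17, -405, None, 21, -398, 22, -409]
estimate = estimate_null(xs, ys, target=4, threshold=10)
```

In this made-up data the group around the null tuple follows y = 10 + 2*x1 + 3*x2, so the estimate comes out at approximately 20. Fitting on that cluster alone keeps the distant second group, which follows a different relationship, from distorting the regression.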

Keywords: Relational database, Null value, Automatic clustering algorithm, Cluster center

Review history: Available online 21 November 2007.

Official paper URL: https://doi.org/10.1016/j.eswa.2007.10.046