SUM-optimal histograms for approximate query processing

Authors: Meifan Zhang, Hongzhi Wang, Jianzhong Li, Hong Gao

Abstract

In this paper, we study the problem of approximating SUM queries with histograms. We define a new kind of histogram, the SUM-optimal histogram, which provides more accurate estimates for SUM queries than the traditional equi-depth and V-optimal histograms. We propose three methods for constructing the histogram: the first is a dynamic programming method, and the other two are approximate methods. We use a greedy strategy to insert separators into a histogram and use stochastic gradient descent to improve the accuracy of the separators. The experimental results indicate that our methods provide more accurate estimates for SUM queries than the equi-depth and V-optimal histograms.
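
To make the setting concrete, the following is a minimal sketch of how a range SUM query can be estimated from an equi-depth histogram, one of the baselines named in the abstract. It is not the SUM-optimal construction from the paper. The function names `build_equi_depth` and `estimate_sum`, the half-open bucket convention, and the assumption that values are spread uniformly inside each bucket are all illustrative choices made here, not details taken from the source.

```python
from bisect import bisect_left


def build_equi_depth(values, num_buckets):
    """Build an equi-depth histogram: return (boundaries, counts).

    Bucket i covers the half-open value range [boundaries[i], boundaries[i + 1]).
    Assumes `values` is a non-empty collection of numbers (illustrative sketch).
    """
    data = sorted(values)
    n = len(data)
    # Pick separators so that each bucket holds roughly n / num_buckets values.
    boundaries = [data[(i * n) // num_buckets] for i in range(num_buckets)]
    boundaries.append(data[-1] + 1)  # exclusive upper bound past the maximum
    counts = [
        bisect_left(data, boundaries[i + 1]) - bisect_left(data, boundaries[i])
        for i in range(num_buckets)
    ]
    return boundaries, counts


def estimate_sum(boundaries, counts, lo, hi):
    """Estimate the sum of all values v with lo <= v < hi.

    Assumption: values inside a bucket are uniformly distributed, so the
    overlapping part of a bucket contributes
    (count * overlap fraction) * (midpoint of the overlap).
    """
    total = 0.0
    for i, count in enumerate(counts):
        a, b = boundaries[i], boundaries[i + 1]
        x, y = max(a, lo), min(b, hi)
        if b <= a or y <= x:
            continue  # zero-width bucket or no overlap with the query range
        total += count * (y - x) / (b - a) * (x + y) / 2
    return total


# Example usage on a small synthetic data set (hypothetical data).
values = list(range(100))
boundaries, counts = build_equi_depth(values, 4)
estimate = estimate_sum(boundaries, counts, 10, 60)
exact = sum(v for v in values if 10 <= v < 60)
print(f"estimate={estimate}, exact={exact}")
```

In this sketch the estimation error depends entirely on where the separators fall, which is the quantity the abstract's construction methods aim to optimize for SUM queries.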

Keywords: Approximate query processing, Histogram, Big data

Paper URL: https://doi.org/10.1007/s10115-020-01450-7