Shrink: An OLAP operation for balancing precision and size of pivot tables

Authors:

Highlights:

Abstract

Information flooding may occur during an OLAP session when the user drills down a cube to a very fine-grained level: the huge number of facts returned makes them very hard to analyze in a pivot table. To overcome this problem, we propose a novel OLAP operation, called shrink, that balances data precision against data size when cubes are visualized as pivot tables. The shrink operation fuses slices of similar data and replaces them with a single representative slice, respecting the constraints imposed by dimension hierarchies, until either the size or the error of the result is smaller than a given threshold. Computing the shrink operation optimally has exponential complexity, so we present both a greedy algorithm based on agglomerative clustering, which returns a sub-optimal solution, and a branch-and-bound algorithm, which returns an optimal solution. Finally, we discuss experimental results that evaluate the shrink operation in terms of efficiency and effectiveness.
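To make the greedy variant more concrete, the following is a minimal Python sketch of agglomerative slice fusion. It is not the paper's algorithm: the function names (`sse`, `greedy_shrink`), the use of the mean slice as the representative, the sum of squared errors as the error measure, and the rule that only slices sharing the same parent may be fused are all assumptions made for illustration. The sketch also stops on a size threshold only, whereas the abstract allows an error threshold as well.

```python
from itertools import combinations


def sse(cluster):
    """Sum of squared errors between each slice and the cluster's mean slice."""
    n = len(cluster)
    mean = [sum(col) / n for col in zip(*cluster)]
    return sum((v - m) ** 2 for row in cluster for v, m in zip(row, mean))


def greedy_shrink(slices, parent, max_size):
    """Fuse slices greedily until at most max_size clusters remain.

    slices:   dict mapping a member name to its list of measure values
    parent:   dict mapping a member name to its parent in the hierarchy
    max_size: target number of slices in the result (hypothetical parameter)
    """
    def rows(names):
        return [slices[n] for n in names]

    # Start with one cluster per slice; each cluster remembers its parent.
    clusters = [([name], parent[name]) for name in slices]
    while len(clusters) > max_size:
        best = None
        for (i, (a, pa)), (j, (b, pb)) in combinations(enumerate(clusters), 2):
            if pa != pb:
                # Simplified hierarchy constraint (assumption): only siblings fuse.
                continue
            # Error increase caused by fusing the two clusters.
            cost = sse(rows(a + b)) - sse(rows(a)) - sse(rows(b))
            if best is None or cost < best[0]:
                best = (cost, i, j)
        if best is None:
            # No admissible fusion is left, so the size target is unreachable.
            break
        _, i, j = best
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [names for names, _ in clusters]
```

Called with four city slices, three of which share one parent, and `max_size=3`, the sketch fuses the two most similar sibling slices and leaves the others untouched. Because each step picks the locally cheapest fusion, the result is not guaranteed to minimize the total error, which is the trade-off the abstract attributes to the greedy approach.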

Keywords: OLAP, Hierarchical clustering, Approximate query answering, OLAM

Review history: Received 3 January 2014, Revised 7 April 2014, Accepted 3 July 2014, Available online 11 July 2014.

Paper URL: https://doi.org/10.1016/j.datak.2014.07.004