Deriving orthogonality to optimize the search for summary data

作者:

Highlights:

摘要

An effective optimization strategy for evaluating statistical queries is to use precomputed summary data on certain categories. An important step in this strategy is to compare categories for containment in order to decide whether the summary data on one category can be used to compute the summary on another. This paper studies optimization for such comparisons. A category in this paper is represented by a relation whose attributes are partitioned into pair-wise disjoint sets, each called a dimension. A category is said to be orthogonal if it is equal to the cross product of the projections of itself on all the dimensions, and k-partially orthogonal if it is the union of k orthogonal ones. Comparing k-partially orthogonal categories for containment is computationally much easier than comparing arbitrary categories if k is small and all the orthogonal subcategories are known. It is shown however that it is computationally intractable (NP-hard) to partition an arbitrarily given category into the smallest number of orthogonal subcategories. In order to avoid this intractable task but still take advantage of orthogonality, this paper investigates methods that derive orthogonality in categories which are results of relational queries, assuming the orthogonality in input categories is known. The methods are based on a careful examination of each relational operation and on certain auxiliary constructs for labelling orthogonal subcategories.

论文关键词:Statistical Database,On-Line Analytical Processing (OLAP),Summary Data,Query Optimization,Orthogonality,Relational Query

论文评审过程:Received 10 May 1996, Revised 11 November 1998, Available online 2 June 1999.

论文官网地址:https://doi.org/10.1016/S0306-4379(99)00004-6