Repairing inconsistent dimensions in data warehouses

Authors:

Abstract

A dimension in a data warehouse (DW) is a set of elements connected by a hierarchical relationship. The elements are used to view summaries of data at different levels of abstraction. To support efficient processing of such summaries, a dimension is usually required to satisfy different classes of integrity constraints. When the constraints properly capture the semantics of the DW data but the dimension does not satisfy them, the problem of repairing (correcting) the dimension arises. In this paper, we study the problem of repairing a dimension with respect to two main classes of integrity constraints: strictness and covering constraints. We introduce the notion of a minimal repair of a dimension: a new dimension that is consistent with the set of integrity constraints and is obtained by applying a minimal number of updates to the original dimension. We study the complexity of obtaining minimal repairs and show how they can be characterized using Datalog programs with weak constraints under the stable model semantics.
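To make the two constraint classes concrete, the following is a minimal Python sketch that detects violations in a toy dimension. It is not the paper's method. It assumes the commonly used readings of the constraints: strictness means an element reaches at most one ancestor in any given category, and covering means an element has a direct parent in every parent category of its own category. The paper's formal definitions may be more general. The store/city/region data and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy dimension: each element belongs to one category.
category = {
    "store1": "Store", "store2": "Store",
    "cityA": "City", "cityB": "City",
    "regionN": "Region", "regionS": "Region",
}

# Hypothetical hierarchy schema: category -> set of parent categories.
schema = {"Store": {"City"}, "City": {"Region"}, "Region": set()}

# Child/parent rollup edges between elements.
rollup = {
    ("store1", "cityA"),
    ("store1", "cityB"),   # store1 reaches two cities (strictness violation)
    ("cityA", "regionN"),
    ("cityB", "regionS"),
    # store2 has no city at all (covering violation)
}


def ancestors(element, rollup):
    """Return every element reachable from `element` via rollup edges."""
    parents = defaultdict(set)
    for child, parent in rollup:
        parents[child].add(parent)
    seen, stack = set(), [element]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def strictness_violations(category, rollup):
    """Pairs (element, category) where the element has >1 ancestor in that category."""
    violations = set()
    for element in category:
        per_category = defaultdict(set)
        for anc in ancestors(element, rollup):
            per_category[category[anc]].add(anc)
        for cat, members in per_category.items():
            if len(members) > 1:
                violations.add((element, cat))
    return violations


def covering_violations(category, schema, rollup):
    """Pairs (element, parent category) where the element lacks a direct parent."""
    violations = set()
    for element, cat in category.items():
        direct = {category[p] for child, p in rollup if child == element}
        for parent_cat in schema[cat]:
            if parent_cat not in direct:
                violations.add((element, parent_cat))
    return violations


strict_bad = strictness_violations(category, rollup)
cover_bad = covering_violations(category, schema, rollup)
```

A repair, in the sense of the abstract, would insert or delete rollup edges until both sets are empty. A minimal repair does so with as few such updates as possible. This sketch only detects the violations and does not compute repairs.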

Keywords: Data warehouses, Dimensions, Integrity constraints, Inconsistency, Datalog programs, Stable models

Review history: Received 5 June 2010, Revised 9 April 2012, Accepted 13 April 2012, Available online 4 July 2012.

Official paper URL: https://doi.org/10.1016/j.datak.2012.04.002