Ranking the sky: Discovering the importance of skyline points through subspace dominance relationships

Authors:

Abstract

Skyline queries help users make informed decisions over complex data by identifying a set of interesting points when several, often conflicting, criteria are considered. Unfortunately, as the dimensionality of the dataset grows, the skyline operator loses its discriminating power and returns a large fraction of the data. The sheer size of the result set hinders decision-making and motivates ranking the skyline points, so that users can retrieve the top-k skyline points instead of the whole skyline set. In this paper, we propose SKYRANK, a framework for ranking skyline points in the absence of a user-defined preference function, thereby identifying a small subset of the most interesting points of the skyline set. For this purpose, we define the skyline graph, which captures the dominance relationships between skyline points in different subsets of dimensions (subspaces). SKYRANK applies well-known authority-based ranking algorithms to the skyline graph and derives the importance of each skyline point from these subspace dominance relationships. Furthermore, we extend SKYRANK to handle top-k preference skyline queries when the user's preferences are available. Our experimental evaluation illustrates the complexity of the dominance relationships and the ranking ability of our framework.
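
The pipeline outlined in the abstract (skyline, subspace dominance, skyline graph, authority-based ranking) can be illustrated with a minimal Python sketch. This is not the SKYRANK implementation: the abstract does not specify how the skyline graph is built, so the edge direction (from a dominated point to the point that dominates it), the edge weight (the number of proper subspaces in which the dominance holds), the smaller-is-better convention, and all function names below are assumptions made purely for illustration. The ranking step is a plain PageRank power iteration.

```python
from itertools import combinations


def dominates(p, q, dims):
    """True if p dominates q on dims (assumption: smaller values are better)."""
    return all(p[d] <= q[d] for d in dims) and any(p[d] < q[d] for d in dims)


def skyline(points, dims):
    """Indices of the points that no other point dominates on dims."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p, dims)
                       for j, q in enumerate(points) if j != i)]


def skyline_graph(points, full_dims):
    """Assumed construction: edge u -> v, weighted by the number of proper
    subspaces in which skyline point v dominates skyline point u."""
    sky = skyline(points, full_dims)
    weights = {u: {} for u in sky}
    for size in range(1, len(full_dims)):
        for sub in combinations(full_dims, size):
            for u in sky:
                for v in sky:
                    if u != v and dominates(points[v], points[u], sub):
                        weights[u][v] = weights[u].get(v, 0) + 1
    return sky, weights


def pagerank(nodes, weights, damping=0.85, iters=100):
    """Weighted PageRank by power iteration; nodes without outgoing edges
    spread their score uniformly over all nodes."""
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            total = sum(weights[u].values())
            if total == 0:
                for v in nodes:
                    new[v] += damping * rank[u] / n
            else:
                for v, w in weights[u].items():
                    new[v] += damping * rank[u] * w / total
        rank = new
    return rank


def top_k_skyline(points, k):
    """Rank the skyline points by PageRank score and return the best k indices."""
    dims = tuple(range(len(points[0])))
    sky, weights = skyline_graph(points, dims)
    rank = pagerank(sky, weights)
    return sorted(sky, key=lambda u: rank[u], reverse=True)[:k]


# Hypothetical toy data: three points that excel in one dimension each,
# one balanced point, and one point that is dominated by the balanced point.
points = [(1, 9, 9), (9, 1, 9), (9, 9, 1), (2, 2, 2), (5, 5, 5)]
best = top_k_skyline(points, 1)
```

On this toy data the skyline consists of the first four points, and the balanced point (2, 2, 2) is ranked first because it dominates the others in more subspaces than they dominate it. Enumerating every subspace is exponential in the number of dimensions, so the sketch is only practical for small examples.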

Keywords: Skyline operator, Top-k skyline queries, PageRank, Dominance relationship

Review history: Received 18 November 2008, Revised 16 March 2010, Accepted 16 March 2010, Available online 31 March 2010.

Official paper URL: https://doi.org/10.1016/j.datak.2010.03.008