Thematic ranking of object summaries for keyword search

Authors:

Abstract

An Object Summary (OS) is a tree structure of tuples that summarizes the context of a particular Data Subject (DS) tuple. The OS has been used as a model for keyword search in relational databases: given a set of keywords, the objective is to identify the DS tuples relevant to the keywords, together with their corresponding OSs. However, a query may return a large number of OSs, which raises the problem of ranking them effectively and efficiently so that only the most important ones are presented to the user. In this paper, we propose a model that ranks OSs containing a set of identifying keywords (e.g., Chen) according to their relevance to a set of thematic keywords (e.g., Mining). We argue that effective thematic ranking of OSs should gracefully combine IR-style properties, authoritative ranking, and affinity. Our ranking problem is modeled and solved as a top-k group-by join; we propose an algorithm that computes the join efficiently by taking advantage of appropriate count statistics, and we compare it with baseline approaches. An experimental evaluation on the DBLP and TPC-H databases verifies the effectiveness and efficiency of our proposal.

Keywords: Keyword search, Object summaries, Top-k queries, Relational databases

Review history: Received 18 September 2016, Revised 24 June 2017, Accepted 18 August 2017, Available online 28 October 2017, Version of Record 5 February 2018.

Official paper URL: https://doi.org/10.1016/j.datak.2017.08.002