Cost models for distance joins queries using R-trees
作者:
Highlights:
•
摘要
The K-Closest-Pairs Query (K-CPQ), a type of distance join in spatial databases, discovers the K pairs of objects formed from two different datasets with the K smallest distances. Recently, branch-and-bound algorithms based on R-trees have been developed in order to answer K-CPQs efficiently. For query optimization purposes, analytical models are needed to estimate the processing cost of a specific query in order to evaluate alternative execution plans. In this paper, we combine techniques that have been used for the analysis of nearest neighbor and spatial join queries, and derive the performance cost (in terms of disk accesses) of K-CPQs using R-trees. Moreover, we present two interesting extensions of the cost model for K-CPQs, one exploiting the buffering management using R-trees and another for a second type of distance join, the so-called buffer queries. The proposed cost models are verified under a variety of distributions in 2-dimensional space on both synthetic and real datasets, shown to achieve accurate estimations of the measured experimental results.
论文关键词:Cost models,Distance join queries,R-trees,Buffer model,Performance analysis
论文评审过程:Received 25 September 2004, Revised 3 February 2005, Accepted 24 March 2005, Available online 26 April 2005.
论文官网地址:https://doi.org/10.1016/j.datak.2005.03.004