Holistic Join for Generalized Tree Patterns

作者:

Highlights:

摘要

We consider the problem of evaluating an XQuery query Q (involving only child and descendant axes) on an XML document D. D is stored on a disk and is read from there, in document order. Chen et al. [From Tree Patterns to Generalized Tree Patterns: on efficient evaluation of XQuery, Proceedings of International Conference on Very Large Data Bases (VLDB), 2003, pp. 237–248] presented an algorithm to convert Q (from a large fragment of XQuery) into a Generalized Tree Pattern GTP(Q), and a set J(Q) of value join conditions on its vertices. Evaluating Q on D reduces to finding the matches for GTP(Q) in D. We present an efficient algorithm for finding these matches. Excluding the computation of the value joins J(Q), our algorithm performs two linear passes over the data, and runs in O(d|Q|) memory space, where d denotes the depth of D; runtime and disk I/O are O(|Q∥D|). If separate input streams of document nodes for the individual vertices in GTP(Q) are available, our runtime and disk I/O are linear in the input size; this runtime and disk I/O are trivially optimal.

论文关键词:XML,XPath,XQuery,Generalized Tree Patterns,Query evaluation

论文评审过程:Received 28 June 2005, Revised 19 October 2006, Accepted 24 October 2006, Available online 18 December 2006.

论文官网地址:https://doi.org/10.1016/j.is.2006.10.008