Data-driven memory management for stream join

Authors:

Highlights:

Abstract

Memory management is a critical issue in stream processing that involves stateful operators such as join. Traditionally, the memory requirement of a stream join is query-driven: the query must explicitly define a window for each (potentially unbounded) input, and that window bounds the size of the buffer allocated for the stream. However, when the window size is not part of the intended query semantics, the output produced this way may be undesirable, because input characteristics are volatile. We observe that when streams are ordered or partially ordered, a data-driven memory management scheme can improve performance. In this work, we present a novel data-driven memory management scheme, called Window-Oblivious Join (WO-Join), which adaptively adjusts the state buffer size according to the input characteristics. Our performance study shows that, compared to the traditional Window-Join (W-Join), WO-Join is more robust to dynamic input and therefore produces higher-quality results at lower memory cost.

Keywords: Data stream, Stream join, Data-driven memory management

Review history: Received 29 January 2009, Accepted 2 February 2009, Available online 12 February 2009.

Official paper URL: https://doi.org/10.1016/j.is.2009.02.001