Cost-based vectorization of instance-based integration processes

Authors:

Abstract

Integration processes are workflow-based integration tasks. Their inefficiency is often caused by low resource utilization and significant waiting times for external systems. To overcome these problems, we previously proposed the concept of process vectorization, in which instance-based integration processes are transparently executed with the pipes-and-filters execution model. The term vectorization is used in the sense of processing a sequence (vector) of messages by one standing process. Although process vectorization has been shown to improve throughput significantly, the concept has two major drawbacks. First, the theoretical performance of a vectorized integration process depends mainly on the performance of the most cost-intensive operator. Second, the practical performance depends strongly on the number of threads used and thus on the number of operators. In this paper, we present an advanced optimization approach that addresses both problems. We generalize the vectorization problem and explain how to vectorize process plans in a cost-based manner, taking into account the cost of the individual operators in the form of their execution time. Because the exhaustive computation approach has exponential time complexity, we also provide a heuristic algorithm with linear time complexity. Furthermore, we explain how to apply the general cost-based vectorization to multiple process plans, and we discuss periodic re-optimization. Our evaluation shows that message throughput can be significantly increased compared to both the instance-based execution and the rule-based vectorized execution.
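As an illustration of the idea, the following is a minimal sketch and not the algorithm from the paper. It assumes a purely sequential plan whose operator execution times are known, and it assumes that each group of adjacent operators (here called a bucket, a hypothetical name) runs in one thread. Since the slowest stage bounds pipeline throughput, a single linear pass can merge neighbours as long as a bucket's summed cost stays at or below the cost of the most expensive operator, which reduces the thread count without raising the bottleneck.

```python
from typing import List


def greedy_buckets(costs: List[float]) -> List[List[int]]:
    """Group adjacent operators (by index) into buckets.

    Assumption: a bucket may grow while its summed cost does not
    exceed the cost of the most expensive single operator.
    Runs in one pass over the operators (linear time).
    """
    if not costs:
        return []
    limit = max(costs)          # cost of the bottleneck operator
    buckets = [[0]]             # first operator opens the first bucket
    load = costs[0]             # summed cost of the current bucket
    for i in range(1, len(costs)):
        if load + costs[i] <= limit:
            buckets[-1].append(i)   # merge into the current bucket
            load += costs[i]
        else:
            buckets.append([i])     # open a new bucket (new thread)
            load = costs[i]
    return buckets


def bottleneck(costs: List[float], buckets: List[List[int]]) -> float:
    """Cost of the slowest bucket, which bounds pipeline throughput."""
    return max(sum(costs[i] for i in b) for b in buckets)


# Hypothetical operator execution times for a six-operator plan.
costs = [1.0, 2.0, 5.0, 1.0, 1.0, 3.0]
plan = greedy_buckets(costs)
print(plan, bottleneck(costs, plan))
```

For these hypothetical costs, six operators are placed in three buckets while the slowest bucket still costs no more than the most expensive single operator. A full cost-based approach would also have to handle non-sequential plans and changing cost estimates, which this sketch leaves out.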

Keywords: Cost-based vectorization, Operator-aware vectorization, Integration processes, Pipelining, Throughput optimization, Pipes and filters, Instance-based

Review history: Available online 8 July 2010.

Official paper URL: https://doi.org/10.1016/j.is.2010.06.007