Maximizing data locality in distributed systems

Authors:

Abstract

The effectiveness of a distributed system hinges on how tasks and data are assigned to the underlying system resources. Moreover, today's large-scale distributed systems must accommodate heterogeneity in both the offered load and the makeup of the available storage and compute capacity. The ideal resource assignment must balance the utilization of the underlying system against the loss of locality incurred when individual tasks or data objects are fragmented among several servers. In this paper we describe this locality-maximizing placement problem and show that finding an optimal solution is NP-hard. We then describe a polynomial-time algorithm that generates a placement within an additive constant of two of optimal.
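To make the trade-off concrete, the following is a minimal sketch of a naive greedy placement under one possible reading of the problem. It is not the algorithm from the paper. The model is an assumption: servers have integer capacities, objects have integer sizes, an object may be split across servers, and the loss of locality is counted as the number of extra pieces created. The names `greedy_fragmenting_placement` and `fragmentation` are hypothetical.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, List[Tuple[str, int]]]


def greedy_fragmenting_placement(
    sizes: Dict[str, int], capacities: Dict[str, int]
) -> Placement:
    """Fill servers one after another, splitting an object only when it
    overflows the current server. Illustrative heuristic, not the paper's
    algorithm; sizes and capacities are assumed to be non-negative integers.
    """
    if sum(sizes.values()) > sum(capacities.values()):
        raise ValueError("total demand exceeds total capacity")

    placement: Placement = {name: [] for name in sizes}
    # Largest servers and largest objects first (a common heuristic choice).
    servers = sorted(capacities.items(), key=lambda kv: -kv[1])
    idx = 0
    free = servers[0][1] if servers else 0

    for name, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        remaining = size
        while remaining > 0:
            # Move on to the next server with spare capacity.
            while free == 0:
                idx += 1
                free = servers[idx][1]
            piece = min(remaining, free)
            placement[name].append((servers[idx][0], piece))
            remaining -= piece
            free -= piece
    return placement


def fragmentation(placement: Placement) -> int:
    """Count extra pieces: an object stored whole contributes zero."""
    return sum(len(pieces) - 1 for pieces in placement.values() if pieces)
```

Because the sketch opens a new server only when the current one is full, each server boundary splits at most one object, so the number of extra pieces is at most the number of servers minus one. It makes no attempt to reach the additive bound stated in the abstract.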

Keywords: Bin packing, Distributed systems, Combinatorial algorithms, Approximation algorithms

Review history: Received 21 May 2004, Revised 1 June 2006, Available online 24 August 2006.

Paper URL: https://doi.org/10.1016/j.jcss.2006.07.001