Data locality optimization based on data migration and hotspots prediction in geo-distributed cloud environment

Authors:

Abstract

With the explosive growth of data-intensive mobile, social, commercial, and industrial applications, the geo-distributed cloud has become the main trend in cloud computing owing to its more flexible scalability, stronger stability, lower latency, and more diverse services. Because network bandwidth is limited, communication across geographically distributed data centers typically suffers from wide-area latency, which significantly degrades system performance. Improving data locality is an effective way to address this problem. To provide flexible cloud computing services for data-intensive applications while exploiting the advantages of the geo-distributed cloud computing paradigm, this paper proposes a data locality optimization method based on data migration (DLO-Migrate) and a data locality optimization algorithm based on hotspots prediction (DLO-Predict) to reduce data access delay in geo-distributed cloud environments. In the DLO-Migrate method, tasks are assigned according to node locality, and the data accessed by non-node-local tasks is migrated in advance using idle network bandwidth. In the DLO-Predict algorithm, which takes a cloud-level view of data locality, hot files are predicted and synchronized periodically among the data centers of the geo-distributed cloud during information interaction. Extensive experimental results show that, compared with baseline algorithms, the proposed algorithms improve the data locality of the geo-distributed cloud and substantially reduce job completion time.
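
The two ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no algorithmic detail, so the function names (`assign_tasks`, `predict_hot_files`), the slot-based node model, and the use of a plain access-frequency count as the "hotspot predictor" are all assumptions made for illustration only.

```python
from collections import Counter


def assign_tasks(tasks, replicas, free_slots):
    """Locality-first assignment with advance migration (illustrative only).

    tasks:      {task_id: block_id} the data block each task reads
    replicas:   {block_id: set of node names holding a copy}
    free_slots: {node name: number of free task slots}
    Returns (assignment, migrations), where migrations lists the
    (block_id, target_node) copies to start before the task runs.
    """
    slots = dict(free_slots)          # work on a copy
    assignment, migrations, deferred = {}, [], []

    # Pass 1: place each task on a node that already stores its block.
    for task, block in tasks.items():
        local = [n for n in sorted(replicas.get(block, ())) if slots.get(n, 0) > 0]
        if local:
            assignment[task] = local[0]
            slots[local[0]] -= 1
        else:
            deferred.append((task, block))

    # Pass 2: non-node-local tasks go to the node with the most free slots,
    # and their input block is scheduled for migration ahead of execution.
    for task, block in deferred:
        candidates = [n for n in sorted(slots) if slots[n] > 0]
        if not candidates:
            continue                  # no capacity left; task stays unassigned
        node = max(candidates, key=lambda n: slots[n])
        assignment[task] = node
        slots[node] -= 1
        migrations.append((block, node))

    return assignment, migrations


def predict_hot_files(access_log, top_k):
    """Assumed stand-in predictor: the top_k most frequently accessed files."""
    counts = Counter(access_log)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_k]]


if __name__ == "__main__":
    tasks = {"t1": "b1", "t2": "b2", "t3": "b1"}
    replicas = {"b1": {"n1"}, "b2": {"n2"}}
    free_slots = {"n1": 1, "n2": 1, "n3": 2}
    print(assign_tasks(tasks, replicas, free_slots))
    print(predict_hot_files(["a", "b", "a", "c", "b", "a"], top_k=2))
```

In this sketch, a task that cannot run next to its data still receives a slot immediately, and the returned migration list is what a scheduler would hand to a background transfer that uses idle bandwidth. The files returned by `predict_hot_files` are the candidates that would be synchronized across data centers in each period.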

Keywords: Geo-distributed cloud, Data locality, Data migration, Hotspots prediction

Review history: Received 20 April 2018, Revised 29 November 2018, Accepted 3 December 2018, Available online 17 December 2018, Version of Record 7 January 2019.

Official URL: https://doi.org/10.1016/j.knosys.2018.12.002