Reducing partition skew on MapReduce: an incremental allocation approach

Authors: Zhuo Wang, Qun Chen, Bo Suo, Wei Pan, Zhanhuai Li

Abstract

MapReduce, a parallel computational model, is widely used for processing big data on distributed clusters. Because it consists of alternating map and reduce phases, MapReduce must shuffle the intermediate data generated by mappers to reducers. The key challenge in balancing the workload on MapReduce is to reduce partition skew among reducers without detailed information about the distribution of the mapped data.
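
To make the notion of partition skew concrete, the following minimal sketch shows how a plain hash partitioner can overload one reducer when a single key dominates the mapped data. It illustrates the problem only and is not the incremental allocation approach proposed in the paper. The function names, the CRC32-based partitioner, the skew measure (largest load divided by mean load), and the sample data are all assumptions made for illustration.

```python
import zlib


def hash_partition(key: str, num_reducers: int) -> int:
    """Assign a key to a reducer with a deterministic hash (CRC32, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(pairs, num_reducers: int):
    """Count how many intermediate (key, value) pairs each reducer receives."""
    loads = [0] * num_reducers
    for key, _value in pairs:
        loads[hash_partition(key, num_reducers)] += 1
    return loads


def skew_ratio(loads) -> float:
    """Largest reducer load divided by the mean load; 1.0 means perfectly balanced."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


# Hypothetical mapped data: one hot key carries 90% of the records.
pairs = [("hot", 1)] * 900 + [(f"k{i}", 1) for i in range(100)]
loads = reducer_loads(pairs, num_reducers=4)

print("records per reducer:", loads)
print("skew ratio:", round(skew_ratio(loads), 2))
```

Every record with the hot key lands on the same reducer, so that reducer receives at least 900 of the 1000 records no matter how the remaining keys are spread. A static partitioner cannot avoid this without knowing the key distribution in advance, which is the difficulty the abstract describes.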

Keywords: incremental partitioning, data balance, MapReduce

Official paper URL: https://doi.org/10.1007/s11704-018-6586-2