PPR-partitioning: a distributed graph partitioning algorithm based on the personalized PageRank vectors in vertex-centric systems

作者:Nasrin Mazaheri Soudani, Afsaneh Fatemi, Mohammadali Nematbakhsh

摘要

Relations among data items can be modeled with graphs in most of big data sets such as social networks’ data. This modeling creates big graphs with many vertices and edges. Balanced k-way graph partitioning is a common problem with big graphs. It has many applications in several fields. There are many approximate solutions for this problem; however, most of them do not have enough scalability for big graph partitioning and cannot be executed in a distributed manner. Vertex-centric model has been introduced recently as a scalable distributed processing method for big graphs. There are a few methods for graph partitioning based on this model. Existing approaches only consider one-step neighbors of vertices for graph partitioning and do not consider neighbors with higher steps. In this paper, a distributed method is introduced based on vertex-centric model for balanced k-way graph partitioning. This method applies the personalized PageRank vectors of vertices and partitions to decide how vertices are joined partitions. This method has been implemented in the Giraph system. The proposed method has been evaluated with several synthetic and real graphs. Experimental results have shown that this method has scalability for partitioning big graphs. It was also found that this method produces partitions with higher quality compared to the state-of-the-art stream-based methods and distributed methods based on vertex-centric programming model. Its result is close to the results of Metis method.

论文关键词:Graph partitioning, Big graphs, Personalized PageRank, Vertex-centric systems

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10115-019-01328-3