Density-based data partitioning strategy to approximate large-scale subgraph mining

Authors:

Highlights:

Abstract

Graph mining approaches have recently become very popular, especially in domains such as bioinformatics, chemoinformatics and social networks. One of the most challenging tasks in this area is frequent subgraph discovery, and the rapidly increasing size of existing graph databases makes it more pressing. There is therefore an urgent need for efficient and scalable approaches to frequent subgraph discovery. In this paper, we propose a novel approach for large-scale subgraph mining that uses a density-based partitioning technique on top of the MapReduce framework. Our partitioning aims to balance the computational load across a collection of machines. We show experimentally that our approach significantly decreases the execution time and scales the subgraph discovery process to large graph databases.
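The abstract does not spell out the partitioning procedure, so the following is only a minimal sketch of what a density-based split could look like, not the authors' method. It assumes the standard density of a simple undirected graph, 2|E| / (|V|(|V| - 1)), and a hypothetical strategy that sorts graphs by density and deals them round-robin into partitions, so that each partition receives a similar mix of sparse and dense graphs. The names `Graph`, `density` and `density_partition` are illustrative assumptions.

```python
from typing import List, Tuple

# Assumed representation: (number of vertices, list of undirected edges).
Graph = Tuple[int, List[Tuple[int, int]]]


def density(graph: Graph) -> float:
    """Density of a simple undirected graph: 2|E| / (|V| (|V| - 1))."""
    n, edges = graph
    if n < 2:
        return 0.0  # density is undefined for fewer than 2 vertices; use 0
    return 2.0 * len(edges) / (n * (n - 1))


def density_partition(graphs: List[Graph], k: int) -> List[List[Graph]]:
    """Hypothetical balancing scheme: sort by density, then deal round-robin.

    Each of the k partitions ends up with graphs drawn from across the
    whole density range, which is one way to even out mining cost.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(graphs, key=density)
    parts: List[List[Graph]] = [[] for _ in range(k)]
    for i, g in enumerate(ordered):
        parts[i % k].append(g)
    return parts
```

In a MapReduce setting, each resulting partition would be handed to one mapper that runs a subgraph miner locally; that step is outside the scope of this sketch.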

Keywords: Frequent subgraph mining, Graph partitioning, Graph density, MapReduce, Cloud computing

Review history: Available online 14 September 2013.

Paper URL: https://doi.org/10.1016/j.is.2013.08.005