On the complexity of Newman's community finding approach for biological and social networks

Authors:

Highlights:

Abstract

Given a graph of interactions, a module (also called a community or cluster) is a subset of nodes whose fitness is a function of the statistical significance of the pairwise interactions of nodes in the module. The topic of this paper is a model-based community finding approach, commonly referred to as modularity clustering, that was originally proposed by Newman (Leicht and Newman, 2008 [25]) and has since become extremely popular in practice (e.g., see Agarwal and Kempe, 2008 [1], Guimerà et al., 2007 [20], Newman, 2006 [28], Newman and Girvan, 2004 [30], Ravasz et al., 2002 [32]). Various heuristic methods are currently employed to search for an optimal solution. However, as observed in Agarwal and Kempe (2008) [1], the exact computational complexity of this approach is still largely unknown. To this end, we initiate a systematic study of the computational complexity of modularity clustering. Because the modularity function is quadratic, its value must be studied separately on sparse graphs and on dense graphs. Our main results include a (1+ε)-inapproximability for dense graphs and a logarithmic approximation for sparse graphs. We use several combinatorial properties of modularity to obtain these results. These are the first non-trivial approximability results beyond the NP-hardness results in Brandes et al. (2007) [10].
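
The abstract refers to the modularity function without stating it. As a rough illustration only, the sketch below evaluates the standard Newman-Girvan modularity of a given partition, assuming an undirected, unweighted graph without self-loops; the paper's own definition (which cites Leicht and Newman for the directed case) may differ in detail. The function name `modularity` and the example graph are hypothetical and not taken from the paper.

```python
def modularity(edges, communities):
    """Modularity of a partition, assuming the standard undirected,
    unweighted Newman-Girvan definition (an assumption, not the paper's text):
        Q = sum over communities c of  m_c / m  -  (D_c / (2m))^2
    where m is the number of edges, m_c the number of edges inside c,
    and D_c the sum of the degrees of the nodes in c."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Map each node to the index of its community.
    label = {node: idx for idx, group in enumerate(communities) for node in group}

    # Count the edges whose two endpoints lie in the same community.
    internal = [0] * len(communities)
    for u, v in edges:
        if label[u] == label[v]:
            internal[label[u]] += 1

    # Sum of node degrees per community; squaring it is the quadratic term.
    total_degree = [sum(degree.get(n, 0) for n in group) for group in communities]

    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )


# Hypothetical example: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(edges, [{0, 1, 2}, {3, 4, 5}])       # one community per triangle
q_single = modularity(edges, [{0, 1, 2, 3, 4, 5}])        # everything in one community
```

Under these assumptions the two-triangle split scores 5/14 while the single-community partition scores 0. Evaluating one partition is cheap; the hardness results discussed in the abstract concern finding a partition that maximizes this value.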

Keywords: Community detection, Modularity clustering, Approximation algorithms, Approximation hardness, Social networks, Biological networks

Review history: Received 7 February 2011, Revised 6 November 2011, Accepted 10 April 2012, Available online 11 April 2012.

Official paper URL: https://doi.org/10.1016/j.jcss.2012.04.003