Privacy preserving release of blogosphere data in the presence of search engines

Authors:

Highlights:

Abstract

The users registered on a blogging platform, together with the subscriptions among them, form a social network with non-symmetric relations whose data can be modeled as a directed graph. Releasing such data for scientific analysis requires pre-processing to ensure that no private information about individuals is disclosed. The measures to be taken depend on the prior structural knowledge a dishonest analyst is assumed to have. In this paper, that prior knowledge is the ordering of blogs by PageRank relevance, which can be obtained by querying the blogging platform's search engine. After analyzing this scenario, the n-rank confusion model is proposed. Experimental results show that the model achieves a high level of privacy protection while largely preserving the structural parameters of the directed graph data.
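
To make the assumed adversary knowledge concrete, the following minimal sketch computes a PageRank ordering for a small directed subscription graph. It is a textbook power iteration and is not taken from the paper: the function names, the damping factor of 0.85, the toy graph, and the convention that an edge points from the subscriber to the subscribed blog are all assumptions made for illustration. The n-rank confusion model itself is not reproduced here.

```python
def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank on a directed graph.

    nodes: list of node labels; edges: list of (source, target) pairs.
    The damping value and iteration count are illustrative assumptions.
    """
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def rank_order(rank):
    """Return nodes sorted by decreasing score, ties broken by label."""
    return sorted(rank, key=lambda v: (-rank[v], v))


# Hypothetical toy blogosphere: an edge (x, y) means blog x subscribes to y.
blogs = ["a", "b", "c", "d"]
subscriptions = [("b", "a"), ("c", "a"), ("d", "a"), ("a", "b")]

scores = pagerank(blogs, subscriptions)
ordering = rank_order(scores)
print(ordering)
```

In this toy graph, blogs `c` and `d` have no subscribers and receive identical scores, so a published ordering alone would not distinguish them. An analyst who knows only such an ordering learns relative positions, not the scores themselves.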

Keywords: Blogosphere, Directed graph, PageRank, Privacy

Article history: Received 14 February 2012; revised 28 November 2012; accepted 10 January 2013; available online 28 February 2013.

Official paper URL: https://doi.org/10.1016/j.ipm.2013.01.002