Matching user identities across social networks with limited profile data

作者:Ildar Nurgaliev, Qiang Qu, Seyed Mojtaba Hosseini Bamakan, Muhammad Muzammal

摘要

Privacy preservation is a primary concern in social networks which employ a variety of privacy preservations mechanisms to preserve and protect sensitive user information including age, location, education, interests, and others. The task of matching user identities across different social networks is considered a challenging task. In this work, we propose an algorithm to reveal user identities as a set of linked accounts from different social networks using limited user profile data, i.e., user-name and friendship. Thus, we propose a framework, ExpandUIL, that includes three standalone algorithms based on (i) the percolation graph matching in ExpandFullName algorithm, (ii) a supervised machine learning algorithm that works with the graph embedding, and (iii) a combination of the two, ExpandUserLinkage algorithm. The proposed framework as a set of algorithms is significant as, (i) it is based on the network topology and requires only name feature of the nodes, (ii) it requires a considerably low initial seed, as low as one initial seed suffices, (iii) it is iterative and scalable with applicability to online incoming stream graphs, and (iv) it has an experimental proof of stability over a real ground-truth dataset. Experiments on real datasets, Instagram and VK social networks, show upto 75% recall for linked accounts with 96% accuracy using only one given seed pair.

论文关键词:social networks, user identity linkage, graph structure learning, maximum subgraph matching, graph percolation

论文评审过程:

论文官网地址:https://doi.org/10.1007/s11704-019-8235-9