A novel framework for detecting social bots with deep neural networks and active learning

Abstract

Microblogging is a popular form of online social network (OSN) that helps users obtain and share news and information. Nevertheless, it is filled with a huge number of social bots that significantly disrupt the normal order of OSNs. Sina Weibo, one of the most popular Chinese OSNs in the world, is also seriously affected by social bots. As social bots in Sina Weibo continue to develop, they are becoming increasingly indistinguishable from normal users, which poses greater challenges for social bot detection. Firstly, it is difficult to extract the features of social bots completely. Secondly, large-scale collection and labeling of user data are extremely hard. Thirdly, the performance of classical classification approaches applied to social bot detection is not good enough. Therefore, this paper proposes a novel framework for detecting social bots in Sina Weibo based on deep neural networks and active learning (DABot). Specifically, 30 features from four categories, namely metadata-based, interaction-based, content-based, and timing-based, are extracted to distinguish social bots from normal users. Nine of these features are newly proposed in this paper. Moreover, active learning is employed to efficiently expand the labeled data. Then, a new deep neural network model called RGA, which combines a residual network (ResNet), a bidirectional gated recurrent unit (BiGRU), and an attention mechanism, is built to detect social bots. The performance evaluation shows that DABot is more effective than the state-of-the-art baselines, achieving an accuracy of 0.9887.
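
To illustrate how active learning can expand labeled data efficiently, the following is a minimal sketch of one common query strategy, entropy-based uncertainty sampling. The abstract does not state which query strategy DABot uses, so the strategy, the function names `binary_entropy` and `select_queries`, and the example probabilities are all assumptions made for illustration, not the authors' implementation.

```python
import math


def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) prediction; it peaks at p = 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_queries(bot_probs, budget):
    """Return the ids of the `budget` unlabeled accounts whose predictions
    are most uncertain, i.e. the ones most worth sending to a human annotator."""
    ranked = sorted(
        bot_probs,
        key=lambda uid: binary_entropy(bot_probs[uid]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical classifier outputs: account id -> predicted probability of "bot".
bot_probs = {"u1": 0.98, "u2": 0.52, "u3": 0.10, "u4": 0.45, "u5": 0.70}

# Accounts with probabilities closest to 0.5 are selected for manual labeling.
queries = select_queries(bot_probs, budget=2)
print(queries)
```

In such a loop, the selected accounts would be labeled manually, added to the training set, and the classifier retrained before the next round of selection.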

Keywords: Online social networks, Social bots, Sina Weibo, Deep neural networks, Active learning

Article history: Received 28 May 2020, Revised 26 September 2020, Accepted 12 October 2020, Available online 22 October 2020, Version of Record 8 November 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.106525