On the efficacy of old features for the detection of new bots

Authors:

Highlights:

Abstract

For more than a decade now, academics and online platform administrators have been studying solutions to the problem of bot detection. Bots are computer algorithms whose use is far from benign: malicious bots are purposely created to distribute spam, promote public figures and, ultimately, bias public opinion. To fight the bot invasion of our online ecosystem, several approaches have been implemented, mostly based on (supervised and unsupervised) classifiers, which adopt a wide variety of account features, from the simplest to the most expensive to extract from the raw data obtainable through the Twitter public APIs. In this exploratory study, using Twitter as a benchmark, we compare the performance of four state-of-the-art feature sets in detecting novel bots: one of the output scores of the popular bot detector Botometer, which considers more than 1,000 features of an account to make its decision; two feature sets based on the account profile and timeline; and the information about the Twitter client from which the user tweets. The results of our analysis, conducted on six recently released datasets of Twitter accounts, hint at the possible use of general-purpose classifiers and cheap-to-compute account features for the detection of evolved bots.

Keywords: Novel bots, Bot classifiers, Performance evaluations, Feature selection, Twitter

Review history: Received 31 January 2021, Revised 30 June 2021, Accepted 1 July 2021, Available online 27 July 2021, Version of Record 27 July 2021.

DOI: https://doi.org/10.1016/j.ipm.2021.102685