Prediction of drive-by download attacks on Twitter

Abstract

The popularity of Twitter for information discovery, coupled with the automatic shortening of URLs to save space given the 140-character limit, gives cybercriminals an opportunity to obfuscate the URL of a malicious Web page within a tweet. Once the URL is obfuscated, the cybercriminal can lure a user into clicking on it with enticing text and images, before carrying out a cyber attack using a malicious Web server. This is known as a drive-by download. In a drive-by download, a user's computer system is infected while interacting with the malicious endpoint, often without the user being made aware that the attack has taken place. An attacker can gain control of the system by exploiting unpatched system vulnerabilities, and this form of attack currently represents one of the most common methods employed. In this paper we build a machine learning model using machine activity data and tweet metadata to move beyond post-execution classification of such URLs as malicious, and to predict whether a URL will be malicious with an F-measure of 0.99 (using 10-fold cross-validation) and 0.833 (using an unseen test set) at 1 s into the interaction with the URL. This provides a basis from which to kill the connection to the server before an attack has completed, proactively blocking and preventing an attack rather than reacting and repairing at a later date.
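For readers unfamiliar with the F-measure reported above, the following is a minimal sketch of how such a score is computed from classification counts. It assumes the common F1 form (the harmonic mean of precision and recall); the abstract does not state which variant the authors used. The function name `f_measure` and the counts in the usage line are hypothetical and are not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall.

    tp: malicious URLs correctly flagged (true positives)
    fp: benign URLs wrongly flagged (false positives)
    fn: malicious URLs missed (false negatives)
    """
    if tp == 0:
        # With no true positives, precision and recall are both zero
        # (or undefined), so the score is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; not the paper's data.
print(round(f_measure(tp=45, fp=5, fn=15), 3))
```

A score near 1 means the classifier both flags few benign URLs and misses few malicious ones, which is why a single F-measure is a convenient summary for this kind of detection task.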

Keywords: Cyber security, Drive-by download, Malware, Machine learning, Web security

Review history: Received 8 August 2017, Revised 6 February 2018, Accepted 12 February 2018, Available online 24 February 2018, Version of Record 7 March 2019.

Paper URL: https://doi.org/10.1016/j.ipm.2018.02.003