Machine learning based phishing detection from URLs
Authors:
Highlights:
• Uses seven different classification algorithms and NLP-based features.
• A large URL data set is produced and shared (36,400 legitimate and 37,175 phishing URLs).
• Real-time and language-independent classification algorithms.
• Feature-rich classifiers with Word Vectors, NLP-based and Hybrid features.
• The proposed approach reaches a 97.98% accuracy rate.
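
The highlights mention NLP-based features computed from the URL string alone. The listing does not enumerate the paper's actual feature set, so the following is only a minimal sketch of what lexical URL features could look like. The function name `url_features` and every feature in it are illustrative assumptions, not the authors' method.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Compute a few hand-picked lexical features from a URL string.

    Hypothetical feature set for illustration only; it is not the
    feature list used in the paper.
    """
    host = urlparse(url).netloc
    # Split the whole URL into alphanumeric word tokens.
    words = [w for w in re.split(r"[^A-Za-z0-9]+", url) if w]
    return {
        "url_length": len(url),
        "host_length": len(host),
        "dot_count": host.count("."),
        "digit_count": sum(ch.isdigit() for ch in url),
        "word_count": len(words),
        "longest_word_length": max((len(w) for w in words), default=0),
        "has_at_symbol": "@" in url,
        "has_hyphen_in_host": "-" in host,
    }


if __name__ == "__main__":
    # Example URL is made up for demonstration.
    print(url_features("http://secure-login.example.com/verify?id=123"))
```

A feature dictionary of this kind could be turned into a numeric vector and passed to any standard classifier. Because it relies only on the URL text, it needs no page download, which is in line with the real-time and language-independent claims above.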
Keywords: Cyber security, Phishing attack, Machine learning, Classification algorithms, Cyber attack detection
Review history: Received 7 May 2018, Revised 25 July 2018, Accepted 12 September 2018, Available online 18 September 2018, Version of Record 12 October 2018.
Official paper URL: https://doi.org/10.1016/j.eswa.2018.09.029