Soft computing based imputation and hybrid data and text mining: The case of predicting the severity of phishing alerts

Authors:

Highlights:

Abstract

In this paper, we employ a novel two-stage soft computing approach for data imputation to assess the severity of phishing attacks. The imputation method combines the K-means algorithm and a multilayer perceptron (MLP) working in tandem. The hybrid is applied to replace the missing values in the financial data that are used to predict the severity of phishing attacks in financial firms. After imputing the missing values, we mine the firms' financial data together with the structured form of the textual data, using an MLP, a probabilistic neural network (PNN), and decision trees (DT) separately. Of particular significance are the overall classification accuracies of 81.80%, 82.58%, and 82.19% obtained using MLP, PNN, and DT, respectively. These results outperform those of prior research. The classification accuracies for the three risk levels of phishing attacks obtained with the MLP, PNN, and DT classifiers are also superior to those reported previously.
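The abstract does not spell out how the two stages interact, so the following is only a minimal sketch of what a K-means-based first stage might look like. It assumes that centroids are learned from complete records and that a missing value is filled from the centroid nearest on the observed features. The function names `kmeans` and `impute`, the deterministic seeding, and the toy data are all hypothetical and are not taken from the paper. The MLP refinement stage is not shown.

```python
import math

def kmeans(rows, k, iters=50):
    """Plain Lloyd's K-means on complete rows; the first k rows seed the centroids."""
    centroids = [list(r) for r in rows[:k]]
    for _ in range(iters):
        # Assignment step: group each row with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda c: math.dist(r, centroids[c]))
            clusters[nearest].append(r)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def impute(row, centroids):
    """Fill None entries from the centroid nearest on the observed features only."""
    observed = [i for i, v in enumerate(row) if v is not None]

    def partial_dist(c):
        return math.sqrt(sum((row[i] - c[i]) ** 2 for i in observed))

    nearest = min(centroids, key=partial_dist)
    return [nearest[i] if v is None else v for i, v in enumerate(row)]

# Toy data (hypothetical): two well-separated groups of complete records.
complete = [[1.0, 2.0], [8.0, 9.0], [1.2, 1.8], [8.2, 9.2]]
centroids = kmeans(complete, k=2)

# A record with its second feature missing.
filled = impute([8.1, None], centroids)
```

On this toy data the incomplete record falls closest to the second group, so its missing feature is filled with a value of about 9.1. In a two-stage scheme such an estimate would presumably serve as the starting point that a trained MLP then refines.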

Keywords: Data imputation, K-means clustering, Multilayer perceptron, Phishing alerts, Probabilistic neural networks, Text mining

Publication history: Available online 3 March 2012.

Official URL: https://doi.org/10.1016/j.eswa.2012.02.138