Supervised learning model for identifying illegal activities in Bitcoin

作者:Pranav Nerurkar, Sunil Bhirud, Dhiren Patel, Romaric Ludinard, Yann Busnel, Saru Kumari

摘要

Since its inception in 2009, Bitcoin is mired in controversies for providing a haven for illegal activities. Several types of illicit users hide behind the blanket of anonymity. Uncovering these entities is key for forensic investigations. Current methods utilize machine learning for identifying these illicit entities. However, the existing approaches only focus on a limited category of illicit users. The current paper proposes to address the issue by implementing an ensemble of decision trees for supervised learning. More parameters allow the ensemble model to learn discriminating features that can categorize multiple groups of illicit users from licit users. To evaluate the model, a dataset of 1216 real-life entities on Bitcoin was extracted from the Blockchain. Nine Features were engineered to train the model for segregating 16 different licit-illicit categories of users. The proposed model provided a reliable tool for forensic study. Empirical evaluation of the proposed model vis-a-vis three existing benchmark models was performed to highlight its efficacy. Experiments showed that the specificity and sensitivity of the proposed model were comparable to other models. Due to higher parameters of the ensemble tree model, the classification accuracy was 0.91, with 95% CI - 0.8727, 0.9477. This was better than SVM and Logistic Regression, the two popular models in the literature and comparable to the Random Forest and XGBOOST model. CPU and RAM utilization were also monitored to demonstrate the usefulness of the proposed work for real-world deployment. RAM utilization for the proposed model was higher by 30-45% compared to the other three models. Hence, the proposed model is resource-intensive as it has higher parameters than the other three models. Higher parameters also result in higher accuracy of predictions.

论文关键词:Bitcoin, Fraud detection, Exploratory data analysis

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-020-02048-w