Impact of benign sample size on binary classification accuracy
Authors:
Highlights:
• We propose a metric for the accuracy degradation caused by increasing the number of benign samples.
• Increasing the test benign sample size tenfold decreased the F1 score by 0.293.
• Using sufficient benign training samples mitigates accuracy degradation.
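The effect in the second highlight can be illustrated with a small arithmetic sketch. It is not the metric proposed in the paper; it only shows why F1 tends to fall as the benign test set grows. It assumes a classifier whose true positive rate and false positive rate stay fixed, and the sample counts and rates below are hypothetical values chosen for illustration, not results from the paper.

```python
def f1_score(tp: float, fp: float, fn: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_for_benign_size(n_malicious: int, n_benign: int,
                       tpr: float, fpr: float) -> float:
    """Expected F1 for the malicious class at fixed TPR and FPR.

    Assumption: the classifier's per-class error rates do not change
    with the size of the test set.
    """
    tp = tpr * n_malicious   # malicious samples correctly flagged
    fn = n_malicious - tp    # malicious samples missed
    fp = fpr * n_benign      # benign samples wrongly flagged
    return f1_score(tp, fp, fn)


# Hypothetical setting: 1,000 malicious test samples, TPR 0.95, FPR 0.05.
base = f1_for_benign_size(1000, 1000, tpr=0.95, fpr=0.05)
tenfold = f1_for_benign_size(1000, 10000, tpr=0.95, fpr=0.05)

print(f"F1 with 1x benign samples:  {base:.3f}")
print(f"F1 with 10x benign samples: {tenfold:.3f}")
print(f"Decrease: {base - tenfold:.3f}")
```

Under these assumptions, recall is unchanged, but the number of false positives scales with the benign sample count, so precision and therefore F1 decrease.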
Keywords: Malware, Machine learning, Binary classification, Benign sample, Random forest, Support vector machine, XGBoost
Review history: Received 25 November 2021, Revised 22 June 2022, Accepted 17 August 2022, Available online 27 August 2022, Version of Record 2 September 2022.
Official paper URL: https://doi.org/10.1016/j.eswa.2022.118630