An experimental comparison of real and artificial deception using a deception generation model

Authors:

Abstract

Developing a data mining approach for a deception application can incur prohibitive data collection costs because both deceptive and truthful data must be collected. Artificially generated deception data can reduce these costs, but the impact of using such data is not well understood. To study the relationship between artificial and real deception, this paper presents an experimental comparison using a novel deception generation model. The deception and truth data were collected from financial aid applications, a document-centric area with limited resources for verification. The data collection yielded a unique data set containing truth, natural deception, and boosted deception. To simulate deception, the Application Deception Model was developed to generate artificial deception under different deception scenarios. To study differences between artificial and real deception, an experiment was performed with deception level and data generation method as factors and with directed distance and outlier score as outcome variables. The results provide evidence of a reasonable similarity between artificial and real deception, suggesting that artificially generated deception could be used to reduce the cost of obtaining training data.

Keywords: Deception, Deception detection, Noise model, Data generation model

Review history: Received 6 April 2011, Revised 23 March 2012, Accepted 29 April 2012, Available online 5 May 2012.

Official paper URL: https://doi.org/10.1016/j.dss.2012.04.009