Accurate and large-scale privacy-preserving data mining using the election paradigm
作者:
Highlights:
•
摘要
With the proliferation of the Web and ICT technologies there have been concerns about the handling and use of sensitive information by data mining systems. Recent research has focused on distributed environments where the participants in the system may also be mutually mistrustful. In this paper we discuss the design and security requirements for large-scale privacy-preserving data mining (PPDM) systems in a fully distributed setting, where each client possesses its own records of private data. To this end we argue in favor of using some well-known cryptographic primitives, borrowed from the literature on Internet elections. More specifically, our framework is based on the classical homomorphic election model, and particularly on an extension for supporting multi-candidate elections. We also review a recent scheme [Z. Yang, S. Zhong, R.N. Wright, Privacy-preserving classification of customer data without loss of accuracy, in: SDM’ 2005 SIAM International Conference on Data Mining, 2005] which was the first scheme that used the homomorphic encryption primitive for PPDM in the fully distributed setting. Finally, we show how our approach can be used as a building block to obtain Random Forests classification with enhanced prediction performance.
论文关键词:Security and privacy,Distributed data mining,Homomorphic encryption,Random Forests classification
论文评审过程:Available online 23 June 2009.
论文官网地址:https://doi.org/10.1016/j.datak.2009.06.003