Practical protocol for Yao’s millionaires problem enables secure multi-party computation of metrics and efficient privacy-preserving k-NN for large data sets

Authors: Artak Amirbekyan, Vladimir Estivill-Castro

Abstract

Finding the nearest k objects to a query object is a fundamental operation for many data mining algorithms. With the recent interest in privacy, it is not surprising that there is strong interest in k-NN queries to enable clustering, classification and outlier-detection tasks. However, previous approaches to privacy-preserving k-NN have been costly and can only be realistically applied to small data sets. In this paper, we provide efficient solutions for k-NN queries for vertically partitioned data. We provide the first solution for the L∞ (or Chessboard) metric as well as detailed privacy-preserving computation of all other Minkowski metrics. We enable privacy-preserving L∞ by providing a practical approach to Yao’s millionaires problem with more than two parties. This is based on a pragmatic and implementable solution to Yao’s millionaires problem with shares. We also provide privacy-preserving algorithms for combinations of local metrics into a global metric that handles the large dimensionality and diversity of attributes common in vertically partitioned data. To manage very large data sets, we provide a privacy-preserving SASH (a very successful data structure for associative queries in high dimensions). Besides providing a theoretical analysis, we illustrate the efficiency of our approach with an empirical evaluation.
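
As a rough illustration of why the L∞ metric is singled out for vertically partitioned data, the sketch below shows how a Minkowski distance decomposes into per-party partial results. It is not the authors' protocol and offers no privacy: the partial results are combined in the clear. All function names and example values are hypothetical. For finite p the parties' partial power sums are added together, while for L∞ the global distance is the maximum of the parties' local maxima, which is the comparison step that the abstract ties to Yao’s millionaires problem.

```python
from typing import Sequence


def local_power_sum(x: Sequence[float], q: Sequence[float], p: int) -> float:
    """Sum of |x_i - q_i|**p over the attributes held by one party."""
    return sum(abs(a - b) ** p for a, b in zip(x, q))


def local_max(x: Sequence[float], q: Sequence[float]) -> float:
    """Largest |x_i - q_i| over the attributes held by one party."""
    return max(abs(a - b) for a, b in zip(x, q))


def minkowski_from_parts(partials: Sequence[float], p: int) -> float:
    """Combine per-party power sums into the global L_p distance."""
    return sum(partials) ** (1.0 / p)


def chebyshev_from_parts(partials: Sequence[float]) -> float:
    """Combine per-party maxima into the global L-infinity distance."""
    return max(partials)


# Hypothetical example: party A holds two attributes, party B holds one.
record_a, query_a = [1.0, 2.0], [4.0, 6.0]   # differences 3 and 4
record_b, query_b = [5.0], [17.0]            # difference 12

# Insecure stand-in: partial results are shared in the clear here.
l2 = minkowski_from_parts(
    [local_power_sum(record_a, query_a, 2), local_power_sum(record_b, query_b, 2)],
    2,
)
linf = chebyshev_from_parts(
    [local_max(record_a, query_a), local_max(record_b, query_b)]
)
```

In a privacy-preserving setting, neither the partial sums nor the local maxima would be revealed to the other party; the sketch only shows which values have to be combined and why the L∞ case reduces to a comparison.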

Keywords: Privacy-preserving data mining, Secure multi-party computation, Nearest-neighbour classification, Yao’s millionaires problem

Official paper URL: https://doi.org/10.1007/s10115-009-0233-z