Privacy-preserving imputation of missing data

Authors:

Abstract

Handling missing data is a critical step in ensuring good results in data mining. Like most data mining algorithms, existing privacy-preserving data mining algorithms assume that the data is complete. To maintain privacy throughout the data mining process, including data cleaning, privacy-preserving methods of data cleaning are required. In this paper, we address the problem of privacy-preserving imputation of missing data. We present a privacy-preserving protocol for filling in missing values using a lazy decision-tree imputation algorithm for data that is horizontally partitioned between two parties. The participants in the protocol learn only the imputed values; neither party learns the computed decision tree.
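To make the idea of lazy decision-tree imputation concrete, the following is a minimal plaintext sketch, not the protocol from the paper. It assumes categorical attributes and information gain as the split criterion, and it simply pools both parties' rows, so it offers no privacy at all; the privacy-preserving protocol would compute the same kind of result jointly without revealing the rows or the tree. All names (`lazy_impute`, `party_a`, `party_b`, the attribute names) are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def lazy_impute(rows, instance, target):
    """Impute instance[target] by building only the tree path that
    this particular instance would follow (a "lazy" decision tree)."""
    # Train only on rows where the target attribute is known.
    data = [r for r in rows if r.get(target) is not None]
    # Attributes whose values are known for the instance being imputed.
    attrs = [a for a in instance if a != target and instance[a] is not None]

    while data and attrs:
        labels = [r[target] for r in data]
        if len(set(labels)) == 1:
            break  # the current node is pure; no further split is needed

        def gain(attr):
            # Information gain of splitting the current rows on attr.
            groups = {}
            for r in data:
                groups.setdefault(r.get(attr), []).append(r[target])
            remainder = sum(len(g) / len(data) * entropy(g)
                            for g in groups.values())
            return entropy(labels) - remainder

        best = max(attrs, key=gain)
        # Follow only the branch that matches the instance's own value.
        subset = [r for r in data if r.get(best) == instance[best]]
        if not subset:
            break  # no matching rows; fall back to the current node
        data = subset
        attrs.remove(best)

    labels = [r[target] for r in data]
    return Counter(labels).most_common(1)[0][0] if labels else None


# Horizontal partitioning: each party holds different rows, same attributes.
party_a = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
]
party_b = [
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
]
instance = {"outlook": "sunny", "windy": "yes", "play": None}

# Insecure baseline: pool the rows in the clear and impute the missing value.
imputed = lazy_impute(party_a + party_b, instance, "play")
```

Because only the branch matching the instance is ever expanded, the full tree is never materialized, which fits the stated goal that the parties learn the imputed value and not the tree.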

Keywords: Data cleaning, Data imputation, Privacy-preserving protocols

Review history: Received 5 June 2007, Accepted 5 June 2007, Available online 18 July 2007.

Official paper URL: https://doi.org/10.1016/j.datak.2007.06.013