Arbitrarily distributed data-based recommendations with privacy

Authors:

Highlights:

Abstract

Collaborative filtering (CF) systems use customers' preferences about various products to offer recommendations. Providing accurate and reliable predictions is vital for both e-commerce companies and their customers. To offer such referrals, CF systems need sufficient data. When the data collected for CF purposes are held by a central server, providing recommendations is straightforward. However, customers' preferences, represented as ratings, might be partitioned between two vendors. To supply trustworthy and correct predictions, such companies might wish to collaborate. Due to privacy concerns, financial fears, and legal issues, however, the parties may not want to disclose their data to each other. In this study, we examine how to estimate item-based predictions on arbitrarily distributed data (ADD) between two e-commerce sites without seriously jeopardizing their privacy. We analyze our proposed scheme in terms of privacy and demonstrate that the method does not severely violate data owners' confidentiality. We conduct experiments using real data sets to show how the coverage and quality of the predictions improve due to collaboration. We also investigate our scheme in terms of online performance and demonstrate that the supplementary online costs caused by privacy measures are negligible. Moreover, we perform trials to show how privacy concerns affect accuracy. Our results show that accuracy and coverage improve due to collaboration, and that the proposed scheme is still able to offer accurate predictions under privacy concerns.

Keywords: Privacy, Data mining, Arbitrarily distributed data, Collaborative filtering, Accuracy

Review history: Received 5 January 2011, Revised 3 November 2011, Accepted 4 November 2011, Available online 12 November 2011.

Official paper URL: https://doi.org/10.1016/j.datak.2011.11.002