Accurate privacy-preserving record linkage for databases with missing values

Authors:

Highlights:

• Privacy-preserving record linkage (PPRL) aims to link sensitive data across databases.

• The PPRL process can be challenged by missing data, leading to poor linkage quality.

• We propose a novel Bloom filter based PPRL approach to improve linkage quality.

• We develop two methods to group encoded records based on their missingness patterns.

• The methods have trade-offs between the number of communication and comparison steps.

• We compare our methods to existing PPRL approaches in terms of quality and privacy.

• Experiments on real databases show our methods outperform existing approaches.
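To make the highlights above more concrete, the following is a minimal sketch of Bloom filter encoding and of grouping records by missingness pattern. It is an illustration under stated assumptions, not the paper's protocol: the helper names (`qgrams`, `bloom_encode`, `missingness_pattern`, `group_by_pattern`, `dice`), the parameters (`num_bits`, `num_hashes`, bigram tokens), the placeholder key, and the choice of HMAC-SHA256 and Dice similarity are all hypothetical choices made here for the example.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical placeholder key, assumed to be agreed on by the database owners.
DEMO_KEY = b"shared-secret"


def qgrams(value, q=2):
    """Split a string into its set of overlapping character q-grams."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_encode(record, fields, key=DEMO_KEY, num_bits=256, num_hashes=4):
    """Return the set of bit positions set by the non-missing fields."""
    bits = set()
    for field in fields:
        value = record.get(field)
        if not value:  # None or "" is treated as a missing value
            continue
        for gram in qgrams(value):
            for k in range(num_hashes):
                msg = f"{k}|{field}|{gram}".encode()
                digest = hmac.new(key, msg, hashlib.sha256).hexdigest()
                bits.add(int(digest, 16) % num_bits)
    return bits


def missingness_pattern(record, fields):
    """Tuple of booleans: True where the field has a value."""
    return tuple(bool(record.get(f)) for f in fields)


def group_by_pattern(records, fields):
    """Group records that share the same missingness pattern."""
    groups = defaultdict(list)
    for rec in records:
        groups[missingness_pattern(rec, fields)].append(rec)
    return dict(groups)


def dice(bits_a, bits_b):
    """Dice coefficient between two sets of bit positions."""
    if not bits_a and not bits_b:
        return 0.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))
```

In this sketch, two records can be compared fairly only when their encodings were built from the same fields, which is one plausible reason to group by missingness pattern before comparison. How the paper's two grouping methods actually balance communication and comparison steps is not described in the highlights and is not modeled here.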

Keywords: Missing data, Privacy, Entity resolution, Data linkage, Bloom filter encoding

Review history: Received 13 September 2020, Revised 8 November 2021, Accepted 22 November 2021, Available online 16 December 2021, Version of Record 13 January 2022.

Official paper URL: https://doi.org/10.1016/j.is.2021.101959