
Privacy-Preserving Record Linkage Method Based on Variable-Length Coding and Sliding Window

Privacy-preserving record linkage (PPRL) efficiently identifies records that correspond to the same entity across different databases without revealing the sensitive or confidential information of the entities those records represent. The Bloom filter (BF) is widely used in PPRL: it encodes the sensitive information in records and uses character q-grams for approximate matching. However, BF encoding is vulnerable to cryptanalysis attacks, and because it is insensitive to q-gram positions, it yields low record-matching precision. This study proposes a PPRL method based on variable-length coding and sliding windows. Its variable-length record encoding makes records position-sensitive and, by adding random bit arrays before and after the effective bits, hides the frequency information of the entities' bit arrays, which effectively defends against frequency attacks. In addition, a sliding-window record linkage scheme is designed: a fast filter first discards a large number of non-matching records, and a bidirectional sliding-window exact-matching strategy then matches the remaining records, improving the matching efficiency of privacy-preserving records. Experimental results on public datasets show that, compared with the BF method, the proposed method is approximately 100 times faster at encoding, achieves higher matching accuracy, and provides stronger security in cross-database PPRL.
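
To make the baseline concrete, the following is a minimal sketch of the conventional BF encoding that the abstract compares against, not the proposed variable-length method. The function names, the filter length `m`, the number of hash functions `k`, and the use of unkeyed SHA-256 are all illustrative assumptions; practical PPRL systems typically use keyed hashes. The sketch also shows the position insensitivity mentioned above: two strings with the same q-gram set receive the same filter.

```python
import hashlib


def qgrams(s, q=2):
    """Return the set of character q-grams of s (no padding)."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def bloom_encode(s, m=64, k=2, q=2):
    """Encode s as an m-bit Bloom filter, setting up to k bits per q-gram.

    Assumption: unkeyed SHA-256 with a seed prefix stands in for the
    k hash functions; this is for illustration only.
    """
    bits = [0] * m
    for gram in qgrams(s, q):
        for seed in range(k):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            bits[int.from_bytes(digest[:8], "big") % m] = 1
    return bits


def dice(a, b):
    """Dice coefficient of two equal-length bit lists."""
    common = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2 * common / total if total else 0.0


# "abab" and "baba" share the q-gram set {"ab", "ba"}, so their
# filters are identical even though the strings differ.
same = bloom_encode("abab") == bloom_encode("baba")
# Similar names share some q-grams and therefore some set bits.
score = dice(bloom_encode("smith"), bloom_encode("smyth"))
```

Because the filter depends only on the set of q-grams, `same` is true here, which is the kind of false match a position-sensitive encoding is meant to avoid.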

Keywords: Bloom Filter (BF); string comparison; privacy protection; record linkage; secure entity alignment

Ye Xiaodong (叶晓东), Zhao Yingying (赵迎迎), Sun Yongqi (孙永奇), Zhao Sicong (赵思聪), Liu Zhen (刘真)


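School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044

Key Laboratory of Big Data and Artificial Intelligence in Transportation, Ministry of Education, Beijing 100044

Beijing Aerospace Chenxin Technology Co., Ltd. (北京航天晨信科技有限责任公司), Beijing 102308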


Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0113002)

2024

Computer Engineering (计算机工程)
East China Institute of Computing Technology; Shanghai Computer Society


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.581
ISSN: 1000-3428
Year, volume (issue): 2024, 50(2)