The cost of quality: Implementing generalization and suppression for anonymizing biomedical data with minimal information loss
NSTL
Elsevier
Objective: With the ARX data anonymization tool, structured biomedical data can be de-identified using syntactic privacy models such as k-anonymity. Data is transformed with two methods: (a) generalization of attribute values, followed by (b) suppression of data records. The former method results in data that is well suited for analyses by epidemiologists, while the latter method significantly reduces loss of information. Our tool uses an optimal anonymization algorithm that maximizes output utility according to a given measure. To achieve scalability, existing optimal anonymization algorithms exclude parts of the search space by predicting the outcome of data transformations regarding privacy and utility without explicitly applying them to the input dataset. These optimizations cannot be used if data is transformed with both generalization and suppression. Because optimal data utility and scalability are both important for anonymizing biomedical data, we had to develop a novel method.
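The two-step transformation described above (generalization, then suppression of records that still violate k-anonymity) can be illustrated with a minimal Python sketch. This is not ARX's implementation or API: the function names `generalize` and `anonymize`, the toy generalization rules for age and ZIP code, and the sample records are all assumptions made for illustration.

```python
from collections import Counter


def generalize(record, level):
    """Coarsen the quasi-identifiers of one (age, zip_code) record.

    Hypothetical rules: age is binned into ranges of width 10 * level,
    and the last `level` digits of the ZIP code are masked with '*'.
    Level 0 leaves the record unchanged.
    """
    age, zip_code = record
    if level == 0:
        return (str(age), zip_code)
    width = 10 * level
    low = (age // width) * width
    masked_zip = zip_code[: len(zip_code) - level] + "*" * level
    return (f"{low}-{low + width - 1}", masked_zip)


def anonymize(records, level, k):
    """Generalize all records, then suppress those in classes smaller than k.

    Returns the surviving generalized records and the number of
    suppressed records, which serves here as a crude information-loss count.
    """
    generalized = [generalize(r, level) for r in records]
    class_sizes = Counter(generalized)
    kept = [g for g in generalized if class_sizes[g] >= k]
    return kept, len(generalized) - len(kept)


# Toy input: (age, ZIP code) pairs, invented for this example.
records = [
    (34, "81667"), (36, "81675"), (38, "81669"),
    (52, "80331"), (57, "80335"), (71, "10115"),
]

for level in (0, 1, 2):
    kept, suppressed = anonymize(records, level=level, k=2)
    print(f"level={level}: kept={len(kept)}, suppressed={suppressed}")
```

In this toy dataset, a low generalization level keeps values precise but forces more records to be suppressed, while a higher level coarsens every value but lets more records survive. An optimal algorithm of the kind the abstract describes would search over such transformations for the one that maximizes utility under a chosen measure; the sketch only evaluates each candidate by applying it directly.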