The cost of quality: Implementing generalization and suppression for anonymizing biomedical data with minimal information loss

NSTL
Elsevier

Abstract: Objective: With the ARX data anonymization tool, structured biomedical data can be de-identified using syntactic privacy models, such as k-anonymity. Data is transformed with two methods: (a) generalization of attribute values, followed by (b) suppression of data records. The former method results in data that is well suited for analyses by epidemiologists, while the latter method significantly reduces loss of information. Our tool uses an optimal anonymization algorithm that maximizes output utility according to a given measure. To achieve scalability, existing optimal anonymization algorithms exclude parts of the search space by predicting the outcome of data transformations regarding privacy and utility without explicitly applying them to the input dataset. These optimizations cannot be used if data is transformed with generalization and suppression. As optimal data utility and scalability are important for anonymizing biomedical data, we had to develop a novel method.
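To make the two transformation steps in the abstract concrete, here is a minimal Python sketch of generalization followed by record suppression under k-anonymity. It is an illustration only: ARX is a Java tool, and this is neither its API nor its optimal search algorithm. The function names, the generalization hierarchies for age and ZIP code, and the sample records are all assumptions invented for this example.

```python
from collections import Counter


def generalize_age(age, level):
    """Hypothetical hierarchy: 0 = exact age, 1 = 10-year band, 2 = fully generalized."""
    if level == 0:
        return str(age)
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"
    return "*"


def generalize_zip(zip_code, level):
    """Hypothetical hierarchy: mask the last `level` characters with '*'."""
    level = min(level, len(zip_code))
    return zip_code[: len(zip_code) - level] + "*" * level


def anonymize(records, age_level, zip_level, k):
    """Step (a): generalize every record. Step (b): suppress records whose
    equivalence class (identical generalized values) has fewer than k members."""
    generalized = [
        (generalize_age(age, age_level), generalize_zip(zip_code, zip_level))
        for age, zip_code in records
    ]
    class_sizes = Counter(generalized)
    kept = [row for row in generalized if class_sizes[row] >= k]
    suppressed = len(generalized) - len(kept)
    return kept, suppressed


def is_k_anonymous(rows, k):
    """True if every combination of values occurs at least k times."""
    return all(count >= k for count in Counter(rows).values())


# Made-up sample data: (age, ZIP code) pairs acting as quasi-identifiers.
records = [
    (34, "81667"), (36, "81675"), (38, "81669"),
    (52, "81925"), (57, "81931"), (71, "80331"),
]

kept, suppressed = anonymize(records, age_level=1, zip_level=2, k=2)
print(kept, suppressed, is_k_anonymous(kept, 2))
```

With these sample records, moderate generalization plus suppression of the single outlier record yields a 2-anonymous result. Without suppression, the same data would need both attributes generalized far more coarsely before the outlier shares an equivalence class with another record, which is the information-loss trade-off the abstract refers to.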

Keywords:

Security; Privacy; De-identification; Anonymization; Statistical disclosure control; Optimization

Publication year:

2015

DOI:

10.1016/j.jbi.2015.09.007

Journal of Biomedical Informatics

ISSN: 1532-0464

Year, volume (issue): 2015, 58

Times cited: 6
References: 35