The cost of quality: Implementing generalization and suppression for anonymizing biomedical data with minimal information loss

Objective: With the ARX data anonymization tool, structured biomedical data can be de-identified using syntactic privacy models such as k-anonymity. Data is transformed with two methods: (a) generalization of attribute values, followed by (b) suppression of data records. Generalization yields data that is well suited for analyses by epidemiologists, while suppression significantly reduces the loss of information. Our tool uses an optimal anonymization algorithm that maximizes output utility according to a given measure. To achieve scalability, existing optimal anonymization algorithms exclude parts of the search space by predicting the outcome of data transformations with respect to privacy and utility, without explicitly applying them to the input dataset. These optimizations cannot be used when data is transformed with generalization and suppression. Because optimal data utility and scalability are both important for anonymizing biomedical data, we had to develop a novel method.
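
To make the two transformation steps concrete, here is a minimal Python sketch of generalization followed by record suppression under k-anonymity. It is an illustration only and does not use ARX's API or its search algorithm: the helper names (generalize_age, generalize_zip, anonymize), the generalization levels, and the sample records are all hypothetical assumptions.

```python
from collections import Counter

def generalize_age(age, level):
    """Hypothetical hierarchy: level 0 keeps the value, 1 maps to a decade, 2+ maps to '*'."""
    if level == 0:
        return str(age)
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"
    return "*"

def generalize_zip(zip_code, level):
    """Hypothetical hierarchy: mask the last `level` characters with '*'."""
    keep = max(len(zip_code) - level, 0)
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def anonymize(records, age_level, zip_level, k):
    """Generalize every record, then suppress records in classes smaller than k."""
    # Step (a): generalize the quasi-identifiers of each record.
    generalized = [
        (generalize_age(age, age_level), generalize_zip(zip_code, zip_level))
        for age, zip_code in records
    ]
    # Step (b): count equivalence classes and drop records that violate k-anonymity.
    class_sizes = Counter(generalized)
    kept = [record for record in generalized if class_sizes[record] >= k]
    suppressed = len(generalized) - len(kept)
    return kept, suppressed

# Made-up sample data: (age, ZIP code) pairs.
records = [
    (34, "81675"), (36, "81679"), (38, "81671"),
    (52, "81925"), (57, "81931"), (71, "80331"),
]
kept, suppressed = anonymize(records, age_level=1, zip_level=2, k=2)
print(kept, suppressed)
```

In this sketch a single outlier record is suppressed instead of forcing every record to a coarser generalization level, which is the intuition behind the claim that suppression reduces information loss. It also hints at why the outcome is hard to predict without applying the transformation: the number of suppressed records depends on the class sizes that a given combination of levels produces.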

Keywords: Security; Privacy; De-identification; Anonymization; Statistical disclosure control; Optimization

2015

Journal of biomedical informatics.

ISSN:1532-0464
Year, volume (issue): 2015, 58