Database Access Control & Privacy: Is There A Common Ground?
![Page 1: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/1.jpg)
Database Access Control & Privacy: Is There A Common Ground?
Surajit Chaudhuri, Raghav Kaushik and Ravi Ramamurthy
Microsoft Research
![Page 2: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/2.jpg)
Data Privacy
Databases Have Sensitive Information
  Health care database: Patient PII, Disease information
  Sales database: Customer PII
  Employee database: Employee level, salary
Data analysis carries the risk of privacy breach [FTDB 2009]
  Latanya Sweeney’s identification of the governor of MA from medical records
  AOL search logs
  Netflix prize dataset
Focus of this paper:
  What is the implication of data privacy concerns on the DBMS?
  Do we need any more than access control?
![Page 3: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/3.jpg)
Data Publishing

Patients [FTDB2009]

| Name | Age | Gender | Zipcode | Disease |
|---|---|---|---|---|
| Ann | 28 | F | 13068 | Heart disease |
| Bob | 21 | M | 13068 | Flu |
| Carol | 24 | F | 13068 | Viral disease |
| … | … | … | … | … |

Patients-Anonymized

| Age | Gender | Zipcode | Disease |
|---|---|---|---|
| [20-29] | F | 1**** | Heart disease |
| [20-29] | M | 1**** | Flu |
| [20-29] | F | 1**** | Viral disease |
| … | … | … | … |

Queries: Q1 ... Qn

K-Anonymity, L-Diversity, T-Closeness
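The generalization from Patients to Patients-Anonymized can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `generalize` and `k_of` are hypothetical, ages are bucketed by decade, and only the first zipcode digit is kept, which matches the three sample rows but may not be the exact scheme behind the slide.

```python
from collections import Counter

# Rows from the Patients example: (name, age, gender, zipcode, disease)
patients = [
    ("Ann", 28, "F", "13068", "Heart disease"),
    ("Bob", 21, "M", "13068", "Flu"),
    ("Carol", 24, "F", "13068", "Viral disease"),
]

def generalize(row):
    """Drop the name, bucket age by decade, keep only the first zipcode digit."""
    _name, age, gender, zipcode, disease = row
    low = (age // 10) * 10
    age_range = f"[{low}-{low + 9}]"
    masked_zip = zipcode[0] + "*" * (len(zipcode) - 1)
    return (age_range, gender, masked_zip, disease)

def k_of(rows):
    """Size of the smallest group sharing the quasi-identifiers (age, gender, zipcode)."""
    groups = Counter(r[:3] for r in rows)
    return min(groups.values())

anonymized = [generalize(r) for r in patients]
k = k_of(anonymized)
```

On these three rows alone `k` is 1, because Bob is the only male in the [20-29] bucket; a real release would need the elided rows, or coarser generalization, to reach a useful k.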
![Page 4: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/4.jpg)
Privacy-Aware Query Answering

Patients [FTDB2009]

| Name | Age | Gender | Zipcode | Disease |
|---|---|---|---|---|
| Ann | 28 | F | 13068 | Heart disease |
| Bob | 21 | M | 13068 | Flu |
| Carol | 24 | F | 13068 | Viral disease |
| … | … | … | … | … |

Patients-Anonymized

| Age | Gender | Zipcode | Disease |
|---|---|---|---|
| [20-29] | F | 1**** | Heart disease |
| [20-29] | M | 1**** | Flu |
| [20-29] | F | 1**** | Viral disease |
| … | … | … | … |

Queries: Q1 ... Qn

Differential Privacy, Privacy-Preserving OLAP
![Page 5: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/5.jpg)
Data Publishing Vs Query Answering
Jury is still out
Data Publishing
  No impact on DBMS
  De-identification algorithms over published data are getting increasingly sophisticated
Need to take a hard look at the query answering paradigm
  Potential implications for DBMS
  “An interactive, query-based approach is generally superior from the privacy perspective to the “release-and-forget” approach” [CACM’10]
![Page 6: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/6.jpg)
Is “Privacy-Aware” = (Fine-Grained) Access Control (FGA)?
Every user is allowed to view only subset of data (authorization view)
  Subset defined using a predicate
Queries are (logically) rewritten to go against subset

Select *
From Patients
Where Patients.Physician = userID()
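A minimal sketch of this kind of rewriting, using Python's built-in `sqlite3` module. Everything beyond the slide's query is an assumption made for illustration: the sample rows other than Ann's, the `rewrite` and `run_as` helpers, and the naive textual substitution, which stands in for the logical rewriting a real DBMS would do. SQLite has no `userID()` function, so one is registered as an application-defined function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patients (Name TEXT, Disease TEXT, Drug TEXT, Physician TEXT)")
conn.executemany(
    "INSERT INTO Patients VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Heart disease", "Lipitor", "Grey"),
        ("Bob", "Flu", "Tamiflu", "Stevens"),      # made-up row
        ("Carol", "Viral disease", "Tamiflu", "Grey"),  # made-up row
    ],
)

# Stand-in for the session user; userID() reads it at query time.
current_user = {"id": "Grey"}
conn.create_function("userID", 0, lambda: current_user["id"])

# Authorization view per protected table, defined by a predicate.
AUTH_VIEWS = {"Patients": "(SELECT * FROM Patients WHERE Physician = userID())"}

def rewrite(sql):
    """Naive textual rewrite: point each protected table at its authorization view."""
    for table, view in AUTH_VIEWS.items():
        sql = sql.replace(f"FROM {table}", f"FROM {view} AS {table}")
    return sql

def run_as(user, sql):
    """Run the rewritten query on behalf of `user`."""
    current_user["id"] = user
    return conn.execute(rewrite(sql)).fetchall()

grey_rows = run_as("Grey", "SELECT Name FROM Patients ORDER BY Name")
yang_rows = run_as("Yang", "SELECT Name FROM Patients ORDER BY Name")
```

Grey's query returns only Grey's patients, while Yang, who has no patients in this sample, gets an empty result from the same query text.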
![Page 7: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/7.jpg)
Is “Privacy-Aware” = (Fine-Grained) Access Control (FGA)?
Every user is allowed to view only subset of data (authorization view)
  Subset defined using a predicate
Queries are (logically) rewritten to go against subset

Select Drug, count(*)
From Patients right outer join Drugs on Drug
Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug) > 3
Group by Drug

is rewritten to:

Select Drug, count(*)
From Patients right outer join Drugs on Drug
Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug and auth(Side-Effects)) > 3
  and auth(Patients) and auth(Drugs)
Group by Drug
![Page 8: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/8.jpg)
Authorization is “Black and White”
Query: Count the number of cancer patients
Utility vs. Privacy (figure):
  Grant access to cancer patients (Return accurate count)
  Deny access to cancer patients
![Page 9: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/9.jpg)
Beyond “Black and White”: Differential Privacy [SIGMOD09]
Count the number of cancer patients
Perturb the output of aggregate computation (Requires no change in execution engine)
Need to set parameters ε, Budget
Baggage
  Non-deterministic
  Per-query privacy parameter
  Overall privacy budget
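Perturbing the output of a count can be sketched as follows. The slide does not name the noise distribution; this sketch assumes the widely used Laplace mechanism, and the helper names `laplace_noise` and `noisy_count` are hypothetical. It is not a vetted privacy implementation.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng):
    """Laplace mechanism for a count (sensitivity 1): smaller epsilon means more noise."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random()      # unseeded, so answers differ from run to run
cancer_patients = 2        # exact answer to the count query (made-up value)
first = noisy_count(cancer_patients, epsilon=0.5, rng=rng)
second = noisy_count(cancer_patients, epsilon=0.5, rng=rng)
```

The two answers to the same query differ, which is the "non-deterministic" baggage; the per-query parameter ε controls how far each answer tends to stray from the exact count.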
![Page 10: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/10.jpg)
Seeking Common Ground
Access Control
  Supports full generality of SQL
  “Black and White”
Differential Privacy Algorithms
  A principled way to go beyond “black and white”
  Known mechanisms do not support full generality of SQL
Data analysis involves aggregation but also joins, sub-queries
Can we get the best of both worlds?
  Differential Privacy = Computation on unauthorized data
  What is the implication on privacy guarantees?
![Page 11: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/11.jpg)
What Does “Best of Both Worlds” Look Like?
FGA Policy:
Each physician can see:
  Records of their patients
Analyst can see:
  Records of drugs manufactured by their employer
  No patient records

Patients

| Name | Disease | Drug | Physician |
|---|---|---|---|
| Ann | Heart disease | Lipitor | Grey |
| … | … | … | … |

Drugs

| Drug | Company |
|---|---|
| Lipitor | Pfizer |
| … | … |

Side-Effects

| Drug | Side-Effect |
|---|---|
| Lipitor | Muscle |
| Lipitor | Liver |
| … | … |

Analysts

| Name | Employer |
|---|---|
| JoeAnalyst | Pfizer |
| JaneAnalyst | Merck |
| … | … |
![Page 12: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/12.jpg)
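FGA

Patients

| Name | Disease | Drug | Physician |
|---|---|---|---|
| … | … | … | Grey |
| … | … | … | Grey |
| … | … | … | Stevens |
| … | … | … | Stevens |
| … | … | … | Yang |

User = Grey

Select *
From Patients

is rewritten to:

Select *
From Patients
Where Physician = userID()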
![Page 13: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/13.jpg)
Differential Privacy

Patients

| Name | Disease | Drug | Physician |
|---|---|---|---|
| … | Heart Disease | … | … |
| … | Flu | … | … |
| … | Cancer | … | … |
| … | Cancer | … | … |
| … | AIDS | … | … |

User = JaneAnalyst

Select count(*)
From Patients
Where Disease = 'Cancer'

is rewritten to:

Select count(*) + Noise
From Patients
Where Disease = 'Cancer'
![Page 14: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/14.jpg)
Mix And Match: FGA + Differential Privacy
For each drug with more than 3 side-effects, count the number of patients who have been prescribed it

Select Drug, count(*)
From Patients right outer join Drugs on Drug
Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug) > 3
Group by Drug

Tables: Patients(Name, Disease, Drug, Physician); Drugs(Drug, Company), e.g. (Lipitor, Pfizer); Side-Effects(Drug, Side-Effect), e.g. (Lipitor, Muscle), (Lipitor, Liver); Analysts(Name, Employer), e.g. (JoeAnalyst, Pfizer), (JaneAnalyst, Merck)
![Page 15: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/15.jpg)
Architecture That Will Fail To Mix And Match
[Figure: DBMS with an Execution Engine and an Authorization Subsystem (labels: Q, Policy, Results) and a Differential Privacy API (labels: AggQ, Result(AggQ), Result(AggQ) + Noise)]
![Page 16: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/16.jpg)
Architecture That Will Fail To Mix And Match
[Figure: DBMS with an Execution Engine and an Authorization Subsystem (labels: Q, Policy, Results), a Differential Privacy API (labels: AggQ, Result(AggQ), Result(AggQ) + Noise), and a Wrapper]
![Page 17: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/17.jpg)
Authorization-Aware Data Privacy
[Figure: DBMS with an Execution Engine and an Authorization Aware Privacy Subsystem (labels: Q, Policy, Results)]
![Page 18: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/18.jpg)
Query Rewriting

Select Drug, count(*)
From Patients right outer join Drugs on Drug
Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug) > 3
Group by Drug

Tables: Patients(Name, Disease, Drug, Physician); Drugs(Drug, Company), e.g. (Lipitor, Pfizer); Side-Effects(Drug, Side-Effect), e.g. (Lipitor, Muscle), (Lipitor, Liver); Analysts(Name, Employer), e.g. (JoeAnalyst, Pfizer), (JaneAnalyst, Merck)

Non-aggregation: Authorization
What about aggregation?
![Page 19: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/19.jpg)
Query Rewriting

Select Drug, count(*)
From Patients right outer join Drugs on Drug
Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug) > 3
Group by Drug

Tables: Patients(Name, Disease, Drug, Physician); Drugs(Drug, Company), e.g. (Lipitor, Pfizer); Side-Effects(Drug, Side-Effect), e.g. (Lipitor, Muscle), (Lipitor, Liver); Analysts(Name, Employer), e.g. (JoeAnalyst, Pfizer), (JaneAnalyst, Merck)
![Page 20: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/20.jpg)
Query Rewriting

Select Drug, count(*)
From Patients right outer join Drugs on Drug
Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug and auth(Side-Effects)) > 3
  and auth(Patients) and auth(Drugs)
Group by Drug

Tables: Patients(Name, Disease, Drug, Physician); Drugs(Drug, Company), e.g. (Lipitor, Pfizer); Side-Effects(Drug, Side-Effect), e.g. (Lipitor, Muscle), (Lipitor, Liver); Analysts(Name, Employer), e.g. (JoeAnalyst, Pfizer), (JaneAnalyst, Merck)

Authorized Groups
For each authorized group, find noisy count
![Page 21: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/21.jpg)
Query Rewriting

Select Drug, count(*)
From Patients right outer join Drugs on Drug
Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug and auth(Side-Effects)) > 3
  and auth(Patients) and auth(Drugs)
Group by Drug

Tables: Patients(Name, Disease, Drug, Physician); Drugs(Drug, Company), e.g. (Lipitor, Pfizer); Side-Effects(Drug, Side-Effect), e.g. (Lipitor, Muscle), (Lipitor, Liver); Analysts(Name, Employer), e.g. (JoeAnalyst, Pfizer), (JaneAnalyst, Merck)

Authorized Groups
For each authorized group, find:
  (1) Noisy count on unauthorized subset
  (2) Accurate count on authorized subset
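The two-part count per authorized group can be sketched as below. This is a hedged illustration, not the authors' rewriting: the sample rows, the `hybrid_counts` helper, the choice of Laplace noise, and the assumption that physician Grey is authorized for both drug groups are all made up for the example.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

# Made-up sample rows: (Name, Drug, Physician)
PATIENTS = [
    ("Ann", "Lipitor", "Grey"),
    ("Bob", "Lipitor", "Stevens"),
    ("Carol", "Lipitor", "Grey"),
    ("Dan", "Zocor", "Yang"),
]

def hybrid_counts(patients, authorized_groups, can_see, epsilon, rng):
    """For each authorized group return (accurate count on visible rows,
    noisy count on hidden rows)."""
    result = {}
    for drug in authorized_groups:
        rows = [p for p in patients if p[1] == drug]
        visible = sum(1 for p in rows if can_see(p))
        hidden = len(rows) - visible
        result[drug] = (visible, hidden + laplace_noise(1.0 / epsilon, rng))
    return result

# Physician Grey sees only their own patients.
counts = hybrid_counts(
    PATIENTS,
    authorized_groups=["Lipitor", "Zocor"],
    can_see=lambda p: p[2] == "Grey",
    epsilon=1.0,
    rng=random.Random(),
)
totals = {drug: exact + noisy for drug, (exact, noisy) in counts.items()}
```

Groups the user is not authorized for never appear in the output, and noise is applied only to the part of each count that comes from rows the user cannot see.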
![Page 22: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/22.jpg)
Class of Queries

Select Drug, count(*)
From Patients right outer join Drugs on Drug
Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug) > 3
Group by Drug

Annotated parts of the query: Foreign key join, Predicate, Grouping, Aggregation
Rewriting: Go to unauthorized data for final aggregation
Principled rewriting for arbitrary SQL: open problem
![Page 23: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/23.jpg)
Our Privacy Guarantee: Relative Differential Privacy
Differential Privacy Intuition:
  A computation is differentially private if its behavior is similar for any two databases D1 and D2 that differ in a single record
Relative Differential Privacy Intuition:
  A computation is differentially private relative to an authorization policy if its behavior is similar for any two databases D1 and D2 that differ in a single record and both result in the same authorization views
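One way to make the two intuitions precise is sketched below. The first inequality is the standard definition of ε-differential privacy. The second is an assumed formalization of the relative variant, since the slide gives only the intuition; the symbols M (the randomized computation), S (a set of outputs), P (the policy) and V_P (the authorization views a database yields under P) are introduced here for illustration.

```latex
% Differential privacy (standard definition), for a randomized computation M:
\Pr[M(D_1) \in S] \le e^{\varepsilon} \, \Pr[M(D_2) \in S]
\quad \text{for all } S \text{ and all } D_1, D_2 \text{ differing in a single record.}

% Relative to an authorization policy P, with V_P(D) the authorization views of D:
\Pr[M(D_1) \in S] \le e^{\varepsilon} \, \Pr[M(D_2) \in S]
\quad \text{for all } S \text{ and all such } D_1, D_2 \text{ with } V_P(D_1) = V_P(D_2).
```

The relative version quantifies over fewer database pairs, so it constrains only changes to records the user cannot already see through their authorization views.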
![Page 24: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/24.jpg)
Noisy View

Create noisy view DrugCounts(Drug, PatientCnt) as
(Select Drug, count(*)
 From Patients right outer join Drugs on Drug
 Where (Select count(*) From Side-Effects Where Drug = Drugs.Drug) > 3
 Group by Drug)

Named
Non-deterministic
Rewriting is authorization aware
Can be part of grant-revoke statements just like regular views
![Page 25: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/25.jpg)
Noisy View Examples

Select count(*)
From Patients
Where Disease = 'Cancer'

Select Disease, count(*)
From Patients
Group by Disease

Select Category, count(*)
From Patients join DiseaseCategory on Disease
Group by Category
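The second example can be approximated with an unmodified engine by running the grouped count as usual and perturbing each count on the way out. The sketch below uses Python's `sqlite3` with made-up rows; the `noisy_view` helper and the use of Laplace noise are assumptions, and the sketch ignores subtleties such as groups that are absent from the data.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patients (Name TEXT, Disease TEXT)")
conn.executemany(
    "INSERT INTO Patients VALUES (?, ?)",
    [("Ann", "Cancer"), ("Bob", "Flu"), ("Carol", "Cancer"),
     ("Dan", "Flu"), ("Eve", "Cancer")],   # made-up rows
)

def noisy_view(conn, sql, epsilon, rng):
    """Run a (group, count) query unchanged, then add Laplace(1/epsilon) noise per count."""
    out = []
    for group, count in conn.execute(sql):
        # Difference of two Exp(rate=epsilon) draws is Laplace with scale 1/epsilon.
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        out.append((group, count + noise))
    return out

rows = noisy_view(
    conn,
    "SELECT Disease, count(*) FROM Patients GROUP BY Disease ORDER BY Disease",
    epsilon=1.0,
    rng=random.Random(),
)
```

The SQL text is the same as in the example; only the result set is altered, which is in the spirit of a view that is named and non-deterministic.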
![Page 26: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/26.jpg)
Noisy View Architecture
[Figure: DBMS with an Execution Engine and an Authorization Aware Privacy Subsystem (labels: Q, Policy, Results) over Tables, Noisy Views and Views; annotations: “Enforce authorization”, “Rewrite as we saw before”]

Select Drug, Side-Effect, Cnt
From DrugCounts, Side-Effects
Where DrugCounts.Drug = Side-Effects.Drug
![Page 27: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/27.jpg)
Differential Privacy Parameters [SIGMOD09]
Need to set parameters ε, Budget
![Page 28: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/28.jpg)
Noisy View Architecture: Differential Privacy Parameters
[Figure: DBMS with an Execution Engine and an Authorization Aware Privacy Subsystem over Tables, Noisy Views and Views; inputs: (Q, ε) and Auth. Policy, Privacy Budget; output: Results]
Fall back to access control after budget exhausted
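The budget behaviour can be sketched as a small accountant that charges ε per query and, once the budget is spent, answers from authorized rows only. The class `BudgetedNoisyView`, its `count` method, and the simple additive accounting are assumptions for illustration; the slides do not specify how the budget is tracked.

```python
import random

class BudgetedNoisyView:
    """Hypothetical wrapper: spends epsilon per query, then falls back to access control."""

    def __init__(self, total_budget, rng=None):
        self.remaining = total_budget
        self.rng = rng or random.Random()

    def count(self, visible, hidden, epsilon):
        """`visible` and `hidden` are exact counts over authorized and unauthorized rows."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            # Budget exhausted: answer from authorized rows only.
            return visible
        self.remaining -= epsilon
        # Difference of two Exp(rate=epsilon) draws is Laplace with scale 1/epsilon.
        noise = self.rng.expovariate(epsilon) - self.rng.expovariate(epsilon)
        return visible + hidden + noise

view = BudgetedNoisyView(total_budget=1.0)
answers = [view.count(visible=2, hidden=5, epsilon=0.5) for _ in range(3)]
```

With a budget of 1.0 and ε = 0.5 per query, the first two answers are noisy counts over all seven rows, and the third is the exact count over the two authorized rows.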
![Page 29: Database Access Control & Privacy: Is There A Common Ground?](https://reader034.fdocuments.us/reader034/viewer/2022051118/568165b0550346895dd8a268/html5/thumbnails/29.jpg)
Conclusions and Future Work
Noisy view based architecture to incorporate privacy-preserving query answering with access control in a DBMS
  Based on differential privacy
  Needs minimal changes to engine
  Guarantee: Differential privacy relative to authorizations
  Baggage of differential privacy
    Non-deterministic
    Per-query privacy parameter
    Overall privacy budget
Open Issues
  Larger class of noisy views (can we support arbitrary SQL?)
  Benchmark the privacy-utility tradeoff for complex data analysis, e.g. TPC-H, TPC-DS.
  Query Optimization
  Integrating Access Control with other privacy models