Charity begins at home.
So does security.
Redundancy & Information Leakage in Fine-grained Access Control
Seminar presented for CS 632 by: Aditya Joshi (08305908), Subhajit Datta (09305051)
Course Instructor: Prof. S Sudarshan
Based on paper & slides by Govind Kabra (Univ of Illinois, Urbana-Champaign), Ravi Ramamurthy (Microsoft Research), S. Sudarshan (IIT Bombay)
3 / 53
Contents
• Introduction
• Redundancy Removal
• Information leakage
• Integrating RR & Safe plans
Conclusion & Future Work
4 / 53
Introduction
5 / 53
Redundancy & Information Leakage in Fine grained access control
• Conventional access control: table/column level
• Fine-grained access control: access control at a lower level of granularity
• grant select on employee(name) to public
• Views
6 / 53
Example
Roll_no  Courseid  Grade
1        101       AA
1        102       AB
2        101       AB
2        103       BC
3        101       AA
Users: a professor teaching course 101, and a student with roll number 1
Each issues: select * from grades;
7 / 53
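So what is query rewriting?
select * from grades; is answered as 'grades that I am allowed to see'
• Replace relation R in the query by RA
• RA is an authorization view
• Query Rewriting
User Query: select * from lineitem where shipmode='express'
[Figure: query plans before and after rewriting. The selection σ over L becomes a selection σ over authL, and authL is itself defined by selections over L and O.]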
8 / 53
Existing systems
• Oracle's Virtual Private Database (VPD): functions associated with each relation return strings of predicates
• LeFevre et al.: replace unauthorized values with null
9 / 53
Types of models
• Truman models: use query rewriting
• Non-Truman models: a query is valid if it can be rewritten using authorized views; invalid queries are rejected
10 / 53
Two Semantics
• The Truman Model = filter semantics
• The non-Truman model = deny semantics
Based on: www.cs.washington.edu/homes/suciu/current-trends.ppt
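[Figure: the Truman model (TM) takes query Q and produces a rewritten query Q'; the non-Truman model (nTM) takes query Q and accepts it only if it is valid]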
11 / 53
Query Rewriting
• Authorized Views
CREATE VIEW auth_Ri AS SELECT Li FROM Ri WHERE Pi
– Li contains expressions implementing cell-level access control
– Pi has the authorization predicates (may have sub-queries)
• Query using such views: each Ri becomes $R_i \ltimes_{\theta_i} A_i$, where $A_i$ is an expression containing the sub-queries in Pi
$R_1 \bowtie R_2 \bowtie \dots \bowtie R_n \;\Rightarrow\; (R_1 \ltimes_{\theta_1} A_1) \bowtie (R_2 \ltimes_{\theta_2} A_2) \bowtie \dots$
12 / 53
Query Rewriting
• Authorized Views
CREATE VIEW authGrades AS
SELECT * FROM GRADES g1
WHERE EXISTS (SELECT * FROM FACULTYCOURSES f1
              WHERE FACID = getFacID() AND g1.courseid = f1.courseid)
• Query using such views: each Ri becomes $R_i \ltimes_{\theta_i} A_i$, where $A_i$ is an expression containing the sub-queries in Pi
$R_1 \bowtie R_2 \bowtie \dots \bowtie R_n \;\Rightarrow\; (R_1 \ltimes_{\theta_1} A_1) \bowtie (R_2 \ltimes_{\theta_2} A_2) \bowtie \dots$
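A minimal sketch of Truman-style rewriting with the authGrades predicate, using Python's built-in sqlite3 module. The sample rows, the `session` dictionary and the `truman_rewrite` helper are assumptions made for illustration; the textual substitution stands in for a real rewriter, which would work on the parsed query rather than on the SQL string.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (roll_no INTEGER, courseid INTEGER, grade TEXT);
CREATE TABLE facultycourses (facid TEXT, courseid INTEGER);
INSERT INTO grades VALUES
  (1,101,'AA'),(1,102,'AB'),(2,101,'AB'),(2,103,'BC'),(3,101,'AA');
INSERT INTO facultycourses VALUES ('123',101),('456',102),('456',103);
""")

# Hypothetical session state: which faculty member is logged in.
session = {"facid": "123"}
conn.create_function("getFacID", 0, lambda: session["facid"])

# The body of the authGrades view, inlined as a sub-query.
AUTH_GRADES = """(SELECT * FROM grades g1
  WHERE EXISTS (SELECT * FROM facultycourses f1
                WHERE f1.facid = getFacID()
                  AND g1.courseid = f1.courseid))"""

def truman_rewrite(user_query):
    """Replace the relation 'grades' by its authorization view (naive, text-based)."""
    return re.sub(r"\bgrades\b", lambda m: AUTH_GRADES + " AS grades",
                  user_query, flags=re.IGNORECASE)

# Faculty 123 teaches course 101 and sees only those grades.
rows_123 = sorted(conn.execute(truman_rewrite("select * from grades")).fetchall())

# Faculty 456 teaches courses 102 and 103.
session["facid"] = "456"
rows_456 = sorted(conn.execute(truman_rewrite("select * from grades")).fetchall())
```

The same user query returns different rows for different users, which is the filter semantics of the Truman model.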
13 / 53
Redundancy & Information Leakage in Fine grained access control
• Redundancy
• Information Leakage
14 / 53
Redundancy Removal
15 / 53
Redundancy Removal
• Intuition: Most queries already access authorized data
• Will adding authorization views cause redundancy?
16 / 53
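Redundancy example
select * from grades G, facultycourses F
where G.courseid = F.courseid and F.facid = '123' and G.year = 2010
[Figure: three query plans. The user query joins σ(G) with σ(F); the rewritten query replaces G by authG; expanding authG gives σ(G) semi-joined with a second σ(F), which repeats the join already present in the query.]
authG: grades that a faculty member is allowed to see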
17 / 53
Redundancy detection and removal-I
• In general, RR is equivalent to query minimization
• Heuristic approach: eliminate redundant semi-joins
– If E2 subsumes E1, then transform $E_1 \ltimes E_2$ to $E_1$
• Added transformation rules in a rule-based optimizer
[Figure: applying RR. E1 is the join of σ(G) with σ(F) and E2 is the second σ(F); after RR only E1 remains.]
18 / 53
Redundancy detection and removal-II
Subsumption test: E2 subsumes E1 in $E_1 \ltimes_{\theta_i} E_2$ if
• The predicates in the selections of E2 are weaker than the corresponding predicates in E1
• The semi-join condition $\theta_i$ equates the columns of E1 and E2 that are equivalent under the mapping
19 / 53
Redundancy detection and removal-II
• Rule to detect and remove redundancy: if E2 subsumes E1, then replace $E_1 \ltimes_{\theta_i} E_2$ by $E_1$
• In case of a disjunction of sub-query expressions:
– Apply the subsumption test to each disjunct
– If any one is found to subsume E1, then discard the complete set of semi-joins
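The subsumption rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the optimizer rule from the paper: expressions are conjunctive, predicates are compared as literal strings (so "weaker" means "uses a subset of the conjuncts"), the mapping between relations is the identity on relation names, and NULL values are ignored. The `Expr` class and all helper names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Expr:
    relations: frozenset                  # relation names used by the expression
    selections: frozenset                 # conjunctive selection predicates, as strings
    equalities: frozenset = frozenset()   # column equalities enforced inside the expression

def eq(a, b):
    """An unordered column equality a = b."""
    return frozenset((a, b))

def equivalent(expr, a, b):
    """True if columns a and b are always equal within expr (one step, no closure)."""
    return a == b or eq(a, b) in expr.equalities

def subsumes(e2, e1, theta):
    """Simplified test that E2 subsumes E1 in E1 semi-join E2 on theta."""
    return (
        e2.relations <= e1.relations          # E2 only uses relations E1 already has
        and e2.selections <= e1.selections    # E2's predicates are weaker
        and e2.equalities <= e1.equalities
        and len(theta) > 0
        and all(equivalent(e1, c1, c2) for c1, c2 in theta)
    )

def remove_redundancy(e1, disjuncts, theta):
    """Drop the semi-joins if any disjunct subsumes E1; otherwise keep them."""
    if any(subsumes(e2, e1, theta) for e2 in disjuncts):
        return e1
    return ("semijoin", e1, tuple(disjuncts), tuple(theta))

# The redundancy example: E1 joins grades G with facultycourses F.
e1 = Expr(frozenset({"G", "F"}),
          frozenset({"F.facid='123'", "G.year=2010"}),
          frozenset({eq("G.courseid", "F.courseid")}))
# Authorization check added by authG for faculty 123.
auth_check = Expr(frozenset({"F"}), frozenset({"F.facid='123'"}))
# A check for a different faculty member is not redundant.
other_check = Expr(frozenset({"F"}), frozenset({"F.facid='999'"}))
theta = [("G.courseid", "F.courseid")]

simplified = remove_redundancy(e1, [other_check, auth_check], theta)
```

Here `simplified` is `e1` itself, because one disjunct (`auth_check`) passes the test.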
20 / 53
RR at different levels
• Transformation phase:
– Explores all possibilities of redundancy
– Inefficient
• Simplification phase: normalized form by pulling up semi-joins
– Linear number of authorization checks
– Depends on the order of the Ai
– Easy to integrate with existing optimizers
21 / 53
During simplification phase
[Figure: plans over E1 and E2 before and after the simplification phase]
22 / 53
Performance benefits of RR
TPC-H benchmark queries, with authorization checks
Comparing normalized execution times
TPC-H Query   Execution Time Without RR   Execution Time With RR
Query 3       100.00                      48.28
Query 6       56.03                       38.79
Query 10      94.83                       55.45
Query 12      77.57                       43.97
Query 14      49.14                       38.79
23 / 53
Performance benefits of RR
Simplification versus transformation
[Figure: chart comparing RR in the simplification phase with RR in the transformation phase]
24 / 53
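RR for non-Truman model
• Rewrite the query with the authorization views and perform redundancy removal
• If the result is the same as the original query, the query is indeed valid
[Figure: non-Truman model (nTM): query Q is accepted]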
25 / 53
Redundancy & Information Leakage in Fine grained access control
• Redundancy
• Information Leakage
26 / 53
Information leakage
27 / 53
Information leakage via UDFs
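• A UDF may expose the values of the table
– May print out values
– May save the values to a table
[Figure: three plans applying σ myudf(E.salary): over the view myemployees; over employees semi-joined with A1; and directly over employees, with the semi-join with A1 applied afterwards]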
28 / 53
Other channels of information leakage
• Exceptions
– Query: select * from employee where 1/(salary-100K) = 0.23
– Divide-by-zero exception if salary = 100K
• Error messages
– The to_Integer function may throw an error revealing the content
• Timing analysis
– A sub-query can perform an expensive computation only if certain tuples are present in its input
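The exception channel can be illustrated with a small Python simulation. The rows, the `visible` flag (standing in for the authorization check) and the function names are assumptions for illustration only; no database engine is involved.

```python
# Hypothetical table: bob's row is not visible to the current user.
employees = [
    {"name": "alice", "salary": 90_000, "visible": True},
    {"name": "bob", "salary": 100_000, "visible": False},
]

def predicate(row):
    """The user's predicate; raises ZeroDivisionError when salary is exactly 100K."""
    return 1 / (row["salary"] - 100_000) == 0.23

def run(rows, predicate_first):
    if predicate_first:
        # Plan that evaluates the predicate before the authorization check.
        return [r["name"] for r in rows if predicate(r) and r["visible"]]
    # Plan that applies the authorization check first; 'and' short-circuits,
    # so the predicate never sees unauthorized rows.
    return [r["name"] for r in rows if r["visible"] and predicate(r)]

try:
    run(employees, predicate_first=True)
    leaked = False
except ZeroDivisionError:
    # The error tells the user that some hidden employee earns exactly 100K.
    leaked = True

safe_result = run(employees, predicate_first=False)
```

Both plans would return the same rows if no exception occurred, which is why an optimizer that only reasons about result equivalence may pick the leaking one.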
29 / 53
Preventing Information Leakage via UDFs
• UDFonTop: keep UDFs at the top of the query plan
– Definitely safe, no information leakage
– Better plans are possible if the UDF is selective
• Optimal safe plan
– When is a plan safe?
– How to search for the optimal plan amongst alternative safe plans?
[Figure: two plans for σ myudf(E.salary): one with the UDF selection on top of employees semi-joined with A1, and one with the UDF selection applied to employees below the semi-join with A1]
30 / 53
Safe plans w.r.t. UDFs
• Approach 1: if a UDF uses attributes from R, apply authorization checks for R before the UDF
– Not sufficient; the full expression must be authorized
– Authorized expression: one that can be rewritten using authorized views [RMSR04]
– How to efficiently infer which expressions are authorized?
• Auth views: employee; medical-record $\ltimes$ $A_2$
• Query: find names of all employees having AIDS
[Figure: alternative plans for the query, with σ udf2(E.name) placed at different positions relative to employees and σ M.disease='AIDS' over medical-record semi-joined with A2]
31 / 53
Some definitions
• Authorized expression
– An expression is authorized if it is equivalent to an expression defined using only authorized views.
• Safety of a query plan w.r.t. USFs (unsafe functions)
A node in a query plan is safe w.r.t. USFs if:
– There are no USFs in the node, and all inputs (if any) of the node are safe, or
– The node has a USF, it is not an apply operator, and all its inputs are safe and authorized, or
– The node is an apply operator, both its children are safe, and either the right child does not have any USF invocations or the left child is authorized
32 / 53
Examples
[Figure: an Apply operator over employees semi-joined with A1 and σ myudf(E.salary); both children are marked Safe]
For an apply operator whose children are both safe:
• If the right child does not have any USF invocation, the left child need not be authorized.
• If the left child is authorized, the right child may have USF invocations.
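The safety definition is recursive, so it maps naturally onto a recursive function. The sketch below is one reading of the three rules, in which every apply node is checked with the apply rule; the `Node` class, its fields and the example plans are hypothetical, and whether an expression is authorized is supplied as a flag rather than inferred.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                      # e.g. "scan", "select", "semijoin", "apply"
    has_usf: bool = False        # does this node itself invoke an unsafe function?
    authorized: bool = False     # is the expression rooted here authorized?
    children: list = field(default_factory=list)

def invokes_usf(node):
    """True if the node or anything below it invokes a USF."""
    return node.has_usf or any(invokes_usf(c) for c in node.children)

def is_safe(node):
    # Every rule requires all inputs to be safe.
    if not all(is_safe(c) for c in node.children):
        return False
    if node.op == "apply":
        left, right = node.children
        return (not invokes_usf(right)) or left.authorized
    if node.has_usf:
        return all(c.authorized for c in node.children)
    return True

def auth_employees():
    """employees semi-joined with A1: an authorized expression."""
    return Node("semijoin", authorized=True, children=[Node("scan"), Node("scan")])

# UDF on top of the authorized expression: safe.
safe_plan = Node("select", has_usf=True, children=[auth_employees()])
# UDF directly over the base table: unsafe.
unsafe_plan = Node("select", has_usf=True, children=[Node("scan")])
# Apply whose right child has a USF: safe only if the left child is authorized.
usf_subquery = Node("select", has_usf=True, children=[auth_employees()])
apply_unauth_left = Node("apply", children=[Node("scan"), usf_subquery])
apply_auth_left = Node("apply", children=[auth_employees(), usf_subquery])
```

With these plans, `is_safe` accepts `safe_plan` and `apply_auth_left` and rejects `unsafe_plan` and `apply_unauth_left`.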
33 / 53
Framework of rule-based optimizer
[Figure: the optimizer memo for query Q1 over employees and medical-records, built up in steps; group nodes G1 to G7 hold equivalent alternatives]
• A DAG-like structure
• Equivalence nodes: group nodes
34 / 53
Inferring authorization of expressions
• Authorization as a logical property of a group
• Start with the rewritten query:
– Mark groups containing original authorization views as authorized
35 / 53
Rule IA
• If all the child group nodes of an operation node are authorized, the parent group node of that operation node is also marked as authorized.
– Propagate authorization upwards to the parent groups
• A node which is not authorized initially may be inferred as authorized later.
• This information must be propagated to the parents of the node.
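Rule IA amounts to a fixpoint computation over the memo. The sketch below assumes a very small memo representation (a dictionary from group name to a list of alternatives, each alternative being the tuple of its child groups); the group names are invented for illustration and do not correspond to the groups in the slides' figure.

```python
def propagate_authorization(memo, initially_authorized):
    """Return the set of groups inferred as authorized under Rule IA.

    memo: dict mapping group name -> list of alternatives; each alternative
    is a tuple of child group names (empty for a base-relation scan).
    """
    authorized = set(initially_authorized)
    changed = True
    while changed:                      # repeat until no new group is marked
        changed = False
        for group, alternatives in memo.items():
            if group in authorized:
                continue
            # One alternative with all children authorized is enough.
            # Scans (no children) never authorize a group by themselves.
            if any(children and all(c in authorized for c in children)
                   for children in alternatives):
                authorized.add(group)
                changed = True
    return authorized

# Hypothetical memo for the medical-record example.
memo = {
    "employees": [()],
    "medical_record": [()],
    "A2": [()],
    "med_semijoin_A2": [("medical_record", "A2")],
    "aids_records": [("med_semijoin_A2",)],           # selection on disease
    "emp_join_aids": [("employees", "aids_records")],
    "emp_join_med": [("employees", "medical_record")],
}
# Groups that contain the original authorization views.
initial = {"employees", "med_semijoin_A2"}

authorized_groups = propagate_authorization(memo, initial)
```

The loop re-scans the memo after every change, so a group that is inferred late still pushes its mark up to its parents, as the rule requires.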
36 / 53
Inferring authorization of expressions
• Authorization as a logical property of a group
• Start with the rewritten query:
– Mark groups containing original authorization views as authorized
– Propagate authorization upwards to the parent groups
[Figure: the memo for Q1 (groups G1 to G7 over employees and medical-records), with authorization marks propagating upwards step by step]
37 / 53
Extending optimizer to find optimal safe plan
There are two approaches to finding the optimal safe plan:
• Only Safe Transformations
– Allow UDF push-down/pull-up only on top of authorized expressions
– Only safe alternatives are present in the memo; pick the optimal plan
• Pick Safe Plan
– Allow all transformations for UDFs
– Use the “required/derived feature” to pick only plans where UDFs are on top of authorized expressions
38 / 53
Both RR and Optimal Safe Plan are necessary: Motivation
Comparing normalized execution times
               No RR    With RR
UDF on top     100      47.83
Safe Optimal   53.25    23.25
39 / 53
Integrating RR & Safe plans
40 / 53
Integrating RR and Optimal safe plan
• Rule-based optimizers involve a simplification phase followed by a transformation phase
– RR in simplification reduces query size and optimization time
• But RR in simplification interferes with safety inference
– Optimal safe plan generation requires preserving the rewritten input plan, with each authorization check Ai still attached to its Ri, until the memo is created
– RR can possibly remove some Ai
• Possible integration:
– RR in transformation phase
– RR in simplification phase with conditioned authorization for safe plan generation
41 / 53
RR during Transformation Phase
• Introduce authorization-anchor nodes
– These prevent transformations that pull up Ri or Ai, or push down any operation into the semi-join
• At the start of transformation, we remove these nodes, mark them as authorized, and perform authorization propagation.
[Figure: plan with σ udf2(E.name) over employees and σ M.disease='AIDS' over medical-record semi-joined with A2]
42 / 53
RR during Transformation Phase (continued)
• Then RR rules are applied
• Disadvantage: increased optimization time due to multiple redundancy checks of semi-joins
43 / 53
RR in simplification phase with conditioned authorization
• Instead of marking an expression authorized, we mark it as conditioned-authorized.
• For example, we have a relation Ri with authorization Ai
– Ai could be removed or moved elsewhere by RR
– So we mark Ri as authorized conditioned on Ai, i.e. conditioned on it being semi-joined/joined with Ai
44 / 53
RR in simplification phase with conditioned authorization
• If simplification results in an empty condition, we can infer that the expression is unconditionally authorized.
• For a group: if any of its children is unconditionally authorized, so is the group.
[Figure: group G1 with children E1, E2 and E3; a child marked unconditional makes G1 unconditional]
45 / 53
RR in simplification phase with conditioned authorization
• If expression E is of the form $E_1 \ltimes E_2$, where E1 is authorized conditioned on Ai and E2 is equivalent to Bj joined or semi-joined with Ai, then we infer that the resultant expression is unconditionally authorized.
[Figure: E1 (conditioned on Ai) combined with E2 (equivalent to Bj with Ai); the result is marked unconditional]
46 / 53
Rule for propagating authorization
The extended propagation rule is:
• If an operation has two input groups E1 and E2, authorized conditioned on A1 and A2 respectively, then the result is authorized conditioned on A1 and A2
• If A1 subsumes E2, we drop A1 from the condition.
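One way to picture conditioned authorization is to attach to each expression the set of authorization checks it is still conditioned on; the empty set then means unconditionally authorized. The sketch below follows that assumption. The function names are hypothetical, and the subsumption test is not implemented here: its outcome is passed in as the `subsumed` argument.

```python
def combine(cond1, cond2, subsumed=()):
    """Condition of an operation over two inputs.

    cond1, cond2: authorization checks the inputs are conditioned on.
    subsumed: checks that a subsumption test (not shown) found redundant.
    """
    return (frozenset(cond1) | frozenset(cond2)) - frozenset(subsumed)

def is_unconditional(cond):
    """An empty condition means the expression is unconditionally authorized."""
    return len(cond) == 0

def group_is_unconditional(child_conditions):
    """A group is unconditionally authorized if any of its children is."""
    return any(is_unconditional(c) for c in child_conditions)

# E1 conditioned on A1, E2 conditioned on A2: the result needs both.
both = combine({"A1"}, {"A2"})
# If A1 is found redundant, only A2 remains in the condition.
only_a2 = combine({"A1"}, {"A2"}, subsumed={"A1"})
# If every remaining check is redundant, the result is unconditional.
none_left = combine({"A1"}, set(), subsumed={"A1"})
```

Keeping the conditions as sets makes the rule order-independent: the union collects what is still owed, and each successful subsumption test removes one entry.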
47 / 53
Handling Exceptions and Error Messages
• For each built-in function, we create a safe version of the function that ignores exceptions and does not output errors.
• Predicates using USFs are rewritten using the corresponding safe version.
– We can create a safe version of the division function, which catches the exception and returns a null value.
– For the predicate (1/(salary-100K) = 0.23) we can use this safe version.
– This may allow unauthorized tuples to pass through. However, we can rewrite the predicate so that it is weaker than the original condition.
• We can push down the safe predicates while retaining the unsafe version on top.
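A minimal Python sketch of the safe-version idea, assuming a three-stage plan: the weaker safe predicate is pushed below the authorization check, and the original predicate is retained on top. The rows, the `authorized` flag and the function names are hypothetical. The target value 0.25 is used instead of 0.23 only so that the floating-point comparison in the example is exact.

```python
def safe_div(a, b):
    """Safe version of division: swallows the exception and returns None (null)."""
    try:
        return a / b
    except ZeroDivisionError:
        return None

def unsafe_predicate(salary, target):
    return 1 / (salary - 100_000) == target

def safe_predicate(salary, target):
    """Weaker than the original: a row that would raise an error passes through."""
    q = safe_div(1, salary - 100_000)
    return q is None or q == target

# (name, salary, authorized) with hypothetical data; bob is hidden from the user.
rows = [("alice", 100_004, True), ("bob", 100_000, False), ("carol", 90_000, True)]
TARGET = 0.25

# 1. Safe predicate pushed down: filters early, never raises.
stage1 = [r for r in rows if safe_predicate(r[1], TARGET)]
# 2. Authorization check removes rows the user may not see.
stage2 = [r for r in stage1 if r[2]]
# 3. Original (unsafe) predicate on top: it only ever sees authorized rows.
result = [r[0] for r in stage2 if unsafe_predicate(r[1], TARGET)]
```

Bob's row passes the safe predicate but is removed by the authorization check, so the division by zero is never evaluated on it and nothing about his salary is revealed.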
48 / 53
Performance Evaluation
• Study the utility of RR and Optimal Safe Plan
• Auth: managers can see only information pertinent to their region
– authNation: $\text{Nation} \ltimes (\sigma(\text{Region}))$
– authCustomer: $\text{Customer} \ltimes (\text{Nation} \ltimes (\sigma(\text{Region})))$
– …
• Query: find suppliers who fulfil “important” orders
[Figure: authorization view replacement in the query plan]
49 / 53
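Both RR and Optimal Safe Plan are necessary
[Figure: bar chart of normalized execution times, comparing UDF On Top with Safe Optimal]
• No RR (UDF on top): 100.00
• Apply RR: 47.83
• Safe Optimal: 53.25
• Apply Both: 23.25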
50 / 53
Conclusion & Future Work
51 / 53
Future Work
• Study conditioned authorization to reduce optimization time
• Better solution for timing analysis based information leakage
• Add rules for handling authorizations involving nullification and aggregation
52 / 53
Conclusion
• Redundancy in queries
– Transformation rules for redundancy removal
• Information leakage
– Definition of a safe plan
– Extending the optimizer to generate the optimal safe plan
• Preliminary performance study of the proposed techniques
– Ensure safety while providing significant performance benefits
53 / 53
Questions?
Feedback?