1
Secure XML Querying with Security Views
Wenfei Fan
University of Edinburgh & Bell Laboratories
Chee-Yong Chan
National University of Singapore
Minos Garofalakis
Bell Laboratories
2
The need for XML security
Data in XML format:
– Business information: confidential
– Health-care data: Patient Privacy Act, …
Access control:
– multiple groups simultaneously query the same XML document
– each user group has a different access-control policy
Enforcement of access-control policies:
[Figure: user group 1, …, user group n query the document through the XML Query Engine; each group sees its accessible part only, the rest is inaccessible]
3
Secure XML querying
For each user group of an XML document T:
– specify an access-control policy S
– enforce S: for any query Q posed by the group over the document T, Q(T) consists only of data accessible wrt S
Access control for XML:
– How to specify access policies at various levels of granularity?
– How to efficiently enforce those access policies?
[Figure: a user group sends query Q to the XML Query Engine over XML document T and receives Q(T); T is split into accessible and inaccessible parts]
4
Example: an XML document of patients
Document DTD D:
hospital → patient*
patient → SSN, name, record*
record → date, diagnosis, treatment
treatment → (trial + regular)
trial → trName, treatment*
regular → tname, bill
[Figure: DTD graph of D]
Access-control policies over docs of D:
– Doctors in the hospital are granted access to all the data in the docs
– Insurance company is allowed to access billing information only
5
Access-control policy for syndrome surveillance
patients: only those diagnosed with a certain disease “DIS” (a constant) are accessible
records:
– only with diagnosis = “DIS”
– part of “DIS” records: date, diagnosis, treatment, tname
– denied from seeing whether a patient is in a clinical trial or not (trial, regular, trName)
– denied from accessing billing information
[Figure: DTD graph of D with the inaccessible element types marked X]
6
Challenge: Access-control specification
various levels of granularity: restricting access to entire subtrees or specific elements
conditional access: e.g., a patient is accessible if and only if it has a descendant diagnosis = “DIS”
overriding: e.g., tname overrides the accessibility of its parent regular
inheritance: e.g., SSN and name inherit the accessibility of patient
[Figure: DTD graph of D, with conditionally accessible element types highlighted]
7
Challenge: access-control enforcement
should not imply any drastic degradation in performance
Example: an XPath query Q posed by a syndrome surveillance group over a document T
//patient[name=“Joe”]//tname
access control requirement:
Q(T) ⊆ {accessible tname}
enforcement: ensure that Q reaches
– all and only those patients named Joe having a descendant diagnosis = “DIS”,
– all and only those records with diagnosis = “DIS”
[Figure: DTD graph of D, with conditionally accessible element types highlighted]
8
Challenge: schema availability
One needs schema information to facilitate query formulation and optimization
How to define a schema (DTD) characterizing all and only the accessible information, without security breach?
How to automatically derive such a DTD from the document DTD and an access-control specification?
XML DTD is far more complicated than its relational counterpart – recursive, nondeterministic
[Figure: DTD graph of D, with conditionally accessible element types highlighted]
9
Previous proposals/standards for XML security
Dozens of models have been proposed for XML: XACML, XACL, …
Specifying and enforcing access control at a physical level:
– annotate data nodes in an XML document with accessibility, and check accessibility at runtime (with optimizations for tree-pattern queries and tree/DAG DTDs), or
– materialize a view consisting of accessible data
Problems:
– costly (time, space): multiple accessibility annotations/views
– error-prone: integrity maintenance becomes a problem when the underlying data or access policy is updated
No support for schema availability: either deny access to any schema information, or expose the entire document DTD, which is a security breach
10
A seemingly plausible model
annotate data nodes with accessibility, check accessibility at runtime, and expose the document DTD D
Example: permissible XPath queries:
Q1: //patient[name=“Joe”]/record/treatment/*/tname
Q2: //patient[name=“Joe”]/record/treatment//tname
Security breach: from the document DTD it follows that if Q2(T) – Q1(T) is nonempty then Joe is involved in a clinical trial
[Figure: DTD graph of D]
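The inference can be replayed on a toy document. The sketch below uses Python's standard xml.etree.ElementTree, whose limited XPath support covers both queries. The sample document, its values, and the variable names are assumptions made for illustration; they are not data from the talk.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance of the document DTD D: Joe's treatment is a trial
# whose nested treatment is a regular one.
DOC = """
<hospital>
  <patient>
    <SSN>123</SSN><name>Joe</name>
    <record>
      <date>2004-01-01</date><diagnosis>DIS</diagnosis>
      <treatment>
        <trial>
          <trName>T1</trName>
          <treatment><regular><tname>drugA</tname><bill>10</bill></regular></treatment>
        </trial>
      </treatment>
    </record>
  </patient>
</hospital>
"""
root = ET.fromstring(DOC)

# Q1 reaches tname exactly two levels below the record's treatment;
# Q2 reaches tname at any depth below it.
Q1 = ".//patient[name='Joe']/record/treatment/*/tname"
Q2 = ".//patient[name='Joe']/record/treatment//tname"
q1_result = root.findall(Q1)
q2_result = root.findall(Q2)

# Elements compare by identity, so this is the set difference Q2(T) - Q1(T).
leaked = [t.text for t in q2_result if t not in q1_result]
print(leaked)
```

Under D, a tname that Q2 returns but Q1 misses can only sit below a trial, so a nonempty difference tells the user that Joe is in a clinical trial even though trial itself is never returned.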
11
Our security model for XML
Security administrator: specifies an access-control policy for each group by extending the document DTD with XPath qualifiers
Derivation module: automatically derives a security-view definition from each policy: view DTD and mapping via XPath
Query translation module: rewrite and optimize queries over views to equivalent queries over the underlying document
[Figure: architecture. Specifications 1, …, k, …, n over the XML document feed the derivation module, which produces security views 1, …, k, …, n, each of the form (view DTD, xpath( )); a user query over a view goes through the query translation module (Rewriter, Optimizer) and is evaluated over the XML document]
12
Overcome the limitations of previous proposals
Specification and enforcement: at the conceptual (schema) level
– no need to update the underlying XML data
– no need to materialize views or perform runtime checks
Schema availability: view schema is automatically derived
– characterizing accessible data
– exposing necessary schema information only
[Figure: architecture with specifications, derivation module, security views (view DTD, xpath( )) and query translation module (Rewriter, Optimizer) over the XML document]
13
Access-control specification
DTD D: element type definitions A → α
α ::= PCDATA | ε | A1, …, Ak | A1 + … + Ak | A*
Specification S = (D, access( )): a mapping access( ) from the edges in the document DTD to { Y, N, [q] }.
For each A → α, for each B in α, define access(A, B) as
– Y: accessible (true)
– N: inaccessible (false)
– [q]: XPath qualifier, conditional: accessible iff [q] holds
XPath fragment:
p ::= ε | A | * | // | p/p | p ∪ p | p[q]
q ::= p | p = “c” | q1 ∧ q2 | q1 ∨ q2 | ¬q
Access policy = Document DTD + XPath qualifiers
14
Example: access policy S for syndrome surveillance
access(hospital, patient) = [//diagnosis = “DIS”] -- [q1]
access(patient, record) = [diagnosis = “DIS”] -- [q2]
access(treatment, trial) = N
access(treatment, regular) = N
access(regular, tname) = Y
overriding: if access(A, B) = Y (N), then the B children of A override the accessibility of A
inheritance: if access(A, B) is not explicitly defined, then the B children of A inherit the accessibility of A
content-based: conditional accessibility via XPath qualifiers
[Figure: DTD graph of D annotated with qualifiers [q1] and [q2]; conditionally accessible element types highlighted]
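The three rules (overriding, inheritance, content-based qualifiers) can be sketched as a single top-down pass over a document. This is a minimal illustration in Python with xml.etree.ElementTree, not the authors' implementation: the qualifiers are written as plain Python predicates, the root is assumed accessible, a failed qualifier is assumed to block the whole subtree below it, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

def q1(patient):
    # [//diagnosis = "DIS"]: some descendant diagnosis equals DIS
    return any(d.text == "DIS" for d in patient.iter("diagnosis"))

def q2(record):
    # [diagnosis = "DIS"]: some child diagnosis equals DIS
    return any(d.text == "DIS" for d in record.findall("diagnosis"))

# access(A, B) for the syndrome-surveillance policy; unlisted edges inherit.
ACCESS = {
    ("hospital", "patient"): q1,
    ("patient", "record"): q2,
    ("treatment", "trial"): "N",
    ("treatment", "regular"): "N",
    ("regular", "tname"): "Y",
}

def accessible(root):
    """Return the accessible nodes of the tree in document order."""
    out = [root]  # assumption: the root is accessible

    def visit(node, node_acc, quals_ok):
        for child in node:
            ann = ACCESS.get((node.tag, child.tag))
            acc, ok = node_acc, quals_ok      # inheritance
            if ann in ("Y", "N"):             # overriding
                acc = ann == "Y"
            elif ann is not None:             # conditional [q]
                acc = ann(child)
                ok = quals_ok and acc         # a failed qualifier blocks the subtree
            if acc and ok:
                out.append(child)
            visit(child, acc, ok)

    visit(root, True, True)
    return out

# Hypothetical document: Joe has one DIS record and one flu record, Ann only flu.
DOC = ET.fromstring(
    "<hospital>"
    "<patient><SSN>1</SSN><name>Joe</name>"
    "<record><date>d1</date><diagnosis>DIS</diagnosis>"
    "<treatment><regular><tname>drugA</tname><bill>10</bill></regular></treatment></record>"
    "<record><date>d2</date><diagnosis>flu</diagnosis>"
    "<treatment><regular><tname>drugB</tname><bill>20</bill></regular></treatment></record>"
    "</patient>"
    "<patient><SSN>2</SSN><name>Ann</name>"
    "<record><date>d3</date><diagnosis>flu</diagnosis>"
    "<treatment><regular><tname>drugC</tname><bill>5</bill></regular></treatment></record>"
    "</patient>"
    "</hospital>"
)
visible = accessible(DOC)
print([n.tag for n in visible])
```

In this run tname under Joe's DIS record is kept although its parent regular is hidden (overriding), while the tname values under the flu records stay hidden because [q2] fails above them.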
15
Properties of the specification language
XML tree of the document DTD: the accessibility of each data node is uniquely defined by an access specification
– relative to the path from the root
– a qualifier at a node a constrains the entire subtree rooted at a, e.g., [q2] constrains tname
various levels of granularity: entire subtrees or specific elements
schema level: the underlying XML data is not touched; efficient, easy to specify and maintain
[Figure: DTD graph of D annotated with qualifiers [q1] and [q2]; conditionally accessible element types highlighted]
16
Enforce access control – security views
XML security view: V = (Dv, xpath( )) with respect to an access policy S = (D, access( ))
Dv: view DTD, exposed to the user and characterizing the accessible information (of document DTD D) wrt S
– schema availability: to facilitate query formulation
xpath( ): mapping from instances of D to instances of Dv, defined in terms of XPath queries and view DTD Dv
– for each A → α in Dv, for each B in α, xpath(A, B) = p
– p: generates B children of an A element in a view
p ::= ε | A | * | // | p/p | p ∪ p | p[q]
q ::= p | p = “c” | q1 ∧ q2 | q1 ∨ q2 | ¬q
17
Example: view DTD for syndrome surveillance
V = (Dv, xpath( )) with respect to access policy S = (D, access( ))
[Figure: document DTD D annotated with [q1] and [q2], next to the view DTD Dv]
View DTD Dv:
hospital → patient*
patient → SSN, name, record*
record → date, diagnosis, treatment
treatment → tname*
Hide trial, trName, regular, bill
Expose accessible information only
18
Example: view definition for syndrome surveillance
xpath( ): maps edges in view DTD Dv to paths in document DTD D
hospital → patient*
xpath(hospital, patient) = hospital/patient[q1]
[q1]: [//diagnosis = “DIS”]
patient → SSN, name, record*
xpath(patient, SSN) = SSN /* name: similarly */
xpath(patient, record) = record[q2]
[q2]: [diagnosis = “DIS”]
semantics:
• top-down construction
• preserving qualifiers in a specification
[Figure: view tree with hospital at the root, its patient children, and SSN, name, record under a patient]
19
DTD-directed construction of security views
record → date, diagnosis, treatment
xpath(record, date) = date /* diagnosis, treatment: similarly */
treatment → tname*
xpath(treatment, tname) = //tname
DTD-directed construction:
– view DTD conformance
– never materialized: the construction strategy is just to give the semantics
[Figure: view tree extended below record with date, diagnosis, treatment and tname, next to the document subtree treatment, trial, trName, regular, bill, tname that it is drawn from]
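The top-down, DTD-directed semantics can be illustrated by actually building a view, which the model itself never does; materializing here only serves to show what a view contains. This is a sketch under assumptions: paths are taken relative to the parent node, qualifiers are Python predicates because xml.etree.ElementTree cannot evaluate [//diagnosis = "DIS"], edges missing from XPATH default to the child name, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

def has_dis_below(patient):      # stands in for [q1]
    return any(d.text == "DIS" for d in patient.iter("diagnosis"))

def has_dis_child(record):       # stands in for [q2]
    return any(d.text == "DIS" for d in record.findall("diagnosis"))

VIEW_DTD = {
    "hospital": ["patient"],
    "patient": ["SSN", "name", "record"],
    "record": ["date", "diagnosis", "treatment"],
    "treatment": ["tname"],
}
# xpath(A, B) as (path relative to the A node, optional qualifier check).
XPATH = {
    ("hospital", "patient"): ("patient", has_dis_below),
    ("patient", "record"): ("record", has_dis_child),
    ("treatment", "tname"): (".//tname", None),
}

def build_view(src, tag):
    """Build the view element of type `tag` from document node `src`."""
    view = ET.Element(tag)
    kids = VIEW_DTD.get(tag, [])
    if not kids:                       # leaf type: copy the text
        view.text = src.text
    for b in kids:                     # follow the view DTD production of `tag`
        path, qual = XPATH.get((tag, b), (b, None))
        for match in src.findall(path):
            if qual is None or qual(match):
                view.append(build_view(match, b))
    return view

DOC = ET.fromstring(
    "<hospital>"
    "<patient><SSN>1</SSN><name>Joe</name>"
    "<record><date>d1</date><diagnosis>DIS</diagnosis>"
    "<treatment><trial><trName>T1</trName>"
    "<treatment><regular><tname>drugA</tname><bill>10</bill></regular></treatment>"
    "</trial></treatment></record></patient>"
    "<patient><SSN>2</SSN><name>Ann</name>"
    "<record><date>d2</date><diagnosis>flu</diagnosis>"
    "<treatment><regular><tname>drugB</tname><bill>5</bill></regular></treatment></record>"
    "</patient>"
    "</hospital>"
)
view = build_view(DOC, "hospital")
print(ET.tostring(view, encoding="unicode"))
```

The resulting tree conforms to Dv: tname sits directly under treatment, so nothing in the view reveals whether it came from a trial or a regular treatment.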
20
Derivation of security-view definition
XML security views are far more intriguing than relational views:
– multiple XPath queries vs. a single SQL query
– DTD vs. relational schema
One needs an algorithm to compute a security-view definition:
Input: an access policy S = (D, access( ))
Output: a security-view definition V = (Dv, xpath( ))
– sound: accessible information only
– complete: all the accessible data (structure preserving)
– DTD-conformant: conforming to the view DTD
efficient: O(|S|²) time
generic: recursive/nondeterministic document DTDs
21
Algorithm: deriving a security-view definition
Top-down traversal of the document DTD D
short-cutting/renaming (via dummy) inaccessible element types
normalizing the view DTD Dv and reducing dummy types
[Figure: the traversal applied step by step to hospital, patient and record, giving]
xpath(hospital, patient) = hospital/patient[q1]
xpath(patient, record) = record[q2]
xpath(record, treatment) = treatment
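The traversal and short-cutting steps can be sketched as follows. This is a simplified illustration, not the algorithm of the talk: it covers only a hypothetical non-recursive variant of D in which trial no longer contains treatment*, it gives up on recursion through hidden types instead of introducing dummy types, and it writes paths relative to the parent type. The names derive and expand are made up.

```python
# Hypothetical non-recursive variant of the document DTD D.
DTD = {
    "hospital": ["patient"],
    "patient": ["SSN", "name", "record"],
    "record": ["date", "diagnosis", "treatment"],
    "treatment": ["trial", "regular"],
    "trial": ["trName"],
    "regular": ["tname", "bill"],
}
ACCESS = {
    ("hospital", "patient"): "[q1]",
    ("patient", "record"): "[q2]",
    ("treatment", "trial"): "N",
    ("treatment", "regular"): "N",
    ("regular", "tname"): "Y",
}

def derive(root):
    """Return (view DTD, xpath mapping) for the policy (DTD, ACCESS)."""
    view_dtd, xpath = {}, {}
    todo = [root]

    def expand(a, t, prefix, acc, hidden):
        # Walk the children of t; `prefix` is the path from a through hidden types.
        for b in DTD.get(t, []):
            ann = ACCESS.get((t, b))
            qual = ann if ann and ann.startswith("[") else ""
            path = prefix + [b + qual]
            b_acc = acc if ann is None else ann != "N"   # inherit or override
            if b_acc:
                if b not in view_dtd[a]:
                    view_dtd[a].append(b)
                xpath.setdefault((a, b), []).append("/".join(path))
                todo.append(b)
            elif b in hidden:
                raise ValueError("recursion through hidden types is not handled")
            else:
                expand(a, b, path, False, hidden | {b})   # short-cut hidden type b

    while todo:
        a = todo.pop()
        if a not in view_dtd:
            view_dtd[a] = []
            expand(a, a, [], True, frozenset())
    return view_dtd, {edge: " | ".join(ps) for edge, ps in xpath.items()}

view_dtd, xpath = derive("hospital")
print(view_dtd["treatment"], xpath[("treatment", "tname")])
```

On this variant the hidden type regular is short-cut into the path regular/tname; with the recursive production trial → trName, treatment* the set of such paths is infinite, which is what the dummy types of the full algorithm deal with.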
22
deriving a security-view definition
recursive and non-deterministic productions
xpath(treatment, dummy1) = trial
xpath(treatment, dummy2) = regular
[Figure: treatment with its inaccessible children trial and regular renamed to dummy1 and dummy2; trName and bill are hidden, tname* is kept]
reducing dummy element types:
paths of the form (dummy1/treatment)*/dummy2/tname reduce to the production treatment → tname*
xpath(treatment, tname) = //tname
23
Enforce access control via query rewriting
security views are virtual: not materialized
– Efficiency: no extra costs to support multiple security views over the same large document simultaneously
– Consistency/integrity: updating the underlying data introduces no difficulties/overhead
[Figure: a query over security view k (view DTD, xpath( )) passes through the query translation module (Rewriter, Optimizer) and is evaluated over the XML document]
Query translation: one needs an efficient algorithm to rewrite queries over a security view to equivalent and efficient queries over the underlying document
24
algorithm rewrite
Input: a security view V = (Dv, xpath( )) wrt S = (D, access( )), and an XPath query Qv over the view (Dv)
Output: an equivalent XPath query Qt over the document
– for any XML document T of D, Qt(T) = Qv(V(T))
Dynamic programming:
– for any subquery Qv’ of Qv and any node A in the view-DTD graph Dv, rewrite Qv’ at A by incorporating xpath(A, _), yielding Qt’(A)
efficient: O(|Qv| |V|²) time
a practical class of XPath (with union, descendant, qualifiers) vs. tree-pattern queries studied in previous security models
25
Example: query rewriting for syndrome surveillance
Qv = //patient[name = “Joe”]//tname over the view
[Figure: view DTD Dv]
xpath(hospital, patient)[name = “Joe”] / xpath(patient, record) / xpath(record, treatment) / xpath(treatment, tname)
[Figure: document DTD D annotated with [q1] and [q2]]
Qt = /hospital/patient[name = “Joe” and //diagnosis = “DIS”]/record[diagnosis = “DIS”]/treatment//tname
equivalent query over the document
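The composition of xpath( ) steps in this example can be reproduced with a small string-level rewriter. It is a sketch only and not the dynamic-programming algorithm of the talk: it enumerates view-DTD paths, which works here because Dv is non-recursive but can blow up in general, and it copies view qualifiers verbatim, which is valid only when they mention types mapped to themselves. The virtual node #doc and all function names are assumptions.

```python
DOC = "#doc"  # hypothetical virtual node above the root element

VIEW_DTD = {
    DOC: ["hospital"],
    "hospital": ["patient"],
    "patient": ["SSN", "name", "record"],
    "record": ["date", "diagnosis", "treatment"],
    "treatment": ["tname"],
}
XPATH = {
    (DOC, "hospital"): "/hospital",
    ("hospital", "patient"): "patient[//diagnosis = 'DIS']",
    ("patient", "record"): "record[diagnosis = 'DIS']",
    ("treatment", "tname"): "//tname",
}

def xpath(a, b):
    return XPATH.get((a, b), b)          # unlisted edges map to the child name

def join(head, tail):
    if not tail:
        return head
    return head + tail if tail.startswith("/") else head + "/" + tail

def paths(a, target):
    """Document paths for every view-DTD path from type a down to `target`."""
    found = []
    for b in VIEW_DTD.get(a, []):
        step = xpath(a, b)
        if b == target:
            found.append(step)
        found += [join(step, rest) for rest in paths(b, target)]
    return found

def rewrite(steps, a=DOC):
    """steps: list of (axis, type, qualifier or None) with axis '/' or '//'."""
    if not steps:
        return [""]
    (axis, tag, qual), rest = steps[0], steps[1:]
    if axis == "/":
        heads = [xpath(a, tag)] if tag in VIEW_DTD.get(a, []) else []
    else:
        heads = paths(a, tag)
    out = []
    for head in heads:
        if qual:
            head += f"[{qual}]"
        out += [join(head, tail) for tail in rewrite(rest, tag)]
    return out

# Qv = //patient[name = "Joe"]//tname
QV = [("//", "patient", "name = 'Joe'"), ("//", "tname", None)]
QT = " | ".join(rewrite(QV))
print(QT)
```

The printed query stacks the two patient qualifiers as separate predicates instead of joining them with "and"; for non-positional predicates the two forms select the same nodes.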
26
Query optimization with structural constraints
Optimize Qt = rewrite(V, Qv), where V is the security view, by leveraging the document DTD D
Q = A[B]//E[F]//H ∪ A[B and C]//H ∪ //F[G]/H
Q’ = A/B/E/F/H
[Figure: DTD graph over A, B, C, E, F, G, H; a disjunction yields exclusive constraints, a conjunction yields existence (nonexistence) constraints]
A[B and C] ⇒ empty set
– exclusive constraint: an A element cannot have both B and C children at the same time
//F[G]/H ⇒ empty set
– non-existence constraint: an F element does not have a G child
A[B]//E[F]//H ⇒ A/B/E/F/H
– exclusive constraint: B and C do not coexist under an A element
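The three kinds of constraints can be checked mechanically once the DTD records whether a production is a disjunction or a concatenation. The sketch below is an illustration only: the DTD is a hypothetical one shaped to match the examples (the slide's graph is not fully recoverable), and the function names are made up.

```python
# Hypothetical DTD: "choice" is a disjunction (A -> B + C), "seq" a concatenation
# in which every listed child occurs exactly once.
DTD = {
    "A": ("choice", ["B", "C"]),
    "B": ("seq", ["E"]),
    "C": ("seq", ["H"]),
    "E": ("seq", ["F"]),
    "F": ("seq", ["H"]),
}

def qualifier_satisfiable(parent, required_children):
    """Can a `parent` element have all `required_children` at the same time?"""
    kind, kids = DTD.get(parent, ("seq", []))
    needed = set(required_children)
    if not needed <= set(kids):
        return False      # non-existence constraint: some child can never occur
    if kind == "choice" and len(needed) > 1:
        return False      # exclusive constraint: alternatives do not coexist
    return True

def qualifier_redundant(parent, required_children):
    """Is the qualifier always true at `parent`? (existence constraint)"""
    kind, kids = DTD.get(parent, ("seq", []))
    return kind == "seq" and set(required_children) <= set(kids)

print(qualifier_satisfiable("A", ["B", "C"]))   # A[B and C]
print(qualifier_satisfiable("F", ["G"]))        # F[G]
print(qualifier_redundant("E", ["F"]))          # E[F]
```

A subquery whose qualifier is unsatisfiable can be dropped from a union, and a redundant qualifier can be removed, which are the steps that take Q to Q’ in the example.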
27
Example: heuristic for XPath containment
Q = //*[C]//E ∪ //E
Q’ = A/B/E
Q1 ∪ Q2 ≡ Q2 if Q1 ⊆ Q2
//*[C]//E ⊆ //E, and //E ⇒ A/B/E
[Figure: DTD graph over A, B, C, E]
heuristic for XPath containment (NP-hard for small fragments in the presence of DTDs)
– image graph: evaluation of sub-queries over the DTD graph
– containment test: extension of simulation
– Q1 ⊆ Q2 if image(Q1) is simulated by image(Q2)
– qualifiers: inverse simulation
effective: preliminary experimental study (speedup up to a factor of 2)
[Figure: image graph for //*[C]//E (A, B with qualifier [C], E) and image graph for //E (A, B, E)]
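For readers unfamiliar with simulation, the classical notion on node-labeled graphs can be computed as a greatest fixpoint. The sketch below shows only that textbook relation; the extension used in the talk (image graphs built from queries, inverse simulation for qualifiers) is not reproduced, and the graphs are made-up examples.

```python
def simulated_by(g1, r1, g2, r2):
    """True if node r1 of g1 is simulated by node r2 of g2.

    A graph maps each node id to (label, list of successor ids).
    """
    # Start from all label-compatible pairs, then prune to the greatest fixpoint.
    rel = {(u, v) for u in g1 for v in g2 if g1[u][0] == g2[v][0]}
    changed = True
    while changed:
        changed = False
        for u, v in list(rel):
            # (u, v) survives only if every successor of u is matched by one of v.
            ok = all(any((u2, v2) in rel for v2 in g2[v][1]) for u2 in g1[u][1])
            if not ok:
                rel.discard((u, v))
                changed = True
    return (r1, r2) in rel

# Hypothetical graphs: a chain A -> B -> E, and a wider graph that also has A -> E.
chain = {"a": ("A", ["b"]), "b": ("B", ["e"]), "e": ("E", [])}
wide = {"a": ("A", ["b", "e"]), "b": ("B", ["e"]), "e": ("E", [])}

print(simulated_by(chain, "a", wide, "a"))
print(simulated_by(wide, "a", chain, "a"))
```

The check is polynomial in the sizes of the two graphs, which is what makes simulation attractive as a heuristic for a containment problem that is NP-hard in general.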
28
Summary
security views: the first model for specifying/enforcing XML security at a schema level and providing schema availability
– a fine-grained access-control specification language
– an effective enforcement framework via security views
• view DTD: characterizing accessible information
• algorithm for deriving security-view definitions
• algorithms for query rewriting/optimization: no need to materialize views or to perform runtime security checks
future work:
– reasoning about security views (soundness, completeness, DTD conformance – subsume XPath satisfiability with DTDs)
– inference control in the presence of external knowledge
A practical solution for securing XML querying