Query Languages
Aswin Yedlapalli
XML Query data model
• Document is viewed as a labeled tree with nodes
• Successors of a node may be:
- an ordered sequence of nodes (e.g. for sub-elements).
- an unordered set of nodes (e.g. for attributes).
• Compatible with XML schemas
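The ordered/unordered distinction above can be illustrated with Python's standard `xml.etree.ElementTree` module. This is only a small sketch; the `book` document, its titles and its author names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up document: one element with two attributes and three sub-elements.
doc = ET.fromstring(
    '<book year="1999" lang="en">'
    '<title>Example Title</title>'
    '<author>First Author</author>'
    '<author>Second Author</author>'
    '</book>'
)

# Sub-elements form an ordered sequence: iteration follows document order.
child_labels = [child.tag for child in doc]
authors = [a.text for a in doc.findall("author")]

# Attributes form an unordered set of name/value pairs,
# so they are compared here as a set rather than as a list.
attributes = set(doc.attrib.items())

print(child_labels)
print(authors)
print(attributes)
```

Swapping the two `author` elements would give a different tree, while swapping the two attributes would not.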
Comparison of XML and semi structured data
• Similarities:
- both are best described by a labeled graph.
- both are schema-less and self-describing.
• Differences:
- XML is ordered; semi structured data is unordered.
- XML can mix text and elements.
Required features for a Query Language
• Expressive power
- The query language must be at least as expressive as SQL on relational data.
- The query language should be able to restructure data.
- The query language should be able to navigate data with arbitrary nesting.
• Semantics
- Precise semantics are essential for query transformation and optimization.
• Compositionality
- Queries must stay within one data model: they cannot take input in one model and produce output in another.
• Schema
- When structure is defined, the query language should exploit it for optimization, type checking, etc.
Query languages
• For semi structured data:
- Lorel (Lightweight Object REpository Language)
- UnQL (Unstructured Query Language)
- StruQL, MSL, W3QL, WebSQL, Weblog, etc.
• For XML:
- XML-QL (XML Query Language)
- XSLT and structural recursion
- XML Query Algebra
Formal Semantics
• Given a query Q = SELECT E[X1, …, Xn] FROM F WHERE C and a database DB,
Answer(Q, DB) is defined in two steps:
– Step 1: compute all bindings (C11, …, C1n), …, (Cm1, …, Cmn) of the variables X1, …, Xn:
• Cij are node oids or atomic values
• Must satisfy paths in F
• Must satisfy conditions in C
– Step 2: the answer is the union E[C11, …, C1n] ∪ … ∪ E[Cm1, …, Cmn]
• When E has nested sub queries, apply the semantics recursively
• Note: so far we have dealt with an unordered model
– What do we need to do for order?
• Complexity: PTIME in |DB| (not in |Q|).
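The two-step definition above can be sketched in Python. This is a minimal illustration under stated assumptions, not an implementation of any real system: the database is modeled as nested dicts mapping each label to a list of children, and the helper names (`follow`, `bindings`, `answer`) and the sample data are hypothetical.

```python
def follow(node, path):
    """Return all nodes or atomic values reachable from node along a dotted label path."""
    frontier = [node]
    for label in path.split("."):
        next_frontier = []
        for n in frontier:
            if isinstance(n, dict):  # atomic values have no outgoing edges
                next_frontier.extend(n.get(label, []))
        frontier = next_frontier
    return frontier


def bindings(db, from_clause, condition):
    """Step 1: enumerate every binding that satisfies the paths in F and the condition C."""
    envs = [{}]
    for var, (source, path) in from_clause:
        extended = []
        for env in envs:
            start = db if source is None else env[source]
            for target in follow(start, path):
                extended.append({**env, var: target})
        envs = extended
    return [env for env in envs if condition(env)]


def answer(db, from_clause, condition, construct):
    """Step 2: apply E to each binding and collect the results."""
    return [construct(env) for env in bindings(db, from_clause, condition)]


# Made-up database: label -> list of children.
db = {"biblio": [{"book": [
    {"title": ["T1"], "year": [1999]},
    {"title": ["T2"], "year": [1995]},
]}]}

# SELECT title FROM biblio.book X, X.year Y WHERE Y = 1999
result = answer(
    db,
    from_clause=[("X", (None, "biblio.book")), ("Y", ("X", "year"))],
    condition=lambda env: env["Y"] == 1999,
    construct=lambda env: {"title": follow(env["X"], "title")},
)
print(result)
```

For a fixed query the nested loops visit a number of bindings polynomial in the size of the database, which matches the PTIME-in-|DB| remark; the exponent grows with the number of variables in the query.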
LOREL
• Minor syntactic differences in regular path expressions (% instead of _, # instead of _*)
• Common path convention:
SELECT biblio.book.author
FROM biblio.book
WHERE biblio.book.year = 1999
Becomes
SELECT X.author
FROM biblio.book X
WHERE X.year = 1999
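To make the wildcard notation concrete, here is a small Python sketch of path evaluation over a tree. It follows the slide's reading that `%` stands for any single label and `#` for any path of zero or more labels; the nested-dict data model, the function name `eval_path` and the sample data are assumptions made for illustration only.

```python
def eval_path(node, steps):
    """Yield every node or value reachable from node by a path matching steps."""
    if not steps:
        yield node
        return
    head, rest = steps[0], steps[1:]
    if head == "#":
        # '#' may match the empty path...
        yield from eval_path(node, rest)
        # ...or consume one arbitrary label and keep matching '#'.
        if isinstance(node, dict):
            for children in node.values():
                for child in children:
                    yield from eval_path(child, steps)
    elif isinstance(node, dict):
        for label, children in node.items():
            if head == "%" or head == label:
                for child in children:
                    yield from eval_path(child, rest)


# Made-up database: label -> list of children.
db = {"biblio": [{
    "book": [{"author": ["A"], "year": [1999]}],
    "paper": [{"author": ["B"]}],
}]}

one_level = list(eval_path(db, ["biblio", "%", "author"]))  # any label in the middle
any_depth = list(eval_path(db, ["#", "author"]))            # author at any depth
print(one_level, any_depth)
```

The recursion terminates because the data is a tree; a graph with cycles would need a visited set.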
Lorel
• The query language of the Lore system; it adapts OQL to semi structured data.
select X.title
from bib.article X
where "tova milo" in X.author
returns {title: "type inf…"}
Features of Lorel
• Differences from typed query languages:
- performs implicit coercions.
- deals with missing attributes.
- deals with set-valued attributes, e.g. in x.year > 1998, x may have several years.
• Select clause creates new nodes.
• Allows for nested queries.
• Allows for regular path expressions.
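The three differences from typed languages can be sketched as a single comparison helper in Python. The exact coercion rules of Lorel are not given in these slides, so the rule used here (try to read each value as a number, ignore values that cannot be read) is an assumption, as are the function names and sample objects.

```python
def coerce_number(value):
    """Try to read a value as a number; return None if that is not possible."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def exists_greater(obj, attribute, threshold):
    """True if at least one value of the attribute, after coercion, exceeds threshold."""
    # A missing attribute yields an empty list, so the test is simply false (no error).
    for value in obj.get(attribute, []):
        number = coerce_number(value)
        if number is not None and number > threshold:
            return True
    return False


# Made-up objects: each attribute maps to a list of values (set-valued attributes).
several_years = {"year": ["1999", 1995]}    # a string and an integer
no_match = {"year": [1990, "unknown"]}      # one value cannot be coerced
no_year = {"title": ["Untitled"]}           # attribute is missing

print(exists_greater(several_years, "year", 1998))
print(exists_greater(no_match, "year", 1998))
print(exists_greater(no_year, "year", 1998))
```

The comparison is existential: an object qualifies as soon as one of its years passes the test.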
UnQL (Unstructured Query language)
• UnQL is an extension of basic LOREL.
• Unlike LOREL, UnQL does not use coercion.
• The "where" clause contains two kinds of constructs:
- generators: variables are bound via patterns.
- conditions: as in LOREL.
• The "from" clause is not needed, as variables are bound in patterns.
UnQL Queries
• E.g.,
select title:T
where {bib:article:{title:T, year:Y}} in db,
Y > 1998
• The root of the database is explicitly represented: db
• UnQL queries can be rewritten in LOREL. The equivalent LOREL for the above query is:
select title:T
from bib.article A, A.title T, A.year Y
where Y > 1998
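Binding variables through a pattern, as the UnQL generator does, can be sketched in Python as a recursive matcher. This is an illustrative sketch only: the `Var` class, the `match` and `match_edges` helpers, the nested-dict data model and the sample titles are all hypothetical.

```python
class Var:
    """A pattern variable such as T or Y."""

    def __init__(self, name):
        self.name = name


def match(pattern, node, env):
    """Yield every extension of env under which pattern matches node."""
    if isinstance(pattern, Var):
        if pattern.name not in env:
            yield {**env, pattern.name: node}
        elif env[pattern.name] == node:
            yield env
    elif isinstance(pattern, dict):
        if isinstance(node, dict):
            yield from match_edges(list(pattern.items()), node, env)
    elif pattern == node:
        yield env


def match_edges(edges, node, env):
    """Match each (label, sub-pattern) edge against some child of node with that label."""
    if not edges:
        yield env
        return
    (label, sub), rest = edges[0], edges[1:]
    for child in node.get(label, []):
        for extended in match(sub, child, env):
            yield from match_edges(rest, node, extended)


# Made-up database: label -> list of children.
db = {"bib": [{"article": [
    {"title": ["Paper A"], "year": [1999]},
    {"title": ["Paper B"], "year": [1995]},
]}]}

# Generator: {bib: article: {title: T, year: Y}} in db; condition: Y > 1998
pattern = {"bib": {"article": {"title": Var("T"), "year": Var("Y")}}}
result = [{"title": env["T"]} for env in match(pattern, db, {}) if env["Y"] > 1998]
print(result)
```

The pattern plays the role of the "from" clause: walking it against the data produces the variable bindings, and the condition then filters them.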
Additional features of LOREL
• Label variables
- can combine “schema” and “data” information.
- can turn tables to data and vice-versa.
- perform group-by operations.
• Can match variables with regular expressions.
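A label variable ranges over the edge labels themselves, so labels can be returned as ordinary data or filtered with a regular expression. The Python sketch below shows the idea on a nested-dict model; the function names and the sample article are hypothetical, and the use of Python's `re` syntax for the label pattern is an assumption.

```python
import re


def label_bindings(node):
    """Yield (label, child) pairs: the label is bound like any other variable."""
    for label, children in node.items():
        for child in children:
            yield label, child


def labels_matching(node, regex):
    """Keep only the bindings whose label fully matches the regular expression."""
    compiled = re.compile(regex)
    return [(label, child) for label, child in label_bindings(node)
            if compiled.fullmatch(label)]


def group_by_value(nodes, attribute):
    """Group nodes under new labels taken from the values of one attribute."""
    groups = {}
    for node in nodes:
        for value in node.get(attribute, []):
            groups.setdefault(str(value), []).append(node)
    return groups


# Made-up data: label -> list of children.
article = {"title": ["Paper A"], "author": ["X"], "year": [1999]}
other = {"title": ["Paper B"], "year": [1999]}

schema_as_data = list(label_bindings(article))       # labels returned as values
selected = labels_matching(article, "title|year")    # label matched by a regex
by_year = group_by_value([article, other], "year")   # data values used as labels
print(schema_as_data, selected, sorted(by_year))
```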
References
• Managing XML and semi structured data – Lecture series by Prof. Dan Suciu.
• Website: www.cs.washington.edu/homes/suciu/COURSES/590DS/