QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos,...

21
QURSED : Querying and Repor ting Semistructured Data Yannis Papakonstantinou, Michalis Petropo ulos, and Vasilis Vassalos Data Warehousing Lab. Semester 2 Youlim Park

Transcript of QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos,...

Page 1: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

QURSED : Querying and Reporting Semistructured Data

Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos

Data Warehousing Lab.

Semester 2 Youlim Park

Page 2: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Overview

Introduction Related Work Motivating Example The QURSED system

System Architecture Tree Query Language Query Set Specification QURSED Editor

Conclusion

Page 3: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Introduction - Purpose

Purpose of this paper To show how to automatically construct the web-bas

ed query forms and reports for querying semistructured XML data.

QURSED Produces query form and report pages (QFRs) that are as

sociated with a query set specification. The produced queries are expressed in XQuery. The query results are expressed directly in HTML, for perfor

mance reasons.

Page 4: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Introduction - Contributions

Contributions of this paper Forms and Reports for Semistructured Data.

Generates form and report pages that target the needs of interacting with and presenting semistructured data.

Loose Coupling of Query and Visual Aspects. Makes easier to develop and maintain the resulting form

and report pages. Powerful and Succinct Query Set Specification.

Provides formal syntax and semantics for the query set specification by using the Tree Query Language

Page 5: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Related Work

Web-based Form and Report Generators Creates form and report pages that access re

lational databases. Not focused on semistructured data. Ex) Macromedia DreamWeaver Ultradev and

ColdFusion, and Microsoft Visual Interdev

Page 6: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

The QURSED System - System Architecture

QURSEDRun Time Engine

QURSEDCompiler

XQueryEngine

QURSEDEditor

(Optional)Custom

Predicates

Query FormPage

XQueryExpressions

XML/XHTML

QueryForm Page

ReportPages

APP SERVER

BROWSER

XHTMLQuery Form

Page

(Optional)XHTML

TemplateReport Page

Query SetSpecification

Query/VisualAssociation

DynamicServer Pages

WYSIWYGXHTMLEditor

Deployment

XMLSchema

Developer

End-User

Web Designer

Page 7: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Motivating Example - A catalog for sensors

Example XML Schema

Example Data Model

Page 8: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Motivating Example - A catalog for sensors

Query Form Page

Result PageExample QFR

Interface

Page 9: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

The QURSED System - Tree Query Language

Condition Tree

Result Tree

Conditions

Root Element

Name Variable

Element Variable

Page 10: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Tree Query Language (TQL)sensors

manufacturername

productpart_number

“Turck”

“A123”specssensing_distance“11”

body_typecylindrical

barrel_style

diameter“17”

“Smooth”protection_ratingsprotection_rating“NEMA1”

protection_rating“NEMA3”

manufacturername

productpart_number

“Turck”

“B123”specssensing_distance“25”

body_typerectangular

width

height“10”

“30”protection_ratingsprotection_rating“NEMA3”

protection_rating“NEMA4”

$NAME $PROD $PART $PROT1 Turck product

part_number“A123”

...

A123 NEMA1

$NAME $PROD $PART $PROT1 Turck product

part_number“A123”

...

A123 NEMA3

$NAME $PROD $PART $PROT1 Turck product

part_number“B123”

...

B123 NEMA3

sensors

manufacturer

product

specs

name

protection_ratings

protection_rating

$PROD

$PROT1

$NAME

$S

ANDPROT1 = “NEMA3” ORPROT1 = “NEMA1”

ANDPROT1 = “NEMA3” ORPROT1 = “NEMA1”

Page 11: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Tree Query Language (TQL)sensors

manufacturername

productpart_number

“Turck”

“A123”specssensing_distance“11”

body_typecylindrical

barrel_style

diameter“17”

“Smooth”protection_ratingsprotection_rating“NEMA1”

protection_rating“NEMA3”

manufacturername

productpart_number

“Turck”

“B123”specssensing_distance“25”

body_typerectangular

width

height“10”

“30”protection_ratingsprotection_rating“NEMA3”

protection_rating“NEMA4”

$NAME $PROD $PART $BODY $CYL $DIA $BAR Turck product

part_number“A123”

...

A123 cylindrical cylindricaldiameter“17”

...

17 Smooth

$NAME $PROD $PART $BODY $REC $HEI $WID Turck product

part_number“B123”

...

B123 rectangular rectangularheight“10”

...

10 30

sensorsmanufacturer

product

body_type

cylindricaldiameter

specs

name

protection_ratingsprotection_rating

rectangular

widthheight

barrel_style

$PROD

$DIA

$HEI$WID

$NAME

$REC

$CYL

$S

$BAR

OR

AND

AND

AND

$BODY

$HEI <= 20 AND$WID <= 40

$DIA <= 20 AND$DIA <= 40

OR

AND$DIA <= 20 AND$DIA <= 40

OR

AND$HEI <= 20 AND$WID <= 40

AND$DIA <= 20 AND$DIA <= 40

Page 12: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

TQL Semantics

Conjunctive Condition Trees OR-Removal Algorithm Transformation Rules

AOR

D

AND

BC

ANDA

OR

BD

ANDACD

AOR

OR

BC

AOR

BC

AOR

D

element node

BC

ANDA

OR

BD

ANDACD

element node

ANDOR

AND

element node

ANDelement node

OR

ANDelement node

replace with replace with replace withreplace with

Condition Tree

Page 13: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

TQL Semantics

trtdTurck

tdA123

“Cylindrical”

17

table

td11

tdtabletrtd

tdtr

tr

tdTurck

tdB123

td25

td

“Rectangular”

30

10

tabletrtd

td

td

tr

td

Smoothtd

htmlbody

tr

$NAME

“Cylindrical”

$DIA

html

GROUPBY ($PROD, $NAME)SORTBY ($NAME DESC)

td

tabletrtd

tdtr

“Rectangular”

$WID

$HEI

tabletrtd

td

td

tr

td

GROUPBY ($DIA)

GROUPBY ($HEI)

GROUPBY ($WID)

GROUPBY ($CYL)

GROUPBY ($REC)

$BARtd

GROUPBY ($BAR)

bodytable

Result Tree

Page 14: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Query Set Specification (QSS)

f1 f2 f3

$DIST <= $#DIST

$PROT1 = $#PROT1

$DIA <= $#DIMX AND$DIA <= $#DIMY

$HEI <= $#DIMX AND$WID <= $#DIMY

sensors

manufacturer

product

sensing_distance

body_type

cylindrical

diameter

AND

specs

name

protection_ratings

$PROD

OR

AND

rectangular

width

height

AND

$DIST

$DIA

$HEI

$WID

$NAME

protection_rating $PROT1

$CYL

$REC

barrel_style $BAR

$S

Condition Tree Generator

•Parameterized boolean expressions

•Multiple boolean expressions per AND node

•Condition fragments

Page 15: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Run-time: QSS to TQL Queries

f1

$NAME

$NAME = $#NAME

f2

$PROD

$PART

$DIST

sensors

manufacturer

product

sensing_distance

AND

specs

name

part_number

$DIST <= $#DIST

$NAME = “Turck”

$DIST <= 6

$NAME = “Turck”

Page 16: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

The QURSED System - QURSED Editor

XML Schema

Query Form Page

HTML Editor

Page 17: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

The QURSED System - QURSED Editor

Building Condition Tree Generators By defining a set of condition fragments

driven by the input schema Action 1 : Create Condition Fragment.

Must provide a unique ID. Action 2 : Build Boolean Expression.

The chosen predicate is initially unspecified. Action 3 : Specify Elements as Operands.

Sets the left operand from the schema. Action 4 : Bind Form Controls to Operands.

Binds the right operand to an HTML form control.

Page 18: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

The QURSED System - QURSED Editor

Building Result Trees By specifying which element nodes of the schema

the developer wants to present on the report page. Then, the editor automatically builds a result tree

that creates report pages presenting the source data in the form of HTML tables that are nested according to the nesting present in the source schema.

Page 19: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

The QURSED System - QURSED Editor

Building Schema-Driven Result Tree Action 1 : Select Element Nodes.

This action builds the result fragment indicated in the condition tree generator.

Action 2 : Build the Template Report Page.

Also automatically generates the element mappings and the group-by mappings when it automatically generates the template report page.

Page 20: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

The QURSED System - QURSED Editor

XML Schema

Result Page

Rendered view

Page 21: QURSED : Querying and Reporting Semistructured Data Yannis Papakonstantinou, Michalis Petropoulos, and Vasilis Vassalos Data Warehousing Lab. Semester.

LaboratoryData Warehousing

Conclusion

Presented QURSED, a system for the generation of web-based interfaces for querying and reporting semistructured data. Tree Query Language for representing queries Query Set Specification for encoding the large sets of quer

ies QURSED Editor

Future Work The editor will provide a HTML view of the right-hand side

panels. Will increase the power of the query set specification, and

extend the set of queries that can be expressed with TQL.