XEM: Managing the Evolution of XML Documents


Authors: Hong Su, Diane Kramer, Li Chen, Kajal Claypool, and Elke A. Rundensteiner

Presented by: Li Shuhong

Fall 2001 CS401

Nov. 20, 2001


Motivation:

• XML has become increasingly popular as the data exchange format on the Web. DTDs play a role similar to that of types in programming languages and schemata in database systems.

• Many systems use a given DTD to construct a fixed relational schema. This schema then serves as the structure into which XML documents conforming to the DTD are loaded.

• Change is a fundamental aspect of persistent information and data-centric systems. Most current XML management systems do not provide enough support for such changes.


Motivating Example of XML Changes

<!ELEMENT article (title, author+, related-work?)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (name)>
<!ATTLIST author id ID #REQUIRED>
<!ELEMENT name (firstname, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT related-work (monograph)*>
<!ELEMENT monograph (title, editor)>
<!ELEMENT editor EMPTY>
<!ATTLIST editor name CDATA #IMPLIED>

<article>
  <title>XML Evolution Manager</title>
  <author id="dk">
    <name>
      <firstname>Diane</firstname>
      <lastname>Kramer</lastname>
    </name>
  </author>
  <author id="er">
    <name>
      <firstname>Elke</firstname>
      <lastname>Rundensteiner</lastname>
    </name>
  </author>
  <related-work>
    <monograph>
      <title>Modern database systems</title>
      <editor name="Won Kim"/>
    </monograph>
  </related-work>
</article>


Example:

• Removal of <editor name="Won Kim"/>

• An XML change support system would need to:

– 1. Verify that the change results in a new valid DTD.

– 2. Change all old XML documents to conform to the changed DTD.

• Result of the example: this change leads to a DTD change, requiring no changes to the underlying XML data.
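The conformance check in step 2 can be illustrated with a minimal sketch. This is not XEM's implementation: it assumes the element content models from the article DTD are transcribed by hand into a Python dict (`OLD_MODELS` and `NEW_MODELS` are hypothetical names), it assumes the DTD change makes `editor` optional, and it checks only child-element order and quantifiers, ignoring attributes and text content.

```python
import re
import xml.etree.ElementTree as ET

# Hand-transcribed content models (assumption: only elements with children are listed).
OLD_MODELS = {
    "article": "(title, author+, related-work?)",
    "author": "(name)",
    "name": "(firstname, lastname)",
    "related-work": "(monograph)*",
    "monograph": "(title, editor)",
}
# Assumed DTD change: editor becomes optional inside monograph.
NEW_MODELS = dict(OLD_MODELS, monograph="(title, editor?)")


def model_to_regex(model):
    """Translate a DTD content model into a regex over 'tag;' tokens."""
    # Each element name becomes a non-capturing group matching "name;".
    tokens = re.sub(r"[A-Za-z][\w.-]*",
                    lambda m: "(?:" + re.escape(m.group()) + ";)", model)
    # A comma means sequence, which is plain concatenation in a regex.
    return re.compile(re.sub(r"[,\s]", "", tokens))


def conforms(elem, models):
    """Return True if elem and all its descendants match their content models."""
    model = models.get(elem.tag)
    if model is not None:
        children = "".join(child.tag + ";" for child in elem)
        if not model_to_regex(model).fullmatch(children):
            return False
    # Elements without a listed model (#PCDATA, EMPTY) are not checked here.
    return all(conforms(child, models) for child in elem)


WITH_EDITOR = ET.fromstring(
    "<article><title>XML Evolution Manager</title>"
    "<author id='dk'><name><firstname>Diane</firstname>"
    "<lastname>Kramer</lastname></name></author>"
    "<related-work><monograph><title>Modern database systems</title>"
    "<editor name='Won Kim'/></monograph></related-work></article>"
)

print(conforms(WITH_EDITOR, OLD_MODELS))
print(conforms(WITH_EDITOR, NEW_MODELS))
```

Under these assumptions, a document that still contains the editor element conforms to both the old and the relaxed content model, which is consistent with the slide's remark that the underlying XML data needs no change.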

Problems with XML Management Systems:

• To make updates, users must be aware of the underlying storage system and of the mapping between XML, the DTD, and the underlying storage model, in order to prevent mismatches between the desired XML transformation and the actual system change.


XML Data Model & the DTD Data Model

• An XML document is composed of nested tagged elements, attributes, and sub-elements. It may have an associated schema, such as a DTD.

• A Document Type Definition (DTD) allows properties or constraints to be defined on elements and attributes.

• A DTD can be modeled as a graph, G = (N, p, l), where N is the set of nodes, p is the parent function representing the edges in the graph, and l is the labeling function that maps each node to a tuple of its properties.
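The graph model G = (N, p, l) can be made concrete with a small sketch covering a fragment of the article DTD. The slides do not say which properties the label tuple holds, so the `Label` fields below (name, content type, quantifier) are an assumption, as is the choice to give each occurrence of `title` its own node id so that p stays a function.

```python
from typing import NamedTuple


class Label(NamedTuple):
    """Assumed property tuple for a node; the slides do not list the fields."""
    name: str        # element name as written in the DTD
    content: str     # "children", "#PCDATA", or "EMPTY"
    quantifier: str  # "", "?", "*", or "+" as seen from the parent


# N: the set of nodes (hypothetical ids; "monograph/title" is title under monograph).
N = {"article", "related-work", "monograph", "monograph/title", "editor"}

# p: the parent function; the root maps to None.
p = {
    "article": None,
    "related-work": "article",
    "monograph": "related-work",
    "monograph/title": "monograph",
    "editor": "monograph",
}

# l: the labeling function, mapping each node to its property tuple.
l = {
    "article": Label("article", "children", ""),
    "related-work": Label("related-work", "children", "?"),
    "monograph": Label("monograph", "children", "*"),
    "monograph/title": Label("title", "#PCDATA", ""),
    "editor": Label("editor", "EMPTY", ""),
}


def children(node):
    """Recover the outgoing edges of a node from the parent function."""
    return sorted(n for n in N if p[n] == node)


def path_to_root(node):
    """Follow p from a node up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = p[node]
    return path


print(children("monograph"))
print(path_to_root("editor"))
```

Storing only the parent function keeps the structure close to the definition on the slide; child lists are derived on demand rather than stored.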


Graph Representation of Article.dtd


Taxonomy and Semantics of XML Change Primitives

• The paper presents a taxonomy of XML change primitives and defines their semantics. The primitives fall into two categories:

– 1. Primitives pertaining to the DTD.

– 2. Primitives pertaining to XML data.

• The primitives have the following characteristics:

– Complete

– Minimal: each primitive is atomic

– Sound: they preserve consistency and integrity

• Example:

– changeQuant(monograph, [1,2], "?");

– destroyDataEl("article/related-work/monograph[1]/editor");
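The two example primitives can be sketched as follows. This is an illustration under stated assumptions, not the XEM API: `change_quant` and `destroy_data_el` are hypothetical Python stand-ins, the DTD is reduced to a dict of (child, quantifier) pairs, and the positional argument `[1,2]` from the slide is replaced by the child's name because the slides do not explain its encoding. The precondition in `destroy_data_el` is one possible reading of the "sound" property: a data change is refused if it would make the document violate the DTD.

```python
import xml.etree.ElementTree as ET


def change_quant(dtd, parent, child, quant):
    """DTD primitive (sketch): set the quantifier of child within parent."""
    if quant not in ("", "?", "*", "+"):
        raise ValueError("unknown quantifier: %r" % quant)
    dtd[parent] = [(c, quant if c == child else q) for c, q in dtd[parent]]


def destroy_data_el(root, path, dtd):
    """Data primitive (sketch): remove the element addressed by path."""
    steps = path.split("/")
    if steps[0] != root.tag or len(steps) < 2:
        raise ValueError("path must start at the root and name a descendant")
    parent = root if len(steps) == 2 else root.find("/".join(steps[1:-1]))
    target = parent.find(steps[-1]) if parent is not None else None
    if target is None:
        raise LookupError("no element at " + path)
    # Conservative soundness check: only optional children may be removed.
    quantifiers = dict(dtd.get(parent.tag, []))
    if quantifiers.get(target.tag, "") not in ("?", "*"):
        raise ValueError("removing a required element would violate the DTD")
    parent.remove(target)


# Hypothetical DTD model: parent element -> ordered (child, quantifier) pairs.
dtd = {"monograph": [("title", ""), ("editor", "")]}
doc = ET.fromstring(
    "<article><related-work><monograph>"
    "<title>Modern database systems</title><editor name='Won Kim'/>"
    "</monograph></related-work></article>"
)

# Mirror the slide's sequence: relax the DTD first, then delete the data element.
change_quant(dtd, "monograph", "editor", "?")
destroy_data_el(doc, "article/related-work/monograph[1]/editor", dtd)
print(ET.tostring(doc, encoding="unicode"))
```

Calling `destroy_data_el` before `change_quant` raises `ValueError` in this sketch, which shows why the DTD primitive comes first in the example.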


Advantages:

• Identify the lack of generic support for change in XML management systems.

• Provide a system to specify changes both at the DTD and XML data level

• Introduce the notion of constraint checking to ensure structural consistency.

• Express desired transformation independent of the underlying storage system.

• Describe a working XML Evolution Management prototype system: MARROW

Disadvantages:

• This system only allows simple, predefined schema evolution operations.