Keeping Chess Alive – Do we need 1-unambiguous content models? Murali Mani, UCLA/CSD Extreme...

Post on 16-Dec-2015

Keeping Chess Alive – Do we need 1-unambiguous content models?

Murali Mani, UCLA/CSD

Extreme Markup Languages 2001

Montreal, Canada

Outline of the talk

Why is 1-unambiguity important?

Formalize a few concepts and learn
– There exist regular languages that are inherently not 1-unambiguous.

We do not need 1-unambiguity
– No additional benefit
– Difficulty for document processing (type inference)

Why is 1-unambiguity important?

XML 1.0 specification

[3.2.1] – “It is an error if an element in the document can match more than one occurrence of an element type in the content model”.

[App E] – “The content model (b, c) | (b, d) is in error and may be reported as an error.”

Why is 1-unambiguity important? (contd…)

XML Schema [3.8.6] – Schema Component Constraint: Unique Particle Attribution

“A content model must be formed such that during validation of an element information item sequence, the particle with which we attempt to validate each item in the sequence can be uniquely determined without examining the content or attributes of that element, and without any information about items in the remainder of the sequence”

Concepts

Regular expression – ‘,’, ‘|’, ‘*’

(a | b)*, c

Model group – other operators also – ‘+’, ‘?’, ‘&’

a?, (b | c)* = (a, (b | c)*) | (b | c)*

Every regular expression is a model group.

Every model group can be expressed as a regular expression.
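The equivalence a?, (b | c)* = (a, (b | c)*) | (b | c)* can be spot-checked mechanically. The following is a minimal sketch, assuming single-letter element names so that a content model maps directly onto Python's regex syntax; the helper names `to_regex` and `same_language` are illustrative, and the check is a bounded enumeration rather than a proof. The ‘&’ operator is not covered.

```python
import re
from itertools import product

def to_regex(model):
    """Turn a content model over single-letter names into Python regex syntax.

    Assumption: ',' is plain concatenation and names are one letter long,
    so dropping commas and spaces is enough.
    """
    return model.replace(",", "").replace(" ", "")

def same_language(m1, m2, alphabet="abc", max_len=5):
    """Compare two content models on every string up to max_len letters."""
    r1 = re.compile(to_regex(m1))
    r2 = re.compile(to_regex(m2))
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if bool(r1.fullmatch(s)) != bool(r2.fullmatch(s)):
                return False
    return True

if __name__ == "__main__":
    print(same_language("a?, (b | c)*", "(a, (b | c)*) | (b | c)*"))
    print(same_language("a?, (b | c)*", "(a | b)*, c"))
```

The second call compares the optional-a model with the unrelated model (a | b)*, c, which already differs on the empty sequence.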

1-unambiguous content models

Ambiguity in Graphs and Expressions – Book, Even, Greibach, Ott, 1971
– Given a regular expression, E, is E ambiguous?
– For example, (a | (a, b*)) is ambiguous

Deterministic Regular Languages – Anne Brüggemann-Klein, 1991
– Studied 1-unambiguity in SGML content models

1-unambiguous content models (contd…)

Reasoning about XML Schema Languages using Formal Language Theory – Dongwon Lee, Murali Mani, Makoto Murata, 2000
– Content models without the 1-unambiguous constraint

http://www.oasis-open.org/cover/topics.html#ambiguity

Example content model -- (whitemove, blackmove)*, whitemove?
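Whether an expression is 1-unambiguous can be tested with the position-automaton (Glushkov) construction: the expression is 1-unambiguous when no two distinct occurrences of the same element name compete, either at the start or after any given occurrence. The sketch below is one way to code that test; the tuple-based AST and the function names are assumptions made for illustration, and ‘+’ and ‘&’ are left out.

```python
from itertools import count

_ids = count(1)

def sym(name):              # one occurrence of an element name (a "position")
    return ("sym", name, next(_ids))
def seq(a, b): return ("seq", a, b)     # a, b
def alt(a, b): return ("alt", a, b)     # a | b
def star(a):   return ("star", a)       # a*
def opt(a):    return ("opt", a)        # a?

def analyse(e, follow):
    """Return (nullable, first, last); fill follow[position] as a side effect."""
    kind = e[0]
    if kind == "sym":
        p = (e[1], e[2])
        follow.setdefault(p, set())
        return False, {p}, {p}
    if kind in ("seq", "alt"):
        n1, f1, l1 = analyse(e[1], follow)
        n2, f2, l2 = analyse(e[2], follow)
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:                    # whatever ends the left part can be
            follow[p] |= f2             # followed by what starts the right part
        return (n1 and n2,
                f1 | f2 if n1 else f1,
                l1 | l2 if n2 else l2)
    n, f, l = analyse(e[1], follow)     # star or opt
    if kind == "star":
        for p in l:                     # a star can loop back to its own start
            follow[p] |= f
    return True, f, l

def deterministic(positions):
    """True if no element name occurs at two different positions in the set."""
    names = [name for name, _ in positions]
    return len(names) == len(set(names))

def is_one_unambiguous(e):
    follow = {}
    _, first, _ = analyse(e, follow)
    return deterministic(first) and all(deterministic(s) for s in follow.values())

if __name__ == "__main__":
    chess = seq(star(seq(sym("whitemove"), sym("blackmove"))), opt(sym("whitemove")))
    print(is_one_unambiguous(chess))
    print(is_one_unambiguous(alt(seq(sym("b"), sym("c")), seq(sym("b"), sym("d")))))
    print(is_one_unambiguous(seq(sym("b"), alt(sym("c"), sym("d")))))
```

For the chess model, both occurrences of whitemove can start the sequence, so the test reports that the expression as written is not 1-unambiguous. The same holds for (b, c) | (b, d), while the rewritten form b, (c | d) passes.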

Type assignment

Type assignment (contd…)

Assumption – If the type of an element can be determined by a SAX parser on seeing the start element tag, it is sufficient.

DTDs and XML-Schema have the above property even without the 1-unambiguity constraint.
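The assumption can be illustrated with a small SAX handler. This is only a sketch under a simplifying assumption: a DTD-like schema in which the element name alone selects the type, so the type is known as soon as the start tag arrives. The `CONTENT_MODELS` table and the `game` element are hypothetical and exist only for this example.

```python
import xml.sax

# Hypothetical DTD-style schema: element name -> content model (its "type").
CONTENT_MODELS = {
    "game": "(whitemove, blackmove)*, whitemove?",
    "whitemove": "#PCDATA",
    "blackmove": "#PCDATA",
}

class TypeAssigner(xml.sax.ContentHandler):
    """Records a type for each element at the moment its start tag is seen."""

    def __init__(self):
        super().__init__()
        self.assigned = []

    def startElement(self, name, attrs):
        # No lookahead and no content inspection: the name is enough here.
        self.assigned.append((name, CONTENT_MODELS[name]))

def assign_types(document):
    handler = TypeAssigner()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.assigned

if __name__ == "__main__":
    doc = "<game><whitemove>e4</whitemove><blackmove>e5</blackmove></game>"
    for name, model in assign_types(doc):
        print(name, "->", model)
```

Nothing in the handler depends on whether the content model of `game` is 1-unambiguous; checking the children against that model is a separate step.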

Disadvantages of having the 1-unambiguity constraint

Significant loss in ability to describe constraints – the game of chess might be described as (whitemove | blackmove)*

We lost the following constraints
– whitemove and blackmove alternate
– We start with a whitemove

Shall we stick to the chess rules?
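The loss can be made concrete by running both models over a few move sequences. In this sketch, whitemove and blackmove are abbreviated to the letters w and b (an encoding chosen only for this example) so that Python's `re` module can stand in for a content-model validator.

```python
import re

STRICT = re.compile(r"(wb)*w?")    # (whitemove, blackmove)*, whitemove?
RELAXED = re.compile(r"(w|b)*")    # (whitemove | blackmove)*

def accepts(model, moves):
    """True if the whole move sequence matches the content model."""
    return model.fullmatch(moves) is not None

if __name__ == "__main__":
    for moves in ["", "w", "wbw", "bw", "ww", "wbb"]:
        print(repr(moves), accepts(STRICT, moves), accepts(RELAXED, moves))
```

Sequences such as "bw" (black moves first) and "ww" (white moves twice in a row) are rejected by the original model but accepted by the relaxed one.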

Disadvantages of having the 1-unambiguity constraint (contd…)

Difficult for document processing and type inference
– No characterization of 1-unambiguous model groups
– Fewer constraints => less algebraic optimization is possible.

Conclusions

One class of schema languages identified by the property – the type of an element can be determined by a depth first traversal (SAX parser) on seeing the start element tag

Such schema languages do not need the 1-unambiguity constraint.

1-unambiguity constraint is difficult to work with for type inference, and for playing chess.

Acknowledgements

XML-DEV mailing list – the discussions in this list largely motivated this talk.

Additional material at this conference

Taxonomy of XML Schema languages using Formal Language Theory – Aug 15, 4:00 pm

RELAX NG: Unification of RELAX Core and TREX – Aug 17, 9:00 am