HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh...

29
HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Transcript of HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh...

Page 1: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

HTML+XML: The W3C HTML/XML TaskForce

Norman WalshMarkLogic Corporation

Page 2: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

• History• Use Cases• Conclusions• Ask for your help

Slide 2 © 2010 MarkLogic Corporation

Agenda

Page 3: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Facts are facts. But any opinions expressed are the opinionsonly of myself and may or may not reflect the opinions ofanybody else with whom I may or may not have discussedthe issues at hand.In particular, I do not claim that what follows is necessarilythe consensus opinion of the HTML/XML Task Force.

Slide 3 © 2010 MarkLogic Corporation

Disclaimer

Page 4: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 4 © 2010 MarkLogic Corporation

History (In the beginning…)

Page 5: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 5 © 2010 MarkLogic Corporation

History (HTML)

Page 6: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 6 © 2010 MarkLogic Corporation

History (XML)

Page 7: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 7 © 2010 MarkLogic Corporation

History (XHTML)

Page 8: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 8 © 2010 MarkLogic Corporation

History (Alternative history 1)

Page 9: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 9 © 2010 MarkLogic Corporation

History (Bzzzt! No! …)

Page 10: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 10 © 2010 MarkLogic Corporation

History (Alternative history 2)

Page 11: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 11 © 2010 MarkLogic Corporation

History (Bzzzt! No! …)

Page 12: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 12 © 2010 MarkLogic Corporation

Present day

Page 13: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Perhaps the biggest challenge that faces theW3C'stechnical work on the Web is the growing chasmbetween HTML and XML

—T. V. Raman

• TAG Issue-67: HTML and XML Divergence>Tag soup>Namespaces>Syntactic differences (quoted attribute values)>DOM differences (tbody insertion)>Distributed extensibility

Slide 13 © 2010 MarkLogic Corporation

What's wrong?

Page 14: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

(Using the HTML5 Live DOM Viewer)

Slide 14 © 2010 MarkLogic Corporation

DOM differences (markup)

Page 15: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 15 © 2010 MarkLogic Corporation

DOM differences (DOM)

Page 16: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

• In practice>SVG in W3C and HTML WG>RDFa in W3C not in HTML WG>FBML not in W3C

• In practice>Namespaces

• In principle

Slide 16 © 2010 MarkLogic Corporation

Distributed extensibility

Page 17: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

• Anne van Kesteren• Robin Berjon• Michael Champion • Yves Lafon (staff contact)

• Noah Mendelsohn• James Clark• John Cowan • Henri Sivonen

•• Norman Walsh (chair)Michael Kay

Slide 17 © 2010 MarkLogic Corporation

W3C TAG HTML/XML Task Force

Page 18: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

How can an XML toolchain be used to consume HTML?• Author HTML5 with polyglot markup.• Add an HTML5 parser to the front end of your toolchain.>Doesn't solve the pernicious “document.write” problem orother script-related problems.

>But short of running a JavaScript engine on the content,nothing is likely to solve those problems.

Slide 18 © 2010 MarkLogic Corporation

Use Case: Consume HTML

Page 19: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

How can an HTML5 toolchain be used to consume XML?• HTML5 tools won't be designed to deal with arbitrary ele-ment names in arbitrary namespaces.

• Transforming to HTML5 is probably the best route.>Even a partial transformation to remove namespaces, PIs, etc.might prove valuable.

• It's probably best not to encourage users to imagine thiswill be broadly successful.

Slide 19 © 2010 MarkLogic Corporation

Use Case: Consume XML

Page 20: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

How can islands of HTML be embedded in XML?• Use the XML serialization of HTML5.• Escape the markup.• Rely on more sophisticated multipart-message handlingsystems.

Regardless, some caremay be necessary. How are the HTMLislands going to be processed? By “clipping” them out andprocessing them with an HTML5 tool, or by passing thewhole DOM to the tool?

Slide 20 © 2010 MarkLogic Corporation

Use Case: Embed HTML

Page 21: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

How can islands of XML be embedded in HTML?• The HTML5 parser interprets unfamiliar markup as an errorand corrects for it.

• Correction can include changing the order and nesting ofelements.

• Practically speaking: you can't embed a “naked” island ofXML in HTML5.

Slide 21 © 2010 MarkLogic Corporation

Use Case: Embed XML

Page 22: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Putting clothes on your XML<script type="application/xml"> <data> <title>Your XML</title> <gpx xmlns="http://www.topografix.com/GPX/1/1"> <wpt lat="50.077484" lon="14.443800"> <ele>200.08</ele> <time>2007-01-06T17:33:04Z</time> <name>001</name> <sym>Restaurant</sym> </wpt> </gpx> </data></script>

Slide 22 © 2010 MarkLogic Corporation

Use Case: Embed XML

Page 23: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Belt and suspenders<script type="application/xml"> &lt;data&gt; &lt;title&gt;Your XML&lt;/title&gt; &lt;gpx xmlns="http://www.topografix.com/GPX/1/1"&gt; &lt;wpt lat="50.077484" lon="14.443800"&gt; &lt;ele&gt;200.08&lt;/ele&gt; &lt;time&gt;2007-01-06T17:33:04Z&lt;/time&gt; &lt;name&gt;001&lt;/name&gt; &lt;sym&gt;Restaurant&lt;/sym&gt; &lt;/wpt&gt; &lt;/gpx&gt; &lt;/data&gt;</script>

Slide 23 © 2010 MarkLogic Corporation

Use Case: Embed XML

Page 24: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Use the HTML5 extensibility mechanisms<div class="data"> <h1 class="title">Your XML</h1> <div class="gpx"> <div class="wpt" data-lat="50.077484" data-lon="14.443800"> <span class="ele">200.08</span> <span class="time">2007-01-06T17:33:04Z</span> <span class="name">001</span> <span class="sym">Restaurant</span> </div> </div></div>

Slide 24 © 2010 MarkLogic Corporation

Use Case: Embed XML

Page 25: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

How can XML be made easier to use?• Guaranteeing that you'll get well-formed XML out of naiveattempts to generate it with “print” statements is tricky.

• Rules could be devised for providing some degree ofmarkup minimization/error correction in XML.

• It's possible to consider other simplifications as well, fornamespaces, for example.

Slide 25 © 2010 MarkLogic Corporation

Use Case: a more forgiving XML

Page 26: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

• Is XForms a use case or a specific solution to the use caseof better form controls?

• Is XFroms different in some substantial way than the gen-eral “embedding XML in HTML” use case?

Slide 26 © 2010 MarkLogic Corporation

Use Case: XForms?

Page 27: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

• Review the Task Force Report• Talk to the communities you know about the use casesthey have.

• Report use cases that you think are not met.

Slide 27 © 2010 MarkLogic Corporation

What you can do

Page 28: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

Slide 28 © 2010 MarkLogic Corporation

Future history?

Page 29: HTML+XML: The W3C HTML/XML Task Force · HTML+XML: The W3C HTML/XML Task Force Norman Walsh MarkLogic Corporation

• Join the mailing list>http://lists.w3.org/Archives/Public/public-html-xml/

• Read the draft report>http://www.w3.org/2010/html-xml/snapshot/report.html

Slide 29 © 2010 MarkLogic Corporation

Thank you!