1 Standards In A Digital World: Z39.50, HTML, Java: Do They Really Work? Brian Kelly UK Web Focus...
1
Standards In A Digital World: Z39.50, HTML, Java:
Do They Really Work?
Brian Kelly
UK Web Focus
UKOLN
University of Bath
http://www.ukoln.ac.uk/
2
Contents
• Introduction
• HTML
• Initial Roadmap / The Diversion / Back on Course
• W3C Standardisation Process
• Rivals to HTML
• PDF
• Viewers
• Scripting
• Client-side Scripting Languages
• Server side Scripting
• Distributed Searching
• Z39.50
• Other Protocols
• Conclusions
3
UK Web Focus
UK Web Focus:
• National web coordination post for UK HE community
• Based at UKOLN, University of Bath
• Responsibilities include:
– Technology watch
– Information dissemination in a variety of ways:
– Workshops (national, regional)
– Presentations at conferences and seminars
– Online
– Coordination activities
– Representing JISC on W3C
• Brian Kelly appointed on 1st November 1996
– Involved with web since January 1993
– Previously worked at University of Newcastle, Leeds, Liverpool, and Loughborough
4
The Question
Where do you stand?
The success of the Web is based on competition in the marketplace. Just look at the benefits provided by competition between Netscape and Microsoft.
The success of the Web is based on building on open, non-proprietary standards. Use of proprietary systems has increased costs for the user, and resulted in flawed systems.
5
HTML Roadmap
HTML 1.0 Gets things started
HTML 2.0 CERN / NCSA partnership introduces NCSA Mosaic with support for forms and inline images
HTML + Proposal for enhancements including improved layout control (e.g. tables), maths, etc.
Style Sheets Mechanism for defining appearance
Structure separate from appearance
Various proposals (DSSSL, CSS, …)
6
HTML History
HTML 1.0 Unpublished specification. DTD developed by Tim Berners-Lee (CERN).
HTML 2.0 Spec. based on innovations from NCSA (forms and inline images!)
HTML 3.0 Proposed spec. (renamed from HTML+). Very comprehensive. Failed to complete IETF standardisation process. Little implementation experience.
HTML 3.2 Spec. based on description of mainstream innovations in marketplace
HTML 4.0 Current proposal.
7
HTML Wars
October 1994 Netscape released (Mosaic Communications Corporation). Quality browser, but supported proprietary tags (<BLINK>, <FONT>, etc.)
1995 New versions of Netscape released, supporting additional proprietary tags (<SPACER>, <LAYER>, etc.)
1996 Microsoft respond to competition with their own proprietary tags (<MARQUEE>, etc.)
8
HTML Wars - The Problems
Device Dependency
• Resources are dependent on a particular browser
• Platform dependency
Costs
• Costs in supporting authoring tool
• Potential costs in re-engineering
Architecture
• Proprietary innovations have been flawed:
– Merging content and appearance
– Maintenance of resources
• Accessibility problems:
– Poor support for access by disabled (e.g. speaking browsers for visually impaired)
9
End of the Wars?
Microsoft Pledge on HTML Standards
"HTML is the most basic and fundamental data format of the Web.
Support for HTML standards ensures that content can be viewed by any browser as the creator intended.
…. agreement on the most basic data format is critical to interoperability and the continued growth of the industry."
Thursday, August 21 1996
See http://www.microsoft.com/internet/html.htm
10
Microsoft Pledge (Cont.)
"Previous proprietary HTML extensions from Microsoft and other vendors have confused the market, hampered interoperability and been ill-conceived with respect to [HTML] design principles ...
Microsoft will agree to:
• Not ship extensions to HTML without first submitting them to W3C.
• Implement all W3C approved HTML standards.
• Clearly identify any not-yet-approved HTML tags we support as such.
• Publish a Document Type Definition (DTD) for its browser as mandated by SGML.
• Follow the architecture principles of HTML and its parent, SGML, when proposing new extensions.
Microsoft agrees to hold itself to these standards. Will all the other Web browser vendors, including Netscape, also agree to this conduct of behavior?"
11
HTML 4.0 and CSS
HTML 4.0 and CSS will provide an architecturally pure, yet functionally rich environment
HTML 4.0
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing
CSS
• Support for all HTML formatting
• Positioning of HTML elements
• Support for multiple media
Problems
Some problems with CSS are being experienced following:
• Use of CSS features which changed during CSS development
• Browser supported features which changed
12
W3C Process
W3C:
• A consortium of subscribing member organisations
• Areas of work agreed by members
• Working group set up:
– Charter
– WG membership (restricted)
• Initial recommendations produced by WG
• Recommendation made public
• Feedback on open mailing lists and to editor
• Recommendation updated
• Members vote
User Interface:
• HTML
• Style Sheets
• Document Object Model
• Maths
• Graphics
• Fonts
13
W3C Process
Pros
• Work can be well-focussed
• Avoids "flaming"
• Battle can take place in private
• Implementation and development of spec closely linked
Cons
• Discussions are closed
• Process undemocratic
• Only rich companies can afford to take part
• Difficult for non-members to contribute their expertise
• Non-members may be developing systems in isolation
14
HTML - The Competition
What are the alternatives to HTML?
HTML
• An SGML DTD
• Describes document structure
• Used in conjunction with emerging style sheet proposal
• Agreements on standards emerging
PDF
• Adobe's Portable Document Format
• Provides control over appearance
• Proprietary
Native file format
• Store document in native format, and provide user with reader on client machine
SGML / XML
• Richer DTDs
15
PDF Pros
• Control over appearance not (yet) easily available in HTML
• Functionality of PDF Reader can be controlled (e.g. prevent copying, printing with watermarks)
PDF Cons
• Does not store document structure
• Proprietary
– How would we feel about it if it were owned by Microsoft?
– Remember GIF patent problems!
• Printing problems
16
Use of Native File Format
Files can be stored in their native file format (Word, Powerpoint, LaTeX, DVI, etc.)
Files may then be viewed using the application or a viewer which understands the format
Pros:
• No conversion needed
Cons:
• Viewing software needed
• Format version issues
• Indexing issues
• Viruses
• Proprietary
17
XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc.)
• Eliminates problems encountered in extending HTML:
– Extension by fiat e.g. <FONT>
– Public experiments e.g. the <BLINK> tag
– The standards process e.g. Maths
• Agreement achieved quickly
• Support from industry (SGML vendors, Microsoft, etc.)
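As a minimal sketch of what "arbitrary elements" means in practice, the snippet below reads a small XML record with Python's standard xml.etree.ElementTree module. Only the STUDENT-NUMBER and PART-NO element names come from the slide; the RECORD wrapper, the NAME element and all the values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: only STUDENT-NUMBER and PART-NO are named on the slide;
# the RECORD wrapper, the NAME element and every value are assumptions.
SAMPLE = """<RECORD>
  <NAME>A. Student</NAME>
  <STUDENT-NUMBER>9612345</STUDENT-NUMBER>
  <PART-NO>XJ-42</PART-NO>
</RECORD>"""


def read_record(xml_text):
    """Return a dict mapping each child element name to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


record = read_record(SAMPLE)
print(record["STUDENT-NUMBER"])
```

The parser needs no prior knowledge of the element names, which is the point: new vocabulary can be introduced without changing the language or the tools.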
18
XML Support
Microsoft have expressed support for XML:
"Internet Explorer version 4.0 will support a few XML applications (such as CDF). Microsoft will be supporting XML in future versions of Internet Explorer"
See http://www.microsoft.com/standards/xml-f.htm
Note how they will be supporting an ISO standard!
19
Metadata
Metadata - the missing architectural component from the initial implementation of the web
Addressing: URL
Data format: HTML
Transport: HTTP
Metadata: PICS, TCN, MCF, DSig, DC, ...
20
Metadata Requirements
Imagine a university prospectus on the web
Requirement                                              Protocol
Available in Middle East, where porn filters in use      PICS (rating system)
Resource discovery (find "Bath prospectus")              Dublin Core
Legally binding assertion                                Digital Signature (DSig)
Delivered in appropriate format (HTML, PDF)              Transparent Content Negotiation
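The last row, delivering the prospectus in an appropriate format, can be illustrated with a rough Python sketch. Note that this is ordinary matching against an HTTP Accept header, not the Transparent Content Negotiation protocol itself; the function names and the list of variants are hypothetical, and the matching ignores the specificity rules a real server would apply.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the client gives no quality value
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    return prefs


def choose_variant(accept_header, available):
    """Pick the available media type the client rates highest, or None."""
    best, best_q = None, 0.0
    for media_type, q in parse_accept(accept_header):
        for candidate in available:
            matches = (media_type == candidate or media_type == "*/*"
                       or (media_type.endswith("/*")
                           and candidate.startswith(media_type[:-1])))
            if matches and q > best_q:
                best, best_q = candidate, q
    return best


# Hypothetical variants of the prospectus held by the server.
VARIANTS = ["text/html", "application/pdf"]
print(choose_variant("application/pdf;q=0.9, text/html;q=0.5", VARIANTS))
```

A client that prefers PDF gets the PDF variant, one that accepts any text type gets HTML, and one that accepts neither gets nothing.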
21
Metadata Standards
PICS
• Agreement within industry (US Communications Decency Act perceived as threat)
• Format moving to XML in PICS/NG
Dublin Core
• Pressure from library community results in changes to HTML 4
• Format likely to move to XML
Digital Signatures
• Based on PICS/NG
W3C to set up a Metadata Coordination Group
22
Other XML Developments
XML seems to be gaining momentum:
PICS Moving from rating system to key part of metadata architecture
CDF Channel Definition Format. Microsoft proposal for push technology
OPS Open Profiling Specification. Microsoft proposal
XML Web Collections. Microsoft proposal for defining relationships between resources.
MCF using XML. Netscape proposal for describing metadata for collections of resources using XML
CML Chemical Markup Language
MML Math Markup Language
23
Scripting
Background:
• Netscape's JavaScript (renamed from LiveScript) was the first widely deployed scripting language
• Problems with inter-working between different versions
• Problems with inter-working across browsers (Microsoft and JScript)
• Problems with use of multiple scripting languages in a document
24
Scripting
Developments:
• Javascript handed to standards body (ECMA)
See http://www.ecma.ch/memento/tc39.htm
• W3C developing standards for integrating scripting languages with HTML
See http://www.w3.org/TR/WD-script
• W3C working on Document Object Model (DOM) " .. a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents."
See http://www.w3.org/MarkUp/DOM/
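To give a feel for what "access and update the content, structure and style" looks like from a program, here is a small sketch using Python's standard xml.dom.minidom module. That module implements a later, published version of the DOM interfaces than the work in progress described above, and the page fragment is invented for illustration.

```python
from xml.dom import minidom

# Hypothetical page fragment; kept well-formed so an XML parser accepts it.
PAGE = '<html><body><h1 id="title">Old title</h1><p>Body text</p></body></html>'

doc = minidom.parseString(PAGE)

# Access: locate the heading through the document tree.
heading = doc.getElementsByTagName("h1")[0]

# Update content: change the heading's text node.
heading.firstChild.data = "New title"

# Update structure: append a new paragraph to the body.
body = doc.getElementsByTagName("body")[0]
para = doc.createElement("p")
para.appendChild(doc.createTextNode("Added by script"))
body.appendChild(para)

# Update style: set a style attribute on an existing element.
heading.setAttribute("style", "color: red")

result = doc.documentElement.toxml()
print(result)
```

The same calls (getElementsByTagName, createElement, appendChild, setAttribute) are available to scripts in a browser, which is what "language-neutral" is meant to deliver.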
25
Java
Java:
• Development begun by Sun in early 1990s (known as Oak)
• Moved to Web and released in 1995
• Programming language and virtual machine environment (provides portability and security)
• See http://java.sun.com/
26
Java Applications
Java is gaining momentum:
• Interactive applications
• Enhanced user interfaces
• Replacing conventional desktop applications
• Extending browsers
http://www.mini.co.uk/
27
Java Standardisation
Java developments:
• Sun submitting Java to standards body (ISO/IEC JTC1)
• Concerns over process ("Microsoft believes that .. that Sun wishes to retain full ownership and control over its Java specifications ..")
• See http://java.sun.com/aboutJava/standardization/index.html
28
Distributed Searching - The Problem
End users face difficulties due to the wide variety of search interfaces available
29
Possible Solutions
Agree to use the same software
• Unlikely to happen
• Undesirable
Agree to implement similar interfaces
• Probably not feasible
Have a centralised database
• Scaling problems
Use software which implements a protocol designed to provide a common search interface across diverse services
• e.g. Z39.50
30
An Applications Solution
Metacrawler can be used to search several large search engines.
Problems:
• Breaks if APIs change
• Centralised system
http://www.metacrawler.com/
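The applications approach can be sketched in Python as follows. This is not how Metacrawler is implemented; the two "engines" are stand-in functions with invented result formats, used only to show why such a system is fragile: every engine needs its own adapter, and an adapter stops working as soon as that engine changes its output.

```python
# Each "engine" is a hypothetical stand-in returning results in its own format.
def engine_a(query):
    return [{"url": "http://example.org/a", "title": "A: " + query},
            {"url": "http://example.org/shared", "title": "Shared"}]


def engine_b(query):
    # A different result format: (link, heading) tuples.
    return [("http://example.org/shared", "Shared"),
            ("http://example.org/b", "B: " + query)]


# One adapter per engine converts its native format to (url, title).
ADAPTERS = [
    (engine_a, lambda hit: (hit["url"], hit["title"])),
    (engine_b, lambda hit: (hit[0], hit[1])),
]


def metasearch(query):
    """Query every engine, normalise the hits, and drop duplicate URLs."""
    seen, merged = set(), []
    for search, adapt in ADAPTERS:
        try:
            hits = [adapt(hit) for hit in search(query)]
        except (KeyError, IndexError, TypeError):
            # The engine changed its format: skip it rather than fail outright.
            continue
        for url, title in hits:
            if url not in seen:
                seen.add(url)
                merged.append((url, title))
    return merged


results = metasearch("Z39.50")
for url, title in results:
    print(url, title)
```

A shared search protocol removes the per-engine adapters: the client speaks one protocol and each service is responsible for implementing it.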
31
Z39.50 - What Is It?
Z39.50:
• A protocol which specifies data structures and interchange rules that allow a client machine to search databases on a server machine and retrieve records that are identified as a result of the search
• Maintained by Library of Congress
• Developed by ZIG
Why is it important?
• Powerful searching
• Local, familiar interface
• Retrieves structured data
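The search-then-retrieve pattern described above can be modelled with a toy in-memory example in Python. This is only an illustration of the interaction, not the Z39.50 wire protocol or any real toolkit's API: the class, method names and records are all hypothetical. The one idea borrowed from the protocol is that a search leaves a result set on the server and reports a hit count, and the client then asks for records from that set.

```python
class ToyServer:
    """In-memory stand-in for a database server (not a real Z39.50 target)."""

    def __init__(self, records):
        self.records = records
        self.result_sets = {}

    def search(self, set_name, field, term):
        """Run a query, keep the hits server-side, return only the hit count."""
        hits = [r for r in self.records
                if term.lower() in r.get(field, "").lower()]
        self.result_sets[set_name] = hits
        return len(hits)

    def present(self, set_name, start, count):
        """Return `count` records from a stored result set; `start` is 1-based."""
        return self.result_sets[set_name][start - 1:start - 1 + count]


# Hypothetical structured records held by the server.
server = ToyServer([
    {"title": "Networking standards", "author": "Smith"},
    {"title": "Web standards in practice", "author": "Jones"},
    {"title": "Cataloguing basics", "author": "Brown"},
])

hit_count = server.search("default", "title", "standards")
first = server.present("default", 1, 1)
print(hit_count, first)
```

Because the records come back as structured data rather than a formatted page, the client is free to display them in its own local, familiar interface.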
32
Z39.50 History
Z39.50 (1988)
• NISO work with roots in OSI work
• "an unimplementable abomination which should never have been adopted"
• "Inspired" WAIS (which was not interoperable)
Z39.50 (1992)
• Implementation experience
• OSI now regarded as failure
Z39.50 (version 3)
• Accepted as ISO standard in 1996 (ISO 23950)
• Implemented using TCP/IP
• Toolkits, profiles, etc. now available
Taken from Clifford Lynch's article at http://hosted.ukoln.ac.uk/mirrored/lis-journals/dlib/dlib/dlib/april97/04contents.html
33
Z39.50 Pilot
UKOLN is piloting Z39.50 across a number of services (UKOLN web site, BUBL, eLib project database, ...)
Imagine searching across JISC services (and institutions):
Find the chemical XML browser, and relevant reviews & papers.
Search HENSA software archive, Mailbase lists, a Chemistry gateway and Imperial College web site
34
Related Protocols
LDAP Lightweight Directory Access Protocol
Derived from X.500 directory service
See "Lightweight Directory Access Protocol" http://ds.internic.net/rfc/rfc1777.txt
See also http://www.novell.com/products/nds/ldap.html
http://www.critical-angle.com/ldapworld/Welcome.html
whois++ Derived from whois protocol for finding people (IETF)
See "Architecture of the Whois++ Index Service" at the URL http://ds.internic.net/rfc/rfc1913.txt
35
What The Software Companies Say
Netscape (see http://search.netscape.com/newsref/std/standards_qa.html)
• [We will] aggressively support open standards wherever they exist
• Work within the open standards process to innovate valuable new functionality in ways that promote openness and interoperability.
• All current Netscape products implement and support the existing open standards appropriate to their functionality.
Microsoft (see http://premium.microsoft.com/msdn/library/sdkdoc/inetcsdk_2htc.htm)
• Microsoft is fully committed to the HTML standards articulated by the World Wide Web Consortium (W3C) and the international Internet community.
36
Caveat Emptor!
Beware of free software - it can be expensive!
Remember Your Music Collection?
7" single   Your favourite single
12" LP      The album containing the hit
12" LP      Greatest hits
CD          When you bought your CD
Record companies are happy to sell you the same information in several formats!
Is The Same True Of Your Information Systems?
Home-grown
Gopher      The hit of 1992
WWW         The HTML 2 version
WWW (2)     Revamped, based on Netscapeisms
WWW (3)     Revamped, based on HTML 4 and CSS
WWW (4)     ??
Microsoft and Netscape will be happy to sell you tools to manipulate the same information!
37
Conclusions
• Without standards, costs are liable to escalate
• Software companies are happy to take our money
• OSI networking standard gave standardisation process a bad name
• Current IETF / W3C process of developing standards and gaining implementation experience is valuable
• Standards are not frozen
• The difficult choice may be "What standard?"
38
Further Information
List of Standards Bodies
http://www.yahoo.com/Reference/Standards/
http://www.iso.ch/VL/Standards.html
http://www.cmpcmm.com/cc/standards.html
World Wide Web Consortium
http://www.w3.org/
IETF
http://www.ietf.cnri.reston.va.us/home.html
http://info.isoc.org/home.html
ISO
http://www.iso.ch/welcome.html
ECMA
http://www.ecma.ch/
ISO-HTML
ftp://ftp.cs.tcd.ie/isohtml/
Microsoft and Standards
http://www.microsoft.com/standards/
Netscape and Standards
http://search.netscape.com/newsref/std/standards_qa.html
39
On Julius Caesar, Queen Eanfleda, and the lessons from time past
1 Dual standards rather than a single standard cause trouble.
2 If you must have dual standards, specify mandatory conversions or interfaces between them.
3 Never leave anything implementation-dependent.
4 If irregularities are unavoidable in a standard (e.g. because of external constraints), put them where they will do the least damage.
5 Never alter standards to please the rich and powerful, unless the changes can be justified on firm technical grounds.
6 Even the most rich and powerful can be persuaded that they will benefit from changing from their local standard to a general one.
7 The most effective standards are those you take so for granted you don't have to think about them.
8 If provisions of standards are based on external assumptions or constraints unrelated to the purpose of the standard, they are likely to appear irrational.
http://www.kcl.ac.uk/kis/support/cc/staff/brian/caesar.html