XML as Part of a Total Information Management Strategy for STI Dr. Simon Liu

XML as Part of a Total Information Management Strategy for STI Dr. Simon Liu Director, Information Systems April 30, 2003


Page 1: XML as Part of a Total Information Management Strategy for STI Dr. Simon Liu

XML as Part of a Total Information Management Strategy for STI

Dr. Simon Liu
Director, Information Systems

April 30, 2003

Page 2

Agenda

• Introduction
  – XML Core Standards
  – Domain Specific XML-Based Standards
• XML As An Information Management Strategy
• NLM XML Applications
• Implementation Approach
• Lessons Learned
• Questions & Answers

Page 3

XML Core Standards (I)

[Figure: timeline from 1997 to 2001 of the XML core standards, marking each as Working Draft, Published Recommendation, or Completion Slipped. Standards shown: XML, XLink, XPointer, XLL, XSL, XPath, XSLT, DOM, XML Namespaces, RDF Syntax, XML-Schema, MCF, XML-Data, WebCollections.]

Page 4

XML Core Standards (II)

• Extensible Markup Language (XML) – the foundation specification that defines the character set and rules for constructing XML element names, attributes and structures

• XML Linking Language (XLink) to provide links and link management among content components

• XML Pointer Language (XPointer) to reference content components, which may be identified with XML entities

• Extensible Stylesheet Language (XSL) to associate presentation characteristics (e.g., layout) with XML markup

• XSL Transformations (XSLT) to control views of XML documents and ordering of XML elements

• XML Path Language (XPath) for referencing of both labeled (e.g., <element_name>) and unlabeled content components of XML documents, used by XSLT and XPointer
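
As a minimal sketch of the kind of referencing XPath provides, the snippet below selects content components by element name, by attribute, and by child text. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath; the `citations` document and its element names are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical citation records; element and attribute names are illustrative only.
DOC = """
<citations>
  <citation id="c1"><title>XML in libraries</title><year>2001</year></citation>
  <citation id="c2"><title>Metadata exchange</title><year>2003</year></citation>
</citations>
"""

root = ET.fromstring(DOC)

# Path expression: every <title> that is a child of a <citation>.
titles = [t.text for t in root.findall("./citation/title")]

# Predicate on an attribute: the citation whose id is "c2".
c2 = root.find("./citation[@id='c2']")
c2_year = c2.find("year").text

# Predicate on child text: ids of citations whose <year> is 2003.
from_2003 = [c.get("id") for c in root.findall("./citation[year='2003']")]

print(titles, c2_year, from_2003)
```

A full XPath 1.0 engine (as used inside XSLT processors) accepts the same style of expression plus axes and functions that ElementTree does not implement.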

Page 5

XML Core Standards (III)

• Resource Description Framework (RDF) for metadata exchange among applications — it defines Web resources, their properties and values of those properties

• XML Schema defines XML data structures with data specification and data typing information, something not included in the older document type definitions (DTDs) for XML structures

• XML Namespaces determines the interpretation of specific element and attribute names (i.e., strings) by associating them with referenced dictionaries (namespaces)

• Document Object Model (DOM) is a standard set of programmatic calls — i.e., application programming interfaces (APIs) — for building, navigating, identifying and reading/writing to identifiable components (i.e., elements or attributes) of XML documents (i.e., data structures)
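
The namespace and DOM bullets above can be illustrated with a short sketch using Python's standard `xml.dom.minidom`, one implementation of the W3C DOM API. The `urn:example:...` namespace URIs and the element names are made up for this example; they are assumptions, not part of any real vocabulary.

```python
from xml.dom import minidom

# Hypothetical document: the "urn:example:..." namespace URIs are invented for illustration.
DOC = (
    '<record xmlns:bib="urn:example:bibliography" xmlns:lib="urn:example:library">'
    '<bib:title>Annual Report</bib:title>'
    '<lib:title>Head Librarian</lib:title>'
    '</record>'
)

dom = minidom.parseString(DOC)

# Namespaces: the same local name "title" resolves to two distinct elements.
bib_title = dom.getElementsByTagNameNS("urn:example:bibliography", "title")[0]
lib_title = dom.getElementsByTagNameNS("urn:example:library", "title")[0]

# DOM navigation and reading: element text lives in a child text node.
bib_text = bib_title.firstChild.data
lib_text = lib_title.firstChild.data

# DOM building and writing: create an element, set an attribute, append it.
year = dom.createElement("year")
year.setAttribute("calendar", "gregorian")
year.appendChild(dom.createTextNode("2003"))
dom.documentElement.appendChild(year)

serialized = dom.documentElement.toxml()
print(bib_text, lib_text)
print(serialized)
```

Because the two `title` elements are bound to different namespace URIs, an application can tell a document title from a job title even though the tag's local name is identical.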

Page 6

Domain Specific XML-Based Standards

More than 1,000 domain-specific XML-based standards are currently developed and registered at XML.ORG (OASIS):

• Accounting (14)
• Advertising (6)
• Aerospace (17)
• Arts/Entertainment (24)
• Astronomy (14)
• Automotive (14)
• Banking (10)
• Biology (8)
• Computer (9)
• Construction (8)
• Consulting (20)
• Customer Relation (8)
• Databases (10)
• E-Commerce (60)
• EDI (18)
• Education (51)
• Energy/Utilities (33)
• Financial Service (52)
• Healthcare (23)
• Human Resources (23)
• Internet/Web (35)
• Legal (10)
• Literature (14)
• Manufacturing (8)
• Multimedia (24)
• News (10)
• Publishing/Print (28)
• Real Estate (15)
• Retail (6)
• Science (61)
• Software (124)
• Supply Chain (23)
• Telecommunications (23)
• XML Technologies (232)

Page 7

The Challenge

We are moving to Electronic Business...

Our data (documents) is distributed...

Our users are distributed…

But where is the common denominator?

Experts, Authors, Reviewers, Editors, Publishers

Page 8

A Viable Solution

XML is a viable option to manage the diversity of data, applications and devices of electronic information applications

Experts, Authors, Reviewers, Editors, Publishers

Page 9

XML As An Information Management Strategy

Collecting Electronic Information

Authoring Electronic Information

Storing Electronic Information

Publishing Electronic Information

Exchanging Electronic Information

Retrieving Electronic Information

Page 10

Collecting Electronic Information

[Diagram: Keyboard Contractors, OCR, and Publishers, each with an XML feed, connected through the XML Loader to the MEDLINE Database.]

• XML
• XSL
• XSLT
• XML Schema
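
A minimal sketch of what an XML loader of the kind shown on this slide might do: parse an incoming XML batch and insert its records into a database. Everything specific here is an assumption for illustration: the `batch`/`citation` element names, the `load_batch` helper, and the use of an in-memory SQLite table in place of the MEDLINE Database. Real MEDLINE records have a much richer structure.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical incoming batch; real MEDLINE records use a different, richer structure.
BATCH = """
<batch source="publisher">
  <citation><id>1001</id><title>First article</title></citation>
  <citation><id>1002</id><title>Second article</title></citation>
</batch>
"""

def load_batch(xml_text, conn):
    """Parse one XML batch and insert each citation; returns the number of rows loaded."""
    root = ET.fromstring(xml_text)
    source = root.get("source", "unknown")
    rows = [
        (int(c.findtext("id")), c.findtext("title"), source)
        for c in root.iter("citation")
    ]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO citation (id, title, source) VALUES (?, ?, ?)", rows
        )
    return len(rows)

# Stand-in database: an in-memory SQLite table (an assumption, not the real target).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citation (id INTEGER PRIMARY KEY, title TEXT, source TEXT)")

loaded = load_batch(BATCH, conn)
stored = conn.execute("SELECT id, title, source FROM citation ORDER BY id").fetchall()
print(loaded, stored)
```

Because every supplier delivers the same XML structure, one loader can serve keyboarded, scanned, and publisher-submitted input alike; only the `source` attribute differs in this sketch.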

Page 11

Authoring Electronic Information

XML Loader

• XML
• XSL
• XSLT
• DOM
• XML Schema

Page 12

Storing Electronic Information

[Diagram: content types to be stored, arranged around www.nlm.gov: text documents, existing databases, project data, process descriptions, images, video, audio.]

Page 13

Publishing Electronic Information

[Diagram: the MEDLINE Database, MeSH Database, and Voyager Database, together with the PubMed, Voyager, MeSH, DCMS, and Gateway systems.]

• XML
• XSL
• XSLT
• XML Schema
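
The publishing slide lists XSL and XSLT, whose role is to turn one stored XML record into several presentation formats. The sketch below shows that single-source idea only; it deliberately swaps in plain Python functions for an XSLT stylesheet, because Python's standard library has no XSLT processor. The record structure and the `to_html` and `to_text` helpers are hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical stored record; element names are illustrative, not an NLM schema.
RECORD = (
    "<citation>"
    "<title>Gene maps &amp; markers</title>"
    "<journal>Example Journal</journal>"
    "<year>2003</year>"
    "</citation>"
)

def to_html(citation):
    """Render the record as an HTML list item for a web front end."""
    return "<li><cite>{}</cite>, {} ({})</li>".format(
        html.escape(citation.findtext("title")),
        html.escape(citation.findtext("journal")),
        html.escape(citation.findtext("year")),
    )

def to_text(citation):
    """Render the same record as a plain-text line."""
    return "{}. {}. {}.".format(
        citation.findtext("title"),
        citation.findtext("journal"),
        citation.findtext("year"),
    )

citation = ET.fromstring(RECORD)
html_view = to_html(citation)
text_view = to_text(citation)
print(html_view)
print(text_view)
```

In an XSLT-based pipeline each of these functions would instead be a stylesheet, so new output formats could be added without touching application code.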

Page 14

Exchanging Electronic Information

[Diagram labels: DOCLINE; PubMed; Voyager; Journal Articles; Monographs, Audiovisuals, Serials; MEDLINE Distribution; Publication; SEF; MeSH; DCMS; Gateway; MeSH Distribution.]

• XML
• XSL
• XSLT
• XML Schema

Page 15

Retrieving Electronic Information

[Diagram: remote (logged) users on Mac, laptop, and PC connect over the Internet through the firewall. Platform labels: UNIX, NT Server.]

HTTP Server receives page request from the user. The page request is forwarded to the Accent Server for TTS processing, and the requested page is returned with TTS objects and magnification.

Accent Server

1. Page request is passed.
2. Servlet retrieves requested page.
3. Servlet searches page for narratable content (using the defined construct <a id="npXXX"><p>content</p></a>).
4. Identified narratable content is transcoded and concatenated, audio clips generated and stored.
5. Requested page has magnification controls and any necessary scripting added.
6. Retagged page is then returned to the user.

Auto-Indexing & Transcoding

• Page request
• Acquire meta-data
• Page indexed?
  – NO: Send data to TTS Service
  – YES: QuickFile Index (QFI) retrieved and interpolated with page
• Write data output stream (audio, text, script)

TTS Service

• Data parsed into phonemes
• Dictionary matches phonemes and grammar style
• Voice, gender, rate, audio format are applied
• Audio phonemes are concatenated
• QFI-Transcoder formats/compresses audio, indexes page

• XSL/XSLT
• Voice XML
• DOM
• SOAP
• SALT
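
Step 3 of the Accent Server flow above searches the page for narratable content marked with the construct `<a id="npXXX"><p>content</p></a>`. The following is a minimal sketch of that search using Python's standard `html.parser`; it is not the servlet's actual code. It assumes that `XXX` stands for digits, and the `NarratableFinder` class and sample page are hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumption: narratable anchors carry ids of the form "np" + digits (the slide's "npXXX").
NP_ID = re.compile(r"^np\d+$")

class NarratableFinder(HTMLParser):
    """Collects the text inside each <a id="npNNN">...</a> block."""

    def __init__(self):
        super().__init__()
        self.blocks = {}      # anchor id -> narratable text
        self._current = None  # id of the narratable anchor we are inside, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self._current is None:
            anchor_id = dict(attrs).get("id") or ""
            if NP_ID.match(anchor_id):
                self._current = anchor_id
                self._parts = []

    def handle_data(self, data):
        if self._current is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse runs of whitespace so the text is ready for speech synthesis.
            self.blocks[self._current] = " ".join("".join(self._parts).split())
            self._current = None

# Hypothetical page: two narratable blocks and one ordinary link.
PAGE = """
<html><body>
<a id="np001"><p>Welcome to the library.</p></a>
<a href="/about">About</a>
<a id="np002"><p>Search   MEDLINE
citations.</p></a>
</body></html>
"""

finder = NarratableFinder()
finder.feed(PAGE)
print(finder.blocks)
```

Ordinary links such as the `/about` anchor are skipped because they lack a matching `id`, so only content explicitly marked as narratable would be passed on for transcoding.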

Page 16

Implementation Approach

• Form joint XML committees/working groups

• Provide XML education

• Build an XML community

• Cooperate with partners

• Participate in standards organizations

• Assume leadership

• Start with core XML, then move to domain-specific XML-based standards

• Apply XML to both research and operational projects

Page 17

Lessons Learned

• Take a broad & holistic approach
• Commit for the long haul
• Understand the core standards
• Keep abreast of domain-specific standards development
• Don't do it all at once
• Don't go it alone
• Include security in the process

Page 18

Q&A