XML for .NET
description
Transcript of XML for .NET
XML for .NET
Session 1Introduction to XML Introduction to XSLT
Programmatically Reading XML DocumentsIntroduction to XPATH
Introduction to XML XML stands for eXtensible Markup
Language. XML is a way of specifying self-describing
data in a human-readable format. XML’s structure is like that of HTML – it
consists of nested tags, or elements. Elements can contain text data, attributes, and other elements.
XML is ideal for describing hierarchical data.
The Components of XML XML document can be comprised
of the following components: Processing Directives A Root Element (required) Elements
Attributes Comments
An Example XML Document
<?xml version="1.0" encoding="UTF-8" ?><books>
<!-- Each book is represented by a <book> element -->
<book price="34.95"> <title>Teach Yourself ASP 3.0 in 21
Days</title><authors>
<author>Mitchell</author> <author>Atkinson</author>
</authors> <year>1999</year>
</book></books>
An Example XML Document The first line of an XML document is
referred to as a processing directive, and indicates the version of the XML standard being used and the encoding.<?xml version="1.0" encoding="UTF-8" ?>
In general, processing directives have the following syntax:
<?processing directive content ?>
All XML Documents must have precisely one root element. The first element in the XML document is referred to as the root element.
For the books example, the root element is <books>:
<?xml version="1.0" encoding="UTF-8" ?><books> …</books>
An Example XML Document
An Example XML Document XML documents can contain an arbitrary
number of elements within the root element. Elements have a name, can have an arbitrary number of attributes, and can contain either text content or further elements.
The generic syntax of an element is given as:
<elementName attribute1="value1" … attributeN="valueN">text content or further elements
</elementName>
An Example XML Document
For example, in the books XML document, the <book> element contains an attribute named price and three child elements: <title>, <authors>, and <year>.
<book price="34.95"> <title>…</title><authors> …</authors> <year>…</year> </book>
An Example XML Document The <title> and <year> elements
contain text content.
<title>Teach Yourself ASP 3.0 in 21 Days</title>
<year>1999</year>
An Example XML Document The <authors> element contains two
<author> child elements. The <author> elements contain text content, specifying the author(s) of the book:
<authors><author>Mitchell</author> <author>Atkinson</author>
</authors>
An Example XML Document XML comments are delimited by
<!-- and -->. XML comments are ignored by an XML parser and merely make the XML document easier to read for a human.<!-- Each book is represented by a <book> element -->
XML Formatting Rules XML is case sensitive. XML Documents must adhere to
specific formatting rules. An XML document that adheres to such rules is said to be well-formed.
These formatting rules are presented over the next few slides.
Matching Tags XML elements must consist of both an
opening tag and a closing tag. For example, the element <book> has an opening tag - <book> - and must also have a matching closing tag - </book>.
For elements with no contents, you can use the following shorthand notation:<elementName optionalAttributes />
Elements Must be Properly Nested Examples of properly nested elements:
<book price="39.95"> <title>XML and ASP.NET</title></book>
Example of improperly nested elements:<book price="39.95"> <title>XML and ASP.NET</book> </title>
Attribute Values Must Be Quoted Recall that elements may have an
arbitrary number of attributes, which are denoted as:<elementName attribute1="value1" ... attributeN="valueN"> … </elementName>
Note that the attribute values must be delimited by quotation marks. That is, the following is malformed XML:
<book price=39.95> … </book>
Illegal Characters for Text Content Recall that elements can contain text
content or other elements. For example, the <title> element contains text, providing the title of the book: <title>XML for ASP.NET</title>
There are five characters that cannot appear within the text content: <, >, &, ", and '.
Replacing the Illegal Characters with Legal Ones
If you need to use any of those four illegal characters in the text portion of an element, replace them with their equivalent legal value:
Replace This
With This
< <> >& &" "' '
Namespaces Element names in an XML document can be
invented by the XML document’s creator. Due to this naming flexibility, when
computer programs work with different XML documents there may be naming conflicts.
For example, appending two XML documents together that use the same element name for different data representations leads to a naming conflict.
Naming Conflicts For example, if the following two XML
documents were to be merged, there would be a naming conflict:
<bug> <genus>Homoptera</genus> <species>Pseudococcidae</species></bug>
<bug> <date>2003-05-22</date> <description>The button, when clicked, raises a GPF exception.</description></bug>
Solving Name Conflicts with Namespaces A namespace is a unique name
that is “prefixed” to each element name to uniquely identify it.
To create namespaces, use the xmlns attribute:
<namespace-prefix:elementName xmlns:namespace-prefix="namespace">
E.g.:<animal:bug xmlns:animal="http://www.bugs.com/">
Namespaces Note that the xmlns namespace attribute contains
two parts:1. The namespace-prefix and,2. The namespace value.
The namespace-prefix can be any name. The namespace-prefix is then used to prefix element names to indicate that the element belongs to a particular namespace.
The namespace for an element automatically transfers to its children elements, unless explicitly specified otherwise.
Namespaces The namespace value must be a
globally unique string. Typically, URLs are used since they are unique to companies/developers. However, any string value will suffice, so long as it is unique.
The Default Namespace To save typing in the namespace-
prefix for every element, a default namespace can be specified using xmlns without the namespace-prefix part, like:
<elementName xmlns="namespace">
Namespaces in Action In later sections on XSLT and XSD we’ll see
namespaces in use in real-world XML situations…
Note that most of our XML examples will not use namespaces. Always using namespaces is a good habit to get into, but can add confusion/clutter to examples when learning… (Hence their omission where possible.)
Comparing XML to HTML HTML, or HyperText Markup
Language, is the syntax used to describe a Web page’s appearance.
At first glance, there appears to be a lot in common between HTML and XML – their syntax appears similar, they both consist of elements, attributes, and textual content, and so on.
Comparing XML to HTML However, XML and HTML are more different than
they are alike. With XML, you must create your own elements. With
HTML, your elements must come from a predefined set of legal elements (<b>, <p>, <body>, etc.)
XML is used solely to describe data; other technologies must be used if you want to present the XML in a formatted manner. HTML describes primarily presentation. That is, you use the <b> tag to make something bold, not to indicate its meaning.
Comparing XML to HTML XML documents must be well-formed,
while HTML documents are not confined to such rigorous rules.
The following legal HTML is illegal XML (why?):
<p align=center><b>I said, “Hello, World!”
</P></b>
•Tags improperly nested
•Unquoted attribute value
•<p> tag and </P> do not match (due to case-sensitivity)
•Contains illegal characters (") in text
Comparing XML to HTML Realize that XML is not a
replacement to HTML. XML and HTML have different aims – XML was created to describe data, HTML was designed to present data.
Comparing XML to Traditional Databases Imagine that we have information
about books stored in two formats:1. In a traditional database, say Microsoft
SQL Server 2000.2. In an XML document
What advantages does having the data represented as XML provide?
XML’s Benefits Over Traditional Databases XML data is human-readable. A human
glancing at the XML document can determine what data each element and attribute indicates.
XML is self-describing. With a traditional database, you would need to view the table schemas to determine the purpose of the data. With XML, the purpose is reflected by the names of the elements and their nesting.
XML’s Benefits Over Traditional Databases Also, since XML is naturally hierarchical,
determining the relationship among entities is much simpler in an XML document than in a relational database.
XML data is simply text content, meaning the data is platform-neutral. For example, the XML book data could be utilized by a program running on a Linux box, whereas the raw SQL Server 2000 data is only useful to a Windows box running SQL Server 2000.
Creating Our First XML Document Imagine that we wanted to be able to
catalog our music collection. Being interested in learning XML, we decide to create an XML document that stores the information about our music library.
First, we must decide what elements our XML document will consist of…
Creating Our First XML Document All XML documents need a root element.
What should we call our root element? Now, our root element will contain an
arbitrary number of children elements. Each child element will represent a single item in our music library. What should these child elements be named? Should they have any attributes? What elements should they contain?(Diagram XML document on board.)
Creating the XML Document with Visual Studio .NET
Visual Studio .NET makes it easy to create XML files. Go to File > New > File and select the XML File option from the New File dialog box:
Creating the XML Document with Visual Studio .NET This will create a blank XML file
with the <?xml …?> preprocessing directive.
At this point, you can type in the elements for the XML document. Note that after typing in an opening tag, VS.NET automatically adds the corresponding closing tag.
A Peek Ahead: Defining the XML Document’s Schema
Realize that we have yet to formally specify the structure of the XML document. In our discussions we arrived at modeling our music library using a certain set of XML elements, but this particular data model is known only to us.
XML documents can contain schemas that describe their structure. In the next session we’ll discuss XML Schemas in greater detail. For now, just observe that Visual Studio .NET allows us to enter any XML content we want, and we can enter content that violates the schema we verbally agreed on just a few moments ago.
An Hierarchical Way to Think of XML Documents When working with XML documents, it
helps to think of these documents as a hierarchy, where the top of the hierarchy is the root element.
At each “level” in the hierarchy there are elements, attributes, or text content.
This hierarchy-model of XML documents is commonly used in accessing XML documents programmatically, and hence is worth examining.
What is the DOM? DOM stands for Document Object Model,
and it’s a model that can be used to describe an XML document.
The DOM expresses the XML document as a hierarchy of elements, where each element can have zero to many children elements.
The text content and attributes of an element are expressed as its children as well.
Why Discuss the DOM? Being able to think of an XML document
in terms of the DOM helps in understanding how XPath, XSLT, and related concepts work.
Also, later we will examine how to access an XML document programmatically. When working with XML documents this way, the XML documents are presented as a DOM.
Example XML File
<?xml version="1.0" encoding="UTF-8" ?><books>
<book price="34.95"> <title>TYASP 3.0</title><authors>
<author>Mitchell</author> </authors>
</book><book price=“29.95">
<title>ASP.NET Tips</title><authors>
<author>Mitchell</author><author>Walther</author><author>Seven</author>
</authors> </book>
</books>
The DOM View of the XML Document
Understanding the DOM With the DOM model we can think of each XML
element as having an arbitrary number of children elements.
Children element are comprised of both other elements as well as the text content and attributes.
Therefore, to visit every element in an XML document we could start with the root element and visit each of its children. Then, for each child being visited, we can recursively visit all of its children.
If this makes zero sense, don’t worry, we’ll discuss this in more detail later!
Summary In this presentation we examined XML
basics, including the components of XML and the rules for an XML document to be considered well-formed.
We compared XML to HTML and traditional data stores.
We created our first XML document using Visual Studio .NET.
We discussed the Document Object Model.
Questions?