Xpath Notes

download Xpath Notes

of 29

Transcript of Xpath Notes

  • 8/8/2019 Xpath Notes

    1/29

    Processing instructions

    Description

    Processing instructions (PIs) allow documents to contain instructions for applications.

    Well-formed documents

    Processing instructions :

    vscht

    Documents with errors

    Processing instruction must end with ?>:

  • 8/8/2019 Xpath Notes

    2/29

    XMLDeclaration

    Description

    XML documents may, and should, begin with an XML declaration which specifies the version ofXML being used.

    Well-formed documents

    XML version specification :

    This document conforms to the XML 1.0 specification.Encoding specification :

    If encoding is not given UTF-8 is assumed

    XPath 1.0 Tutorial

    Back|Forward||Previous|Next

    XPath is described in XPath 1.0 standard. In this tutorial selected XPath features aredemonstrated on many examples.

    Standard excerpt:XPath is the result of an effort to provide a common syntax and semantics for functionality

    shared between XSL Transformations and XPointer. The primary purpose ofXPath is to address

    parts of anXML

    document. In support of this primary purpose, it also provides basic facilitiesfor manipulation of strings, numbers and booleans. XPath uses a compact, non-XML syntax tofacilitate use ofXPath within URIs and XML attribute values. XPath operates on the abstract,

    logical structure of an XML document, rather than its surface syntax. XPath gets its name fromits use of a path notation as in URLs for navigating through the hierarchical structure of an XML

    document.

    List ofXPaths

    XPath as filesystem addressing

    The basicX

    Path syntax is similar to filesystem addressing. If the path starts with the slash / ,then it represents an absolute path to the required element./AAA

    /AAA/CCC/AAA/DDD/BBB

    Start with //If the path starts with // then all elements in the document which fulfill following criteria are

    selected.

  • 8/8/2019 Xpath Notes

    3/29

    //BBB//DDD/BBB

    All elements: *The star * selects all elements located by preceeding path

    /AAA/CCC/DDD/*

    /*/*/*/BBB//*Further conditions inside []

    Expresion in square brackets can further specify an element. A number in the brackets gives theposition of the element in the selected set. The function last() selects the last element in the

    selection./AAA/BBB[1]

    /AAA/BBB[last()]Attributes

    Attributes are specified by @ prefix.//@id

    //BBB[@id]//BBB[@name]

    //BBB[@*]//BBB[not(@*)]

    Attribute valuesValues of attributes can be used as selection criteria. Function normalize-space removes leading

    and trailing spaces and replaces sequences of whitespace characters by a single space.//BBB[@id='b1']

    //BBB[@name='bbb']//BBB[normalize-space(@name)='bbb']

    Nodes countingFunction count() counts the number of selected elements

    //*[count(BBB)=2]//*[count(*)=2]

    //*[count(*)=3]Playing with names of selected elements

    Function name() returns name of the element, the starts-with function returns true if the firstargument string starts with the second argument string, and the contains function returns true if

    the first argument string contains the second argument string.//*[name()='BBB']

    //*[starts-with(name(),'B')]//*[contains(name(),'C')]

    Length of stringThe string-length function returns the number of characters in the string. You must use < as a

    substitute for < and > as a substitute for > .//*[string-length(name()) = 3]

    //*[string-length(name()) < 3]//*[string-length(name()) > 3]

    Combining XPaths with |Several paths can be combined with | separator.

  • 8/8/2019 Xpath Notes

    4/29

    //CCC | //BBB/AAA/EEE | //BBB

    /AAA/EEE | //DDD/CCC | /AAA | //BBBChild axis

    The child axis contains the children of the context node. The child axis is the default axis and it

    can be omitted./AAA/child::AAA

    /AAA/BBB/child::AAA/child::BBB

    /child::AAA/BBBDescendant axis

    The descendant axis contains the descendants of the context node; a descendant is a child or achild of a child and so on; thus the descendant axis never contains attribute or namespace nodes

    /descendant::*/AAA/BBB/descendant::*

    //CCC/descendant::*//CCC/descendant::DDD

    Parent axisThe parent axis contains the parent of the context node, if there is one.

    //DDD/parent::*Ancestor axis

    The ancestor axis contains the ancestors of the context node; the ancestors of the context nodeconsist of the parent of context node and the parent's parent and so on; thus, the ancestor axis

    will always include the root node, unless the context node is the root node./AAA/BBB/DDD/CCC/EEE/ancestor::*

    //FFF/ancestor::*Following-sibling axis

    The following-sibling axis contains all the following siblings of the context node./AAA/BBB/following-sibling::*

    //CCC/following-sibling::*Preceding-sibling axis

    The preceding-sibling axis contains all the preceding siblings of the context node/AAA/XXX/preceding-sibling::*

    //CCC/preceding-sibling::*Following axis

    The following axis contains all nodes in the same document as the context node that are after thecontext node in document order, excluding any descendants and excluding attribute nodes and

    namespace nodes./AAA/XXX/following::*

    //ZZZ/following::*Preceding axis

    The preceding axis contains all nodes in the same document as the context node that are beforethe context node in document order, excluding any ancestors and excluding attribute nodes and

    namespace nodes/AAA/XXX/preceding::*

  • 8/8/2019 Xpath Notes

    5/29

    //GGG/preceding::*Descendant-or-self axis

    The descendant-or-self axis contains the context node and the descendants of the context node/AAA/XXX/descendant-or-self::*

    //CCC/descendant-or-self::*

    Ancestor-or-self axisThe ancestor-or-self axis contains the context node and the ancestors of the context node; thus,the ancestor-or-self axis will always include the root node.

    /AAA/XXX/DDD/EEE/ancestor-or-self::*//GGG/ancestor-or-self::*

    Orthogonal axesThe ancestor, descendant, following, preceding and self axes partition a document (ignoring

    attribute and namespace nodes): they do not overlap and together they contain all the nodes inthe document.

    //GGG/ancestor::*//GGG/descendant::*

    //GGG/following::*//GGG/preceding::*

    //GGG/self::*//GGG/ancestor::* | //GGG/descendant::* | //GGG/following::* | //GGG/preceding::* |

    //GGG/self::*Numeric operations

    The div operator performs floating-point division, the mod operator returns the remainder from atruncating division. The floor function returns the largest (closest to positive infinity) number

    that is not greater than the argument and that is an integer.The ceiling function returns thesmallest (closest to negative infinity) number that is not less than the argument and that is an

    integer.//BBB[position() mod 2 = 0 ]

    //BBB[ position() = floor(last() div 2 + 0.5) or position() = ceiling(last() div 2 + 0.5) ]//CCC[ position() = floor(last() div 2 + 0.5) or position() = ceiling(last() div 2 + 0.5) ]

    XPath as filesystem addressing

    The basic XPath syntax is similar to filesystem addressing. If the path starts with the slash / ,

    then it represents an absolute path to the required element.

    /AAASelect the root element AAA

  • 8/8/2019 Xpath Notes

    6/29

    /AAA/CCC

    Select all elements CCC which are children of the root element AAA

    /AAA/DDD/BBB

    Select all elements BBB which are children of DDD which are children of the root element AAA

    Start with //

    If the path starts with // then all elements in the document which fulfill following criteria areselected.

    //BBB

    Select all elements BBB

  • 8/8/2019 Xpath Notes

    7/29

    //DDD/BBB

    Select all elements BBB which are children of DDD

    All elements: *

    The star * selects all elements located by preceeding path

    /AAA/CCC/DDD/*

    Select all elements enclosed by elements /AAA/CCC/DDD

  • 8/8/2019 Xpath Notes

    8/29

    /*/*/*/BBB

    Select all elements BBB which have 3 ancestors

  • 8/8/2019 Xpath Notes

    9/29

    //*

    Select all elements

    Further conditions inside []

    Expresion in square brackets can further specify an element. A number in the brackets gives theposition of the element in the selected set. The function last() selects the last element in the

    selection.

    /AAA/BBB[1]

  • 8/8/2019 Xpath Notes

    10/29

    Select the first BBB child of element AAA

    /AAA/BBB[last()]

    Select the last BBB child of element AAA

    Attributes

    Attributes are specified by @ prefix.

    //@id

    Select all attributes @id

    //BBB[@id]

    Select BBB elements which have attribute id

  • 8/8/2019 Xpath Notes

    11/29

    //BBB[@name]

    Select BBB elements which have attribute name

    //BBB[@*]

    Select BBB elements which have any attribute

    //BBB[not(@*)]

    Select BBB elements without an attribute

    Attribute values

    Values of attributes can be used as selection criteria. Function normalize-space removes leading

    and trailing spaces and replaces sequences of whitespace characters by a single space.

    //BBB[@id='b1']

    Select BBB elements which have attribute id with value b1

  • 8/8/2019 Xpath Notes

    12/29

    //BBB[@name='bbb']

    Select BBB elements which have attribute name with value 'bbb'

    //BBB[normalize-space(@name)='bbb']

    Select BBB elements which have attribute name with value bbb, leading and trailing spaces are

    removed before comparison

    Nodes counting

    Function count() counts the number of selected elements

    //*[count(BBB)=2]

    Select elements which have two children BBB

  • 8/8/2019 Xpath Notes

    13/29

    //*[count(*)=2]

    Select elements which have 2 children

    //*[count(*)=3]

    Select elements which have 3 children

    Playing with names of selected elements

  • 8/8/2019 Xpath Notes

    14/29

    Function name() returns name of the element, the starts-with function returns true if the first

    argument string starts with the second argument string, and the contains function returns true ifthe first argument string contains the second argument string.

    //*[name()='BBB']Select all elements with name BBB, equivalent with //BBB

    //*[starts-with(name(),'B')]

    Select all elements name of which starts with letter B

    //*[contains(name(),'C')]

  • 8/8/2019 Xpath Notes

    15/29

  • 8/8/2019 Xpath Notes

    16/29

    //*[string-length(name()) > 3]Select elements with name longer than three characters

    Combining XPaths with |

    Several paths can be combined with | separator.

    //CCC | //BBB

    Select all elements CCC and BBB

    /AAA/EEE | //BBB

    Select all elements BBB and elements EEE which are children of root element AAA

  • 8/8/2019 Xpath Notes

    17/29

    /AAA/EEE | //DDD/CCC | /AAA | //BBB

    Number of combinations is not restricted

    Child axis

    The child axis contains the children of the context node. The child axis is the default axis and it

    can be omitted.

    /AAA

    Equivalent of /child::AAA

    /child::AAA

    Equivalent of /AAA

    /AAA/BBB

    Equivalent of /child::AAA/child::BBB

  • 8/8/2019 Xpath Notes

    18/29

    /child::AAA/child::BBB

    Equivalent of /AAA/BBB

    /child::AAA/BBB

    Both possibilities can be combined

    Descendant axis

    The descendant axis contains the descendants of the context node; a descendant is a child or a

    child of a child and so on; thus the descendant axis never contains attribute or namespace nodes

    /descendant::*

    Select all descendants of document root and therefore all elements

  • 8/8/2019 Xpath Notes

    19/29

    /AAA/BBB/descendant::*Select all descendants of /AAA/BBB

    //CCC/descendant::*

    Select all elements which have CCC among its ancestors

  • 8/8/2019 Xpath Notes

    20/29

    //CCC/descendant::DDD

    Select elements DDD which have CCC among its ancestors

    Parent axis

    The parent axis contains the parent of the context node, if there is one.

    //DDD/parent::*

    Select all parents of DDD element

  • 8/8/2019 Xpath Notes

    21/29

    Ancestor axis

    The ancestor axis contains the ancestors of the context node; the ancestors of the context nodeconsist of the parent of context node and the parent's parent and so on; thus, the ancestor axis

    will always include the root node, unless the context node is the root node.

    /AAA/BBB/DDD/CCC/EEE/ancestor::*

    Select all elements given in this absolute path

    //FFF/ancestor::*

  • 8/8/2019 Xpath Notes

    22/29

    Select ancestors of FFF element

    Following-sibling axis

    The following-sibling axis contains all the following siblings of the context node.

    /AAA/BBB/following-sibling::*

  • 8/8/2019 Xpath Notes

    23/29

    //CCC/following-sibling::*

    Preceding-sibling axis

    The preceding-sibling axis contains all the preceding siblings of the context node

    /AAA/XXX/preceding-sibling::*

  • 8/8/2019 Xpath Notes

    24/29

    //CCC/preceding-sibling::*

    Following axis

    The following axis contains all nodes in the same document as the context node that are after thecontext node in document order, excluding any descendants and excluding attribute nodes and

    namespace nodes.

    /AAA/XXX/following::*

  • 8/8/2019 Xpath Notes

    25/29

    Preceding axis

    The preceding axis contains all nodes in the same document as the context node that are beforethe context node in document order, excluding any ancestors and excluding attribute nodes and

    namespace nodes

    /AAA/XXX/preceding::*

  • 8/8/2019 Xpath Notes

    26/29

    Descendant-or-self axis

    The descendant-or-self axis contains the context node and the descendants of the context node

    /AAA/XXX/descendant-or-self::*

  • 8/8/2019 Xpath Notes

    27/29

    Ancestor-or-self axis

    The ancestor-or-self axis contains the context node and the ancestors of the context node; thus,the ancestor-or-self axis will always include the root node.

    /AAA/XXX/DDD/EEE/ancestor-or-self::*

    Orthogonal axes

    The ancestor, descendant, following, preceding and self axes partition a document (ignoring

    attribute and namespace nodes): they do not overlap and together they contain all the nodes inthe document.

  • 8/8/2019 Xpath Notes

    28/29

    //GGG/ancestor::*

    Numeric operations

    The div operator performs floating-point division, the mod operator returns the remainder from atruncating division. The floor function returns the largest (closest to positive infinity) number

    that is not greater than the argument and that is an integer.The ceiling function returns thesmallest (closest to negative infinity) number that is not less than the argument and that is an

    integer.

    //BBB[position() mod 2 = 0 ]

    Select even BBB elements

  • 8/8/2019 Xpath Notes

    29/29

    //BBB[ position() = floor(last() div 2 + 0.5) or position() = ceiling(last() div 2 + 0.5) ]

    Select middle BBB element(s)

    //CCC[ position() = floor(last() div 2 + 0.5) or position() = ceiling(last() div 2 + 0.5) ]Select middle CCC element(s)