XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  ·...

19
XPath and XQuery Patryk Czarnik Institute of Informatics University of Warsaw XML and Modern Techniques of Content Management – 2010/11 Introduction Status XPath Data Model XPath language Basic constructs XPath 2.0 extras Paths XQuery XQuery query structure Constructors Functions

Transcript of XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  ·...

Page 1: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

XPath and XQuery

Patryk Czarnik

Institute of Informatics University of Warsaw

XML and Modern Techniques of Content Management – 2010/11

IntroductionStatusXPath Data Model

XPath languageBasic constructsXPath 2.0 extrasPaths

XQueryXQuery query structureConstructorsFunctions

Page 2: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

XPath and XQueryQuerying XML documents

Common properties

I Expression languages designed to query XML documentsI Convenient access to document nodesI Intuitive syntax analogous to filesystem pathsI Comparison and arithmetic operators, functions, etc. . .

XPathUsed within other standards:

I XSLTI XML SchemaI XPointerI DOM

XQueryStandalone standard. Mainapplications:

I XML data access andprocessing

I XML databases

XPath — status

XPath 1.0

I W3C Recommendation, XI 1999I used within XSLT 1.0, XML Schema, XPointer

XPath 2.0I Several W3C Recommendations, I 2007:

I XML Path Language (XPath) 2.0I XQuery 1.0 and XPath 2.0 Data ModelI XQuery 1.0 and XPath 2.0 Functions and OperatorsI XQuery 1.0 and XPath 2.0 Formal Semantics

I Used within XSLT 2.0I Related to XQuery 1.0

Page 3: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

XPath and XQuery Data ModelI Theoretical basis of XPath, XSLT, and XQueryI XML document treeI Structures and simple data typesI Basic operations (type conversions etc.)

Differences between 1.0 and 2.0XPath 1.0 XPath 2.0

simple data types boolean,number,string

all XML Schema simpletypes

structures node sets sequences of nodes and sim-ple values

XML document in XPath modelI Document treeI Physical representation level fully expanded

I CDATA, references to characters and entities

I No adjacent text nodesI Namespaces applied and accessibleI XML Schema applied and accessible

! XPath 2.0 “schema aware” processors only

I Attribute nodes as element “properties”! formally, attribute is not child of element! however, element is parent of its attributes

I Root of tree — document node! main element (aka document element) is not the root

Page 4: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

XPath node kindsSeven kinds of nodes:

I document node (root)I elementI attributeI text nodeI processing instructionI commentI namespace node

SequencesI Values in XPath 2.0 — sequencesI Sequence consists of zero or more items

I nodesI atomic values

Sequences properties

I Items order and number of occurrence meaningfulI Singleton sequence equivalent to its item:3.14 = (3.14)

I Nested sequences implicitly flattened to canonicalrepresentation:(3.14, (1, 2, 3), ’Alice’) = (3.14, 1, 2, 3, ’Alice’)

Page 5: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Data model in XPath 1.0I Four types:

I booleanI stringI numberI node set

I No collection of simple valuesI Sets (and not sequences) of nodes

Page 6: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Effective Boolean ValueI Treating any value as booleanI Motivation: convenience in condition writing,

e.g. if ( person[@passport] ) ...

Conversion rulesempty sequence → falsesequence starting with node → truesingle boolean value → that valuesingle empty string → falsesingle non-empty string → truesingle number equal to 0 or NaN → falseother single number → trueother value → error

AtomizationI Treating any sequence as sequence of atomic valuesI Motivation: sequences comparison, arithmetic, type casting

Conversion rules (for each item)

atomic value → that value

node of declared atomic type → node value

node of list type → sequence of list elements

node of unknown simple type orone of xs:untypedAtomic,xs:anySimpleType

→ text content as singleitem

node with mixed content → text content as singleitem

node with element content → error

Page 7: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Paths — typical XPath applicationI /company/department/person

I //person

I /company/department[name = ’accountancy’]

I /company/department[@id = ’D07’]/person[3]

I ./surname

I surname

I ../person[position = ’manager’]/surname

But there is much more to learn about XPath :)

Literals and variables

LiteralsI strings: ’12.5’, "He said, ""I don’t like it."""

I numbers: 12, 12.5, 1.13e-8

VariablesI $x — reference to variable x,I Variables introduced through:

I XPath 2.0 (for, some, every)I XQuery (FLWOR, some, every, function parameters)I XSLT 1.0 and 2.0 (variable, param)

Page 8: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Type casting

Type constructors

I xs:date("2010-08-25")

I xs:float("NaN")

I adresy:kod-pocztowy("48-200") (schema aware processing)I string(//obiekt[4]) (valid in XPath 1.0 too)

cast as operator

I "2010-08-25" cast as xs:date

I . . .

FunctionsI Function invocation:

I concat(’Mrs ’, name, ’ ’, surname)I count(//person)I my:fac(12)

I 150 (XPath 2.0) built-in functions:I Custom functions defining:

I XQueryI XSLT 2.0I execution environment

I EXSLT — de-facto standard of additional XPath functions andextension mechanism for XSLT 1.0

Page 9: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Chosen built-in XPath functions

Text

concat(s1, s2, ...) substring(s, pos, len)starts-with(s1, s2) contains(s1, s2)string-length(s) translate(s, t1, t2)

Numbersfloor(x) ceiling(x)round(x) abs(x)

Nodes

name(n?) local-name(n?) namespace-uri(n?)id(s) nilled(n?) document-uri(doc)

Contextcurrent() position()last() current-time()

Sequences

count(S) sum(S) min(S) max(S) avg(S)empty(S) reverse(S) distinct-values(S)

Date and timemonth-from-date(t)adjust-date-to-timezone(t, tz)

OperatorsI 68 operators in XPath 2.0 (after overloading expansion)

Arithmetic

I + - * div idiv mod on numbersI + and - on date/time and duration

Node sets / sequences

I union | intersect except

I not nodes found — type errorI result without repeats, document order preserved

Logical values

I and or

Page 10: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Comparison operators

Atomic comparison (XPath 2.0 only)

I eq ne lt le gt ge

I applied to singletons

General comparison (XPath 1.0 and 2.0)

I = != < <= > >=

I applied to sequencesI XPath 2.0 semantics:

There exists a pair of items, one from each argument sequences, forwhich the corresponding atomic comparison holds.(Argument sequences atomized on entry.)

General comparison — nonobvious behaviour

Equality operator does not check the real equality

(1, 2) = (2, 3) – true(1, 2) != (1, 2) – true

Equality is not transitive

(1, 2) = (2, 3) – true(2, 3) = (3, 4) – true(1, 2) = (3, 4) – false

Inequality is not just equality negation

(1, 2) = (1, 2) – true(1, 2) != (1, 2) – true() = () – false() != () – false

Page 11: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Conditional expression (XPath 2.0)

if CONDITIONthen RESULT1else RESULT2

I Effective Boolean Value of CONDITIONI One branch computed

Example

if details/pricethen

if details/price >= 1000then ’Insured mail’else ’Ordinary mail’

else ’No data’

Iteration through sequence (XPath 2.0)

for $VAR in SEQUENCEreturn RESULT

I VAR assigned subsequent values from SEQUENCEI RESULT computer in context where VAR is assigned current valueI overall result — (flattened) sequence of subsequent partial results

Examples

for $i in (1 to 10)return $i * $i

for $p in //personreturn concat($p/name, ’ ’, $p/surname)

Page 12: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Sequence quantifiers (XPath 2.0)

some $VAR in SEQUENCEsatisfies CONDITION

every $VAR in SEQUENCEsatisfies CONDITION

I Effective Boolean Value of CONDITIONI Lazy evaluation allowedI Arbitrary order of items checking

Examples

some $i in (1 to 10) satisfies $i > 7

every $p in //person satisfies $p/surname

XPath paths

Absolute path/step/step ...

Relative pathstep/step ...

Step — fully expanded syntaxaxis::node-test [predicate1] ...[predicateN]

axis direction in document tree

node-test selecting nodes basing on kind, name, . . .

predicates arbitrary filtering expressions

Example

/descendant::team[attribute::id = ’3’]/child::person[1]/child::surname/child::text()

Page 13: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

AxisI child

I descendant

I parent

I ancestor

I following-sibling

I preceding-sibling

I following

I preceding

I attribute

I namespace

I self

I descendant-or-self

I ancestor-or-self

Axis

cite: www.GeorgeHernandez.com

Page 14: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Node test

Kind of node

I node()

I text()

I document-node()

I element()

I attribute()

I processing-instruction(xml-stylesheet)

Kind and name of node

I element(person)

I attribute(id)

I processing-instruction(xml-stylesheet)

Node test (ctd.)

Kind and type of node

! XPath 2.0 schema aware processing

I element(*, studentType)

I element(person, studentType)

I attribute(*, xs:integer)

I attribute(id, xs:integer)

Name of node

! Kind of node default for current axis (element or attribute)

I person

I *

I pre:*

I *:person

Page 15: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

PredicatesI Evaluated for each node selected so far

(node becomes the context node).I Every predicate filters result sequence.I Depending on result type:

I number — compared to item position (counted from 1)I not number — Effective Boolean Value used

I Filter expressions — predicates outside paths (XSLT 2.0)

Examples

/child::staff/child::person[child::name = ’Patryk’]

child::person[child::name = ’Patryk’]/child::surname

//person[attribute::passport][3]

(1 to 10)[. mod 2 = 0]

Abbreviated SyntaxI child axis may be omittedI @ before name indicates attribute axisI . instead of self::node()I .. instead of parent::node()I // instead of /descendant-or-self::node()/

Example

.//object[@id = ’E4’]

self::node()/descendant-or-self::node()/child::object[attribute::id = ’E4’]

Page 16: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Evaluation orderI From left to rightI Step by step

I //department/person[1]I (//department/person)[1]

I Predicate by predicateI //person[@manages and position() = 5]I //person[@manages][position() = 5]

XQuery — the query language for XML

Status

I XQuery 1.0 — W3C Recommendation, I 2007I Data model, functions and operators — shared with XPath 2.0I Formally: syntax defined from scratchI Practically: XPath syntax extension

Features

I Picking up data from XML documentsI Constructing new result nodesI Sorting, grouping, . . .I Custom functions definitionI Various output methods (XML, HTML, XHTML, text) — shared

with XSLT

Page 17: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

XQuery query structureI Header and bodyI Header consists of declarations:

I version declarationI importI flags and optionsI namespace declarationI variable or query parameterI function

Example

xquery version "1.0" encoding "utf-8";declare namespace foo = "http://example.org";declare variable $id as xs:string external;declare variable $doc := doc("example.xml");

$doc//foo:object[@id = $id]

FLWOR expressionI Acronym of For, Let, Where, Order by, ReturnI Replaces for from XPathI Motivation: SQL SELECT

Example

for $obj in doc("example.xml")/list/objectlet $prev := $obj/preceding-sibling::element()let $prev-name := $prev[1]/@namewhere $obj/@nameorder by $obj/@namereturn<div class="result">Object named {xs:string($obj/@name)}has {count($prev)} predecessors.The nearest predecessor name is{xs:string($prev-name)}.

</div>

Page 18: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Node constructors — direct

XML document fragment within query

for $el in doc("example.xml")//* return<p style="color: blue">I have found an element.<?pi bla Bla ?><!--Comments and PIs also taken to result-->

</p>

Expressions nested within constructors — braces

<result>{for $el in doc("example.xml")//* return<elem depth="{count($el/ancestor::node())}">{name($el)}</elem>

}</result>

Node constructors — computed

Syntax

for $el in doc("example.xml")//* returnelement p {attribute style {"color: blue"},text { "I have found an element."},processing-instruction pi { "bla Bla" }comment { "Comments and PIs also taken to result" }}

Application example — dynamically computed name

<result>{for $el in doc("example.xml")//* returnelement {concat("elem-", name($el))} {attribute depth {count($el/ancestor::node())},text {name($el)}

} }</result>

Page 19: XPath and XQuery - Logowanie - Uniwersytet Warszawskiczarnik/zajecia/xml10/07-en... ·  · 2010-11-29XPath and XQuery Data Model I Theoretical basis of XPath, XSLT, ... functions

Custom function definitions

Example

declare functionlocal:twice($x)

{ 2 * $x };

Type declarations example

declare functionlocal:twice($x as xs:double)as xs:double

{ 2 * $x };

Type declarationsI Type declarations possible (not obligatory) for:

I variablesI function arguments and resultI also in XSLT 2.0

I Capabilities:I type nameI node kind | node() | item()I occurrence modifier (?, *, +, exactly one occurrence by default).

I Examples:I xs:doubleI element()I node()*I xs:integer?I item()+