Purely Relational XQuery

Werbung
Purely Relational XQuery
Database-Supported XML Processors
Torsten Grust
Technische Universität München
http://www-db.in.tum.de/~grust/
LUG Erding, Oct 2007
A Database Guy’s View of
XML, XPath, and XQuery
Prof. Dr. Torsten Grust
2
Technische Universität München
A Database Guy’s View of
XML, XPath, and XQuery
•
Build XML processors that do not stumble once
XML input volumes become sizable.
Prof. Dr. Torsten Grust
2
Technische Universität München
A Database Guy’s View of
XML, XPath, and XQuery
•
Build XML processors that do not stumble once
XML input volumes become sizable.
•
Wait! Sizable data volume...? Use Relational Databases!
Prof. Dr. Torsten Grust
2
Technische Universität München
A Database Guy’s View of
XML, XPath, and XQuery
•
Build XML processors that do not stumble once
XML input volumes become sizable.
•
Wait! Sizable data volume...? Use Relational Databases!
<?xml version=”1.0”?>
<root>
<elem att=”val”>
<!-- comment -->
text
</elem>
</root>
Prof. Dr. Torsten Grust
let $doc := doc(“in.xml”)
for $t in $doc//text()
return
count($t/../comment())
2
Technische Universität München
A Database Guy’s View of
XML, XPath, and XQuery
•
Build XML processors that do not stumble once
XML input volumes become sizable.
•
Wait! Sizable data volume...? Use Relational Databases!
<?xml version=”1.0”?>
<root>
<elem att=”val”>
<!-- comment -->
text
</elem>
</root>
Prof. Dr. Torsten Grust
let $doc := doc(“in.xml”)
for $t in $doc//text()
return
count($t/../comment())
2
Technische Universität München
A Database Guy’s View of
XML, XPath, and XQuery
•
Build XML processors that do not stumble once
XML input volumes become sizable.
•
Wait! Sizable data volume...? Use Relational Databases!
pre size level
1
1
10
2
1 20
3
1 30
⋮
Prof. Dr. Torsten Grust
⋮
let $doc := doc(“in.xml”)
for $t in $doc//text()
return
count($t/../comment())
⋮
2
Technische Universität München
A Database Guy’s View of
XML, XPath, and XQuery
•
Build XML processors that do not stumble once
XML input volumes become sizable.
•
Wait! Sizable data volume...? Use Relational Databases!
pre size level
1
1
10
2
1 20
3
1 30
⋮
Prof. Dr. Torsten Grust
⋮
SELECT
pre,size,
ROW_NUMBER (…)
FROM
t000
WHERE
value
GROUP BY iter,…
⋮
2
Technische Universität München
A Database Guy’s View of
XML, XPath, and XQuery
•
Build XML processors that do not stumble once
XML input volumes become sizable.
•
Wait! Sizable data volume...? Use Relational Databases!
F
L
E
H
S
E
OFF-TH
pre size level
1
1
10
2
1 20
3
1 30
⋮
Prof. Dr. Torsten Grust
⋮
SELECT
pre,size,
ROW_NUMBER (…)
FROM
t000
WHERE
value
GROUP BY iter,…
⋮
2
Technische Universität München
RDBMSs as Processors for
Non-Relational Languages
•
Relational databases contain the best understood and
most scalable query processors available today.
•
Can we use relational query engines as-is to efficiently
process non-relational languages?
Prof. Dr. Torsten Grust
3
Technische Universität München
RDBMSs as Processors for
Non-Relational Languages
•
Relational databases contain the best understood and
most scalable query processors available today.
•
Can we use relational query engines as-is to efficiently
process non-relational languages?
X
Prof. Dr. Torsten Grust
L
3
X
Technische Universität München
RDBMSs as Processors for
Non-Relational Languages
•
Relational databases contain the best understood and
most scalable query processors available today.
•
Can we use relational query engines as-is to efficiently
process non-relational languages?
X
L
R
X
R-1
SQL / Relational Algebra
Prof. Dr. Torsten Grust
3
Technische Universität München
RDBMSs as Processors for
Non-Relational Languages
•
Relational databases contain the best understood and
most scalable query processors available today.
•
Can we use relational query engines as-is to efficiently
process non-relational languages?
XQuery
R
R-1
SQL / Relational Algebra
Prof. Dr. Torsten Grust
3
Technische Universität München
Purely Relational XQuery
•
Find a relational representation of XQuery (XML, XPath)
that leverages the table-based processing model:
•
Exploit functionality already built into the database
system (do not invade the database kernel):
SQL idioms, OLAP extensions, partitioned B-Trees.
•
Run on top off-the-shelf relational database systems:
Prof. Dr. Torsten Grust
4
Technische Universität München
Purely Relational XQuery
•
Find a relational representation of XQuery (XML, XPath)
that leverages the table-based processing model:
•
Exploit functionality already built into the database
system (do not invade the database kernel):
SQL idioms, OLAP extensions, partitioned B-Trees.
•
Run on top off-the-shelf relational database systems:
DB2
(kxsystems)
®
IBM DB2 V9
SQL Server 2005 PostgreSQL
DB2 Version 9
Prof. Dr. Torsten Grust
4
kdb+
MonetDB
Technische Universität München
Pathfinder
pathfinder-xquery.org
Relational Database System
(Kernel)
Prof. Dr. Torsten Grust
5
Technische Universität München
Pathfinder
pathfinder-xquery.org
Pathfinder XQuery Compiler
XPath/XQuery Semantics
XML Encoding
Relational Database System
(Kernel)
Prof. Dr. Torsten Grust
5
Technische Universität München
Pathfinder
pathfinder-xquery.org
XQuery
Pathfinder XQuery Compiler
XPath/XQuery Semantics
XML Encoding
Relational Database System
(Kernel)
Prof. Dr. Torsten Grust
5
Technische Universität München
Pathfinder
pathfinder-xquery.org
XQuery
Pathfinder XQuery Compiler
XPath/XQuery Semantics
XML Encoding
SQL
Relational Database System
(Kernel)
Prof. Dr. Torsten Grust
5
Technische Universität München
Pathfinder
pathfinder-xquery.org
XQuery
Pathfinder XQuery Compiler
XPath/XQuery Semantics
XML Encoding
RA
SQL
Relational Database System
(Kernel)
Prof. Dr. Torsten Grust
5
Technische Universität München
Pathfinder & DB2
DB2
Against 110+ MB of XML
®
DB2 Version 9
for Linux, UNIX, an
Prof. Dr. Torsten Grust
6
Technische Universität München
Pathfinder & DB2
DB2
Against 110+ MB of XML
®
DB2 Version 9
for Linux, UNIX, an
Prof. Dr. Torsten Grust
6
Technische Universität München
Slow Motion Replay
XQuery
Pathfinder
SQL
DB2
Prof. Dr. Torsten Grust
®
DB2 Version 9
for Linux, UNIX, and Windows
7
Technische Universität München
Slow Motion Replay
XQuery
XQuery (XMark Q8)
let $a := doc("XMark-110mb.xml")
return
…
Pathfinder
<item person="{ $p/name/text() }">
{ count($ca) }
</item>
SQL
DB2
Prof. Dr. Torsten Grust
®
DB2 Version 9
for Linux, UNIX, and Windows
7
Technische Universität München
Slow Motion Replay
XQuery
SQL:1999 (no SQL/XML)
Pathfinder
SQL
DB2
Prof. Dr. Torsten Grust
®
DB2 Version 9
WITH
t0000 (iter,pre) AS
SELECT … FROM … WHERE …
t0001 (iter) AS
SELECT … FROM … WHERE …
SELECT
FROM
WHERE
ORDER BY
pre,size,kind,…
…
…
…
for Linux, UNIX, and Windows
7
Technische Universität München
Slow Motion Replay
XQuery
Pathfinder
SQL
DB2
Relational Encoding of Result
®
pre
size kind
5072509 2
1
5072510 0
2
5072511 0
3
⋮
Prof. Dr. Torsten Grust
DB2 Version 9
for Linux, UNIX, and Windows
7
Technische Universität München
Slow Motion Replay
XQuery
Pathfinder
SQL
DB2
Serialized XML Text
®
<?xml version="1.0" encoding="UTF-8"?>
<XQuery>
<item person=”Jixiang …”>0</item>
<item person=”Harpreet …”>0</item>
…
</XQuery>
Prof. Dr. Torsten Grust
DB2 Version 9
for Linux, UNIX, and Windows
7
Technische Universität München
Table Access Modes
Prof. Dr. Torsten Grust
8
Technische Universität München
Table Access Modes
•
Relational query engines derive much of their efficiency
from the simplicity of the table of tuples model.
A
Prof. Dr. Torsten Grust
8
B
Technische Universität München
Table Access Modes
•
Relational query engines derive much of their efficiency
from the simplicity of the table of tuples model.
1. Sequential scans:
read-ahead on disks,
prefetching CPU caches.
Prof. Dr. Torsten Grust
8
A
B
Technische Universität München
Table Access Modes
•
Relational query engines derive much of their efficiency
from the simplicity of the table of tuples model.
1. Sequential scans:
read-ahead on disks,
prefetching CPU caches.
Prof. Dr. Torsten Grust
8
A
B
Technische Universität München
Table Access Modes
•
Relational query engines derive much of their efficiency
from the simplicity of the table of tuples model.
1. Sequential scans:
read-ahead on disks,
prefetching CPU caches.
A
B
2. Index-based access:
B-Trees, range scans.
Prof. Dr. Torsten Grust
8
Technische Universität München
Table Access Modes
•
Relational query engines derive much of their efficiency
from the simplicity of the table of tuples model.
1. Sequential scans:
read-ahead on disks,
prefetching CPU caches.
A
B
2. Index-based access:
B-Trees, range scans.
Prof. Dr. Torsten Grust
8
Technische Universität München
Table Access Modes
•
Relational query engines derive much of their efficiency
from the simplicity of the table of tuples model.
1. Sequential scans:
read-ahead on disks,
prefetching CPU caches.
A
B
2. Index-based access:
B-Trees, range scans.
•
But: We will benefit only if we actually operate the
relational database in this bulk-oriented mode.
Prof. Dr. Torsten Grust
8
Technische Universität München
The Core of XQuery
Prof. Dr. Torsten Grust
9
Technische Universität München
The Core of XQuery
(e1,..,en)
Prof. Dr. Torsten Grust
ordered item sequences
9
Technische Universität München
The Core of XQuery
(e1,..,en)
for $x in e1 return e2
Prof. Dr. Torsten Grust
ordered item sequences
iteration
9
Technische Universität München
The Core of XQuery
(e1,..,en)
for $x in e1 return e2
let $x := e1 return e2
Prof. Dr. Torsten Grust
ordered item sequences
iteration
variable binding
9
Technische Universität München
The Core of XQuery
(e1,..,en)
for $x in e1 return e2
let $x := e1 return e2
if (e1) then e2 else e3
Prof. Dr. Torsten Grust
ordered item sequences
iteration
variable binding
conditionals
9
Technische Universität München
The Core of XQuery
(e1,..,en)
for $x in e1 return e2
let $x := e1 return e2
if (e1) then e2 else e3
<t> { e1 } </t>
Prof. Dr. Torsten Grust
ordered item sequences
iteration
variable binding
conditionals
XML node construction
9
Technische Universität München
The Core of XQuery
(e1,..,en)
for $x in e1 return e2
let $x := e1 return e2
if (e1) then e2 else e3
<t> { e1 } </t>
e1/child::t, e1/following::t
Prof. Dr. Torsten Grust
9
ordered item sequences
iteration
variable binding
conditionals
XML node construction
XPath location steps
Technische Universität München
The Core of XQuery
(e1,..,en)
for $x in e1 return e2
let $x := e1 return e2
if (e1) then e2 else e3
<t> { e1 } </t>
e1/child::t, e1/following::t
unordered { e1 }
Prof. Dr. Torsten Grust
9
ordered item sequences
iteration
variable binding
conditionals
XML node construction
XPath location steps
local order indifference
Technische Universität München
The Core of XQuery
(e1,..,en)
for $x in e1 return e2
let $x := e1 return e2
if (e1) then e2 else e3
<t> { e1 } </t>
e1/child::t, e1/following::t
unordered { e1 }
fn:doc(e1), fn:data(e1)
ordered item sequences
iteration
variable binding
conditionals
XML node construction
XPath location steps
local order indifference
built-in functions
+ Document order, node identity, dynamic typing, schema
validation, user-defined functions, ...
Prof. Dr. Torsten Grust
9
Technische Universität München
The Core of XQuery
(e1,..,en)
for $x in e1 return e2
let $x := e1 return e2
if (e1) then e2 else e3
<t> { e1 } </t>
e1/child::t, e1/following::t
unordered { e1 }
fn:doc(e1), fn:data(e1)
ordered item sequences
iteration
variable binding
conditionals
XML node construction
XPath location steps
local order indifference
built-in functions
+ Document order, node identity, dynamic typing, schema
validation, user-defined functions, ...
•
The XQuery–SQL gap appears substantial.
Prof. Dr. Torsten Grust
9
Technische Universität München
FP-Style Iteration
Prof. Dr. Torsten Grust
10
Technische Universität München
FP-Style Iteration
•
The XQuery for..in..return construct performs
side-effect free iteration:
for $x in (e1,..,en)
return b
Prof. Dr. Torsten Grust
10
Technische Universität München
FP-Style Iteration
•
The XQuery for..in..return construct performs
side-effect free iteration:
for $x in (e1,..,en)
return b
( b[e1/$x],..,b[en/$x] )
•
Individual iterations cannot interfere and may be
evaluated in any order (or in parallel).
Prof. Dr. Torsten Grust
10
Technische Universität München
•
Loop Lifting
“Relational loop unrolling”:
For any XQuery subexpression, generate a relational
plan that evaluates the expression in all iterations.
Prof. Dr. Torsten Grust
11
Technische Universität München
•
Loop Lifting
“Relational loop unrolling”:
For any XQuery subexpression, generate a relational
plan that evaluates the expression in all iterations.
for $x in (10,20,30)
return $x + 42
Prof. Dr. Torsten Grust
11
Technische Universität München
•
Loop Lifting
“Relational loop unrolling”:
For any XQuery subexpression, generate a relational
plan that evaluates the expression in all iterations.
for $x in (10,20,30)
return $x + 42
Prof. Dr. Torsten Grust
11
iter pos item
1
1 42
2
1 42
3
1 42
Technische Universität München
•
Loop Lifting
“Relational loop unrolling”:
For any XQuery subexpression, generate a relational
plan that evaluates the expression in all iterations.
for $x in (10,20,30)
return $x + 42
iter pos item
1
1 42
2
1 42
3
1 42
for $x in (10,20,30)
return $x + 42
Prof. Dr. Torsten Grust
11
Technische Universität München
•
Loop Lifting
“Relational loop unrolling”:
For any XQuery subexpression, generate a relational
plan that evaluates the expression in all iterations.
for $x in (10,20,30)
return $x + 42
iter pos item
1
1 42
2
1 42
3
1 42
for $x in (10,20,30)
return $x + 42
iter pos item
1
1 10
2
1 20
3
1 30
Prof. Dr. Torsten Grust
11
Technische Universität München
•
Loop Lifting
“Relational loop unrolling”:
For any XQuery subexpression, generate a relational
plan that evaluates the expression in all iterations.
for $x in (10,20,30)
return $x + 42
iter pos item
1
1 42
2
1 42
3
1 42
for $x in (10,20,30)
return $x + 42
iter pos item
1
1 10
2
1 20
3
1 30
let $x := (10,20,30)
return $x
Prof. Dr. Torsten Grust
11
Technische Universität München
•
Loop Lifting
“Relational loop unrolling”:
For any XQuery subexpression, generate a relational
plan that evaluates the expression in all iterations.
for $x in (10,20,30)
return $x + 42
iter pos item
1
1 42
2
1 42
3
1 42
for $x in (10,20,30)
return $x + 42
iter pos item
1
1 10
2
1 20
3
1 30
let $x := (10,20,30)
return $x
iter pos item
1
1 10
1
2 20
1
3 30
Prof. Dr. Torsten Grust
11
Technische Universität München
Loop Lifting
Prof. Dr. Torsten Grust
12
Technische Universität München
Loop Lifting
for $x in (10,20,30)
return $x + 42
Prof. Dr. Torsten Grust
12
Technische Universität München
Loop Lifting
for $x in (10,20,30)
return $x + 42
iter
1
2
3
pos item
1
10
1
20
1
30
Prof. Dr. Torsten Grust
iter1 pos1 item1
1
1
42
2
1
42
3
1
42
12
Technische Universität München
Loop Lifting
for $x in (10,20,30)
return $x + 42
iter
1
2
3
pos item
1
10
1
20
1
30
iter1 pos1 item1
1
1
42
2
1
42
3
1
42
⋈
iter = iter1
+
item2:(item,item1)
π
iter,pos,item2
Prof. Dr. Torsten Grust
12
Technische Universität München
Loop Lifting
for $x in (10,20,30)
return $x + 42
iter
1
2
3
pos item
1
10
1
20
1
30
iter1 pos1 item1
1
1
42
2
1
42
3
1
42
⋈
iter = iter1
iter
1
2
3
+
item2:(item,item1)
pos item iter1 pos1 item1
1
10
1
1
42
1
20
2
1
42
1
30
3
1
42
π
iter,pos,item2
Prof. Dr. Torsten Grust
12
Technische Universität München
Loop Lifting
for $x in (10,20,30)
return $x + 42
iter
1
2
3
pos item
1
10
1
20
1
30
iter1 pos1 item1
1
1
42
2
1
42
3
1
42
⋈
iter = iter1
iter
1
2
3
+
item2:(item,item1)
pos item iter1 pos1 item1
1
10
1
1
42
1
20
2
1
42
1
30
3
1
42
item2
52
62
72
π
iter,pos,item2
Prof. Dr. Torsten Grust
12
Technische Universität München
Loop Lifting
for $x in (10,20,30)
return $x + 42
iter
1
2
3
pos item
1
10
1
20
1
30
iter1 pos1 item1
1
1
42
2
1
42
3
1
42
⋈
iter = iter1
+
item2:(item,item1)
iter
1
2
3
π
iter,pos,item2
Prof. Dr. Torsten Grust
12
pos item2
1
52
1
62
1
72
Technische Universität München
Loop Lifting
Prof. Dr. Torsten Grust
13
Technische Universität München
Loop Lifting
for $x in (1,2,3,4)
return if ($x mod 2 = 0) then “even” else “odd”
Prof. Dr. Torsten Grust
13
Technische Universität München
Loop Lifting
for $x in (1,2,3,4)
return if ($x
$x mod 2 = 0)
0 then “even” else “odd”
iter
1
2
3
4
pos
1
1
1
1
item
false
true
false
true
Prof. Dr. Torsten Grust
13
Technische Universität München
Loop Lifting
for $x in (1,2,3,4)
return if ($x
$x mod 2 = 0)
0 then “even” else “odd”
σ
item
iter
1
2
3
4
pos
1
1
1
1
item
false
true
false
true
Prof. Dr. Torsten Grust
π
iter
×
pos item
1 “even”
⋅
⋃
pos item
1 “odd”
σ
¬ item
π
iter
×
13
Technische Universität München
Loop Lifting
for $x in (1,2,3,4)
return if ($x
$x mod 2 = 0)
0 then “even” else “odd”
iter
2
4
pos item
1 true
1 true
σ
item
iter
1
2
3
4
pos
1
1
1
1
item
false
true
false
true
Prof. Dr. Torsten Grust
π
iter
×
pos item
1 “even”
⋅
⋃
pos item
1 “odd”
σ
¬ item
π
iter
×
13
Technische Universität München
Loop Lifting
for $x in (1,2,3,4)
return if ($x
$x mod 2 = 0)
0 then “even” else “odd”
iter
2
4
σ
item
iter
1
2
3
4
pos
1
1
1
1
item
false
true
false
true
Prof. Dr. Torsten Grust
π
iter
×
pos item
1 “even”
⋅
⋃
pos item
1 “odd”
σ
¬ item
π
iter
×
13
Technische Universität München
Loop Lifting
for $x in (1,2,3,4)
return if ($x
$x mod 2 = 0)
0 then “even” else “odd”
iter
2
4
σ
item
iter
1
2
3
4
pos
1
1
1
1
item
false
true
false
true
Prof. Dr. Torsten Grust
π
iter
×
pos item
1 “even”
⋅
⋃
pos item
1 “odd”
σ
¬ item
pos item
1 “even”
1 “even”
π
iter
×
13
Technische Universität München
Loop Lifting
for $x in (1,2,3,4)
return if ($x
$x mod 2 = 0)
0 then “even” else “odd”
iter
1
σ
item
iter
1
2
3
4
pos
1
1
1
1
item
false
true
false
true
Prof. Dr. Torsten Grust
π
iter
×
pos item
1 “even”
π
iter
3
1
4
1
“odd”
“even”
⋅
⋃
pos item
1 “odd”
σ
¬ item
2
pos item
1 “odd”
1 “even”
×
13
Technische Universität München
Intermediate Code:
Relational Algebra
XQuery
Pathfinder XQuery Compiler
Relational Algebra
⋅ ⋯
σ π
⋃
⋈
Prof. Dr. Torsten Grust
14
Technische Universität München
Intermediate Code:
Relational Algebra
XQuery
Pathfinder XQuery Compiler
Relational Algebra
⋅ ⋯
σ π
⋃
⋈
•
Compiles into RA that
mimics capabilites of
SQL:1999 query engines.
•
RA dialect is ...
1. primitive
2. explicit
3. designed to be easily
analyzable
Prof. Dr. Torsten Grust
14
Technische Universität München
Intermediate Code:
Relational Algebra
XQuery
Pathfinder XQuery Compiler
Relational Algebra
⋅ ⋯
σ π
⋃
⋈
Code Gen Code Gen Code Gen
Prof. Dr. Torsten Grust
•
Compiles into RA that
mimics capabilites of
SQL:1999 query engines.
•
RA dialect is ...
1. primitive
2. explicit
3. designed to be easily
analyzable
14
Technische Universität München
Intermediate Code:
Relational Algebra
XQuery
Pathfinder XQuery Compiler
Relational Algebra
⋅ ⋯
σ π
⋃
⋈
Code Gen Code Gen Code Gen
Compiles into RA that
mimics capabilites of
SQL:1999 query engines.
•
RA dialect is ...
L
MI
Q
S
Q
L
1. primitive
2. explicit
3. designed to be easily
analyzable
(kxsystems)
Prof. Dr. Torsten Grust
•
14
Technische Universität München
Typical Plans ...
Prof. Dr. Torsten Grust
15
Technische Universität München
Typical Plans ...
Prof. Dr. Torsten Grust
15
Technische Universität München
Relational
XQuery Optimization
let $d := fn:doc(⋯)
return for $a in $d//a
for $b in $d//b
where $b/@c = $a/@d
return $b
Prof. Dr. Torsten Grust
16
Technische Universität München
Relational
XQuery Optimization
•
We use concepts in the relational domain to analyze
plans and to steer simplification and optimization.
•
Detect join-like XQuery expressions. Let XQuery’s
syntactic diversity not affect the detection:
let $d := fn:doc(⋯)
return for $a in $d//a
for $b in $d//b
where $b/@c = $a/@d
return $b
Prof. Dr. Torsten Grust
16
Technische Universität München
Relational
XQuery Optimization
•
We use concepts in the relational domain to analyze
plans and to steer simplification and optimization.
•
Detect join-like XQuery expressions. Let XQuery’s
syntactic diversity not affect the detection:
let $d := fn:doc(⋯)
return $d//b[@c = $d//a/@c]
Prof. Dr. Torsten Grust
16
Technische Universität München
Relational
XQuery Optimization
•
We use concepts in the relational domain to analyze
plans and to steer simplification and optimization.
•
Detect join-like XQuery expressions. Let XQuery’s
syntactic diversity not affect the detection:
let $d := fn:doc(⋯)
return $d//b[@c = $d//a/@c]
•
For both queries, the value-based XQuery join surfaces
as a multi-valued dependency in the algebraic plans.
Prof. Dr. Torsten Grust
16
Technische Universität München
¶ (iter, pos, item:res)
Rewriting XMark Q8
NOT (res:<item>)
@ (pos), val: #1
U
@ (item), val: false
DIFF
@ (item), val: true
DISTINCT
¶ (iter:outer)
¶ (iter)
ROW# (pos1:<sort, pos>/outer)
|X| (iter = inner)
@ (item), val: 1
@ (pos), val: #1
¶ (iter)
SEL (item)
¶ (iter, pos, item:res)
= (res:<item, item1>)
|X| (iter = iter1)
¶ (iter, pos, item:cast)
¶ (iter1:iter, item1:cast)
CAST (cast:<item>), type: str
CAST (cast:<item>), type: str
¶ (iter:inner, pos, item)
@ (pos), val: #1
|X| (iter = outer)
¶ (iter:inner, item)
@ (pos), val: #1
¶ (outer:iter, sort:pos, inner)
NUMBER (inner)
¶ (iter:inner, item)
¶ (iter:outer, pos:pos1, item)
ROW# (pos1:<sort, pos>/outer)
|X| (iter = inner)
¶ (iter, pos, item:cast)
CAST (cast:<item>), type: uA
¶ (outer:iter, sort:pos, inner)
¶ (iter, pos, item:res)
access attribute value (res:<item>)
@ (pos), val: #1
¶ (iter:inner, item)
NUMBER (inner)
ROW# (pos:<item>/iter)
DISTINCT
¶ (iter, item)
ROW# (pos:<item>/iter)
/| attribute::attribute id { atomic* } (iter, item)
¶ (iter:inner, item)
|X| (iter = outer)
¶ (iter:inner, pos, item)
¶ (outer:iter, sort:pos, inner)
NUMBER (inner)
¶ (iter:outer, pos:pos1, item)
ROW# (pos1:<sort, pos>/outer)
|X| (iter = inner)
¶ (iter, pos, item:cast)
CAST (cast:<item>), type: uA
¶ (outer:iter, sort:pos, inner)
¶ (iter, pos, item:res)
access attribute value (res:<item>)
@ (pos), val: #1
¶ (iter:inner, item)
NUMBER (inner)
ROW# (pos:<item>/iter)
DISTINCT
¶ (iter, item)
ROW# (pos:<item>/iter)
/| attribute::attribute person { atomic* } (iter, item)
¶ (iter, item)
ROW# (pos:<item>/iter)
DISTINCT
¶ (iter, item)
ROW# (pos:<item>/iter)
/| child::element buyer { item* } (iter, item)
¶ (iter, item)
@ (pos), val: #1
|X| (iter = outer)
¶ (iter:inner, item)
¶ (outer:iter, sort:pos, inner)
ROW# (pos1:<sort, pos>/outer)
NUMBER (inner)
|X| (iter = inner)
ROW# (pos:<item>/iter)
¶ (iter, pos, item:res)
DISTINCT
access textnode content (res:<item>)
¶ (outer:iter, sort:pos, inner)
¶ (iter, item)
@ (pos), val: #1
ROW# (pos:<item>/iter)
¶ (iter:inner, item)
/| child::element closed_auction { item* } (iter, item)
NUMBER (inner)
¶ (iter, item)
ROW# (pos:<item>/iter)
ROW# (pos:<item>/iter)
DISTINCT
DISTINCT
¶ (iter, item)
¶ (iter, item)
ROW# (pos:<item>/iter)
ROW# (pos:<item>/iter)
/| child::text (iter, item)
/| child::element closed_auctions { item* } (iter, item)
¶ (iter, item)
¶ (iter, item)
ROW# (pos:<item>/iter)
ROW# (pos:<item>/iter)
DISTINCT
DISTINCT
¶ (iter, item)
¶ (iter, item)
ROW# (pos:<item>/iter)
ROW# (pos:<item>/iter)
/| child::element name { item* } (iter, item)
/| child::element site { item* } (iter, item)
¶ (iter, item)
¶ (iter:inner, item)
@ (pos), val: #1
|X| (iter = outer)
¶ (iter:inner, item)
¶ (outer:iter, sort:pos, inner)
NUMBER (inner)
ROW# (pos:<item>/iter)
DISTINCT
¶ (iter, item)
ROW# (pos:<item>/iter)
/| child::element person { item* } (iter, item)
¶ (iter, item)
ROW# (pos:<item>/iter)
Prof. Dr. Torsten Grust
17
Technische Universität München
DISTINCT
¶ (iter, item)
ROW# (pos:<item>/iter)
/| child::element people { item* } (iter, item)
¶ (iter, item)
ROW# (pos1:<sort>)
Rewriting XMark Q8
•
|X| (iter = inner)
FRAG_UNION
FRAGs
@ (pos), val: #1
ROOTS
ELEM (iter, item:<iter, item><iter, pos, item>)
ELEM_TAG
¶ (iter, pos:pos1, item)
@ (item), val: item
ROW# (pos1:<ord>/iter)
U
@ (ord), val: #1
FRAG_UNION
¶ (iter, pos, item:res)
FRAG_UNION
@ (pos), val: #1
ROOTS
FRAGs
Rewriting phases:
@ (ord), val: #2
¶ (iter, item:res)
FRAGs
ROOTS
ATTR (res:<item, item1>)
TEXT (res:<cast>)
|X| (iter = iter1)
CAST (cast:<item>), type: str
@ (item), val: person
¶ (iter1:iter, item1:item)
U
fn:string_join
@ (pos), val: #1
@ (item), val: 0
@ (item), val: " "
DIFF
¶ (iter:inner)
¶ (iter)
¶ (iter:outer, pos:pos1, item)
COUNT (item:/iter)
¶ (iter:outer)
1. Constant propagation and
projection pushdown,
|X| (iter = inner)
¶ (iter)
|X| (iter = iter1)
¶ (iter1:iter)
SEL (item)
¶ (iter, item:res)
NOT (res:<item>)
U
@ (item), val: false
@ (item), val: true
DIFF
DISTINCT
¶ (iter:outer)
¶ (iter:inner)
|X| (iter = inner)
¶ (iter)
SEL (item)
¶ (iter, item:res)
NOT (res:<item>)
U
@ (item), val: false
@ (item), val: true
DIFF
DISTINCT
¶ (iter:outer)
¶ (iter:inner)
|X| (iter = inner)
¶ (iter)
SEL (item)
¶ (iter, item:res)
= (res:<item, item1>)
|X| (iter = iter1)
¶ (iter, item:cast)
CAST (cast:<item>), type: str
¶ (iter1:iter, item1:cast)
¶ (iter:inner, item)
CAST (cast:<item>), type: str
|X| (iter = outer)
¶ (iter:inner, item)
¶ (outer:iter, inner)
¶ (iter:inner, item)
NUMBER (inner)
¶ (iter:outer, item)
|X| (iter = inner)
¶ (iter, item:cast)
CAST (cast:<item>), type: uA
¶ (outer:iter, inner)
¶ (iter, item:res)
access attribute value (res:<item>)
¶ (iter:inner, item)
NUMBER (inner)
/| attribute::attribute id { atomic* } (iter, item)
¶ (iter:inner, item)
|X| (iter = outer)
¶ (iter:inner, item)
¶ (outer:iter, inner)
NUMBER (inner)
¶ (iter:outer, item)
|X| (iter = inner)
¶ (iter, item:cast)
CAST (cast:<item>), type: uA
¶ (outer:iter, inner)
¶ (iter, item:res)
access attribute value (res:<item>)
¶ (iter:inner, item)
ROW# (pos1:<sort>/outer)
|X| (iter = inner)
NUMBER (inner)
/| attribute::attribute person { atomic* } (iter, item)
¶ (iter, pos, item:res)
/| child::element buyer { item* } (iter, item)
access textnode content (res:<item>)
@ (pos), val: #1
¶ (outer:iter, sort:pos, inner)
¶ (iter:inner, item)
NUMBER (inner)
|X| (iter = outer)
¶ (iter:inner, item)
¶ (outer:iter, inner)
NUMBER (inner)
/| child::element closed_auction { item* } (iter, item)
/| child::element closed_auctions { item* } (iter, item)
ROW# (pos:<item>/iter)
/| child::element site { item* } (iter, item)
/| child::text (iter, item)
¶ (iter:inner, item)
/| child::element name { item* } (iter, item)
¶ (iter:inner, item)
¶ (outer:iter, sort:pos, inner)
NUMBER (inner)
ROW# (pos:<item>)
/| child::element person { item* } (iter, item)
/| child::element people { item* } (iter, item)
Prof. Dr. Torsten Grust
17
Technische Universität München
/| child::element site { item* } (iter, item)
FRAG_UNION
EMPTY_FRAG
ROOTS
FRAGs
DOC
@ (item), val: "auctionG.xml"
Rewriting XMark Q8
•
Rewriting phases:
SERIALIZE
1. Constant propagation and
projection pushdown,
¶ (item, pos)
ROW# (pos:<pos1>)
¶ (pos1, item)
FRAG_UNION
|X| (iter = iter1)
FRAGs
ROOTS
ELEM (iter1, item:<iter1, item><iter1, pos, item>)
ELEM_TAG
¶ (iter1, item, pos)
@ (item), val: item
ROW# (pos:<pos1>/iter1)
U
@ (pos1), val: #1
FRAG_UNION
FRAG_UNION
ROOTS
¶ (iter1, item)
FRAGs
ROOTS
FRAGs
TEXT (item:<item2>)
ATTR (item:<item2, item1>)
CAST (item2:<item1>), type: str
@ (item2), val: person
U
fn:string_join
2. functional dependency and
data flow analysis,
@ (pos1), val: #2
¶ (iter1, item)
@ (item1), val: 0
@ (item1), val: " "
DIFF
¶ (iter1:iter)
¶ (iter1, item1, pos)
¶ (iter1)
COUNT (item1:/iter1)
ROW# (pos:<pos1>/iter1)
¶ (iter1)
¶ (iter1, pos1, item1)
|X| (iter = iter2)
¶ (iter:iter2)
DISTINCT
¶ (item, item1, iter1, iter2:iter)
¶ (iter2)
SEL (item)
¶ (iter2, item)
= (item:<item3, item4>)
¶ (iter2, item3, item4)
CAST (item4:<item>), type: str
CAST (item3:<item2>), type: str
¶ (iter2, item2, item)
CAST (item:<item4>), type: uA
access attribute value (item4:<item3>)
¶ (item3, iter2, item2)
|X| (iter1 = iter3)
¶ (iter1:iter3, item3)
/| attribute::attribute id { atomic* } (iter3, item3)
¶ (iter2, item2:item1, iter3:iter1)
¶ (item3:item, iter3:iter1)
NUMBER (iter1)
¶ (item, iter2, item1)
CAST (item1:<item3>), type: uA
access attribute value (item3:<item2>)
¶ (item2, item, iter2)
|X| (iter = iter3)
¶ (iter:iter3, item2)
/| attribute::attribute person { atomic* } (iter3, item2)
access textnode content (item1:<item>)
¶ (item:item1, iter2:iter, iter3:iter)
/| child::element buyer { item* } (iter3, item2)
ROW# (pos1:<item>/iter1)
¶ (item2:item, iter3:iter)
/| child::text (iter1, item)
NUMBER (iter)
/| child::element name { item* } (iter1, item)
¶ (item:item1, iter1:iter)
¶ (item1, iter1:iter)
NUMBER (iter)
¶ (item)
ROW# (pos1:<item1>)
¶ (item1)
/| child::element person { item* } (iter, item1)
/| child::element closed_auction { item* } (iter, item)
/| child::element people { item* } (iter, item1)
/| child::element closed_auctions { item* } (iter, item)
/| child::element site { item* } (iter, item1)
FRAG_UNION
/| child::element site { item* } (iter, item)
@ (iter), val: #1
EMPTY_FRAG
¶ (item1:item)
FRAGs
ROOTS
DOC
TBL: (iter | item)
[#1,"auctionG.xml"]
Prof. Dr. Torsten Grust
17
Technische Universität München
Rewriting XMark Q8
•
Rewriting phases:
1. Constant propagation and
projection pushdown,
SERIALIZE
¶ (item, pos)
ROW# (pos:<pos1>)
¶ (pos1, item)
FRAG_UNION
2. functional dependency and
data flow analysis,
FRAGs
|X| (iter = iter1)
ROOTS
ELEM (iter1, item:<iter1, item><iter1, pos, item>)
ELEM_TAG
¶ (iter1, item, pos)
@ (item), val: item
ROW# (pos:<pos1>/iter1)
U
@ (pos1), val: #1
FRAG_UNION
FRAG_UNION
ROOTS
¶ (iter1, item)
FRAGs
ROOTS
FRAGs
TEXT (item:<item2>)
ATTR (item:<item2, item1>)
CAST (item2:<item1>), type: str
@ (item2), val: person
U
fn:string_join
@ (item1), val: 0
@ (item1), val: " "
DIFF
¶ (iter1:iter)
¶ (iter1, item1, pos)
¶ (iter1)
COUNT (item1:/iter1)
ROW# (pos:<pos1>/iter1)
3. algebraic XQuery join detection,
@ (pos1), val: #2
¶ (iter1, item)
¶ (iter1)
¶ (iter1, pos1, item1)
DISTINCT
¶ (iter, iter1)
|X| (item1 = item)
access textnode content (item1:<item>)
¶ (iter, item1)
ROW# (pos1:<item>/iter1)
access attribute value (item1:<item>)
/| child::text (iter1, item)
¶ (iter1, item)
access attribute value (item:<item1>)
¶ (item, iter:iter2)
/| child::element name { item* } (iter1, item)
¶ (item1, iter1:iter2)
/| attribute::attribute person { atomic* } (iter2, item)
¶ (item:item1, iter1:iter)
/| attribute::attribute id { atomic* } (iter2, item1)
/| child::element buyer { item* } (iter2, item)
NUMBER (iter)
¶ (item1, iter2:iter)
¶ (item:item1, iter2:iter)
ROW# (pos1:<item1>)
NUMBER (iter)
¶ (item1)
¶ (item1)
/| child::element person { item* } (iter, item1)
/| child::element closed_auction { item* } (iter, item1)
/| child::element people { item* } (iter, item1)
/| child::element closed_auctions { item* } (iter, item1)
/| child::element site { item* } (iter, item1)
FRAG_UNION
EMPTY_FRAG
ROOTS
FRAGs
DOC
TBL: (iter | item1)
[#1,"auctionG.xml"]
Prof. Dr. Torsten Grust
17
Technische Universität München
Rewriting XMark Q8
•
Rewriting phases:
1. Constant propagation and
projection pushdown,
2. functional dependency and
data flow analysis,
SERIALIZE
ROOTS
twig (iter, item)
ELEM (iter, item)
fcns
Attach (item), val: item
fcns
ATTR (iter, item, item1)
TEXT (iter, item)
Attach (item), val: person
Project (iter, item)
CAST (item:<item1>), type: str
UNION
Attach (item1), val: 0
DIFF
Project (iter)
COUNT (item1:/iter)
fn:string_join
access textnode content (item1:<item>)
Project (iter)
ThetaJoin
(item eq item1)
Attach (item1), val: " "
Project (iter, item)
3. algebraic XQuery join detection,
Project (item)
/|+ child::text (item:item1)
level=5
access attribute value (item1:<item>)
/|+ child::element name { item* } (item1:iter)
level=4
Project (iter, item1)
access attribute value (item:<item1>)
/|+ attribute::attribute id { atomic* } (item:iter)
level=4
Project (item1)
/|+ attribute::attribute person { atomic* } (item1:item)
level=5
Project (iter)
/|+ child::element person { item* } (iter:item)
level=3
Project (item)
/|+ child::element buyer { item* } (item:item1)
level=4
Project (item)
Project (item1)
/|+ child::element people { item* } (item:item1)
level=2
/|+ child::element closed_auction { item* } (item1:item)
level=3
Project (item)
/|+ child::element closed_auctions { item* } (item:item1)
level=2
Project (item1)
4. XPath join graph isolation,
Prof. Dr. Torsten Grust
17
/|+ child::element site { item* } (item1:item)
level=1
ROOTS
DOC (iter, item)
TBL: (iter | item)
[#1,"auctionG.xml"]
Technische Universität München
Rewriting XMark Q8
•
Rewriting phases:
1. Constant propagation and
projection pushdown,
2. functional dependency and
data flow analysis,
SERIALIZE
FRAG UNION
FRAGs
ROOTS
twig (iter, item)
ELEM (iter, item)
fcns
fcns
@item:’item’
TEXT (iter, item)
ATTR (iter, item, item1)
@item:’person’
πiter ,item1
nil
πiter ,item
CAST (item:¡item1¿), type: str
∪
@item1 :0
\
πiter
COUNTitem1 :()/iter
πiter
3. algebraic XQuery join detection,
!
"
item = item1
πitem
πiter ,item1
item:item1
πitem1
item1 :item
attribute(person)
GPS = 960, level = 5
descendant::text()
GPS = 802, level = 5
item1 :item
item:iter
item1 :item
attribute(id)
GPS = 799, level = 4
item:iter
πitem
πiter
descendant::element(buyer)
GPS = 959, level = 4
item:item1
πitem1 :item
∪
EMPTY FRAG
4. XPath join graph isolation,
FRAGs
item:iter
descendant::element(person)
GPS = 798, level = 3
πitem
ROOTS
DOC (iter, item)
TBL: (iter — item)
5. data guide exploitation (“Node GPS”).
Pathfinder & IBM DB2 V9, 115 MB XML input data
< 1 sec
Prof. Dr. Torsten Grust
17
Technische Universität München
Rewriting XMark Q8
•
πiter
!
"
Rewriting phases:
item=item1
1. Constant propagation and
projection pushdown,
πiter ,item1
πitem
2. functional dependency and
data flow analysis,
item:item1
πitem1
item1 :item
item1 :item
3. algebraic XQuery join detection,
4. XPath join graph isolation,
descendant::text()
GPS = 802, level = 5
item:iter
attribute(person)
GPS = 960, level = 5
item1 :item
a
GPS = 79
item:iter
5. data guide exploitation (“Node GPS”).
πitem
descendant::element(buyer)
Pathfinder & IBM DB2 V9, 115 MB XML input
data
GPS = 959, level = 4
< 1 sec
item:item1
Prof. Dr. Torsten Grust
17
∪
πitem1Technische
:item
πitem
Universität München
upstream
Can DB2 Cope?
®
SORT(33) 373.32
HSJOIN(35) 373.32
NLJOIN(37) 75.02
HSJOIN(43) 298.3
IXSCAN(41) 50.01
NLJOIN(45) 149.15
IDX_GUIDE_PRE
XMARK_1.DOC
IXSCAN(53) 50.01
IDX_GUIDE_PRE_VALUE
XMARK_1.DOC
Prof. Dr. Torsten Grust
DB2 Version 9
NLJOIN(
for Linux, UNIX, and Windows
NLJOIN(47) 75.02
IXSCAN(51) 50.01
NLJOIN(
IXSCAN(49) 50.01
IDX_GUIDE_PRE_PRE_PLUS_SIZE
IDX_VALUE_KIND_PRE_SIZE
XMARK_1.DOC
XMARK_1.DOC
18
IXSCAN(63) 50.0
IDX_GUIDE_PRE_VAL
XMARK_1.DOC
Technische Universität München
upstream
Can DB2 Cope?
®
SORT(33) 373.32
HSJOIN(35) 373.32
NLJOIN(37) 75.02
HSJOIN(43) 298.3
IXSCAN(41) 50.01
NLJOIN(45) 149.15
IDX_GUIDE_PRE
XMARK_1.DOC
IXSCAN(53) 50.01
IDX_GUIDE_PRE_VALUE
XMARK_1.DOC
Prof. Dr. Torsten Grust
DB2 Version 9
NLJOIN(
for Linux, UNIX, and Windows
NLJOIN(47) 75.02
IXSCAN(51) 50.01
NLJOIN(
IXSCAN(49) 50.01
IDX_GUIDE_PRE_PRE_PLUS_SIZE
IDX_VALUE_KIND_PRE_SIZE
XMARK_1.DOC
XMARK_1.DOC
18
IXSCAN(63) 50.0
IDX_GUIDE_PRE_VAL
XMARK_1.DOC
Technische Universität München
upstream
Can DB2 Cope?
®
SORT(33) 373.32
HSJOIN(35) 373.32
NLJOIN(37) 75.02
HSJOIN(43) 298.3
IXSCAN(41) 50.01
NLJOIN(45) 149.15
IDX_GUIDE_PRE
XMARK_1.DOC
IXSCAN(53) 50.01
IDX_GUIDE_PRE_VALUE
XMARK_1.DOC
Prof. Dr. Torsten Grust
DB2 Version 9
NLJOIN(
for Linux, UNIX, and Windows
NLJOIN(47) 75.02
IXSCAN(51) 50.01
NLJOIN(
IXSCAN(49) 50.01
IDX_GUIDE_PRE_PRE_PLUS_SIZE
IDX_VALUE_KIND_PRE_SIZE
XMARK_1.DOC
XMARK_1.DOC
18
IXSCAN(63) 50.0
IDX_GUIDE_PRE_VAL
XMARK_1.DOC
Technische Universität München
upstream
Can DB2 Cope?
®
SORT(33) 373.32
HSJOIN(35) 373.32
NLJOIN(37) 75.02
HSJOIN(43) 298.3
IXSCAN(41) 50.01
NLJOIN(45) 149.15
IDX_GUIDE_PRE
XMARK_1.DOC
IXSCAN(53) 50.01
IDX_GUIDE_PRE_VALUE
XMARK_1.DOC
Prof. Dr. Torsten Grust
DB2 Version 9
NLJOIN(
for Linux, UNIX, and Windows
NLJOIN(47) 75.02
IXSCAN(51) 50.01
NLJOIN(
IXSCAN(49) 50.01
IDX_GUIDE_PRE_PRE_PLUS_SIZE
IDX_VALUE_KIND_PRE_SIZE
XMARK_1.DOC
XMARK_1.DOC
18
IXSCAN(63) 50.0
IDX_GUIDE_PRE_VAL
XMARK_1.DOC
Technische Universität München
upstream
Can DB2 Cope?
®
SORT(33) 373.32
HSJOIN(35) 373.32
NLJOIN(37) 75.02
HSJOIN(43) 298.3
IXSCAN(41) 50.01
NLJOIN(45) 149.15
IDX_GUIDE_PRE
XMARK_1.DOC
IXSCAN(53) 50.01
IDX_GUIDE_PRE_VALUE
XMARK_1.DOC
Prof. Dr. Torsten Grust
DB2 Version 9
NLJOIN(
for Linux, UNIX, and Windows
NLJOIN(47) 75.02
IXSCAN(51) 50.01
NLJOIN(
IXSCAN(49) 50.01
IDX_GUIDE_PRE_PRE_PLUS_SIZE
IDX_VALUE_KIND_PRE_SIZE
XMARK_1.DOC
XMARK_1.DOC
18
IXSCAN(63) 50.0
IDX_GUIDE_PRE_VAL
XMARK_1.DOC
Technische Universität München
upstream
Can DB2 Cope?
®
SORT(33) 373.32
HSJOIN(35) 373.32
NLJOIN(37) 75.02
HSJOIN(43) 298.3
IXSCAN(41) 50.01
NLJOIN(45) 149.15
IDX_GUIDE_PRE
XMARK_1.DOC
IXSCAN(53) 50.01
IDX_GUIDE_PRE_VALUE
XMARK_1.DOC
Prof. Dr. Torsten Grust
DB2 Version 9
NLJOIN(
for Linux, UNIX, and Windows
NLJOIN(47) 75.02
IXSCAN(51) 50.01
NLJOIN(
IXSCAN(49) 50.01
IDX_GUIDE_PRE_PRE_PLUS_SIZE
IDX_VALUE_KIND_PRE_SIZE
XMARK_1.DOC
XMARK_1.DOC
18
IXSCAN(63) 50.0
IDX_GUIDE_PRE_VAL
XMARK_1.DOC
Technische Universität München
upstream
Can DB2 Cope?
®
SORT(33) 373.32
HSJOIN(35) 373.32
NLJOIN(37) 75.02
HSJOIN(43) 298.3
IXSCAN(41) 50.01
NLJOIN(45) 149.15
IDX_GUIDE_PRE
XMARK_1.DOC
IXSCAN(53) 50.01
IDX_GUIDE_PRE_VALUE
XMARK_1.DOC
Prof. Dr. Torsten Grust
DB2 Version 9
NLJOIN(
for Linux, UNIX, and Windows
NLJOIN(47) 75.02
IXSCAN(51) 50.01
NLJOIN(
IXSCAN(49) 50.01
IDX_GUIDE_PRE_PRE_PLUS_SIZE
IDX_VALUE_KIND_PRE_SIZE
XMARK_1.DOC
XMARK_1.DOC
18
IXSCAN(63) 50.0
IDX_GUIDE_PRE_VAL
XMARK_1.DOC
Technische Universität München
upstream
Can DB2 Cope?
®
SORT(33) 373.32
*) Indexes avised by DB2 itself
HSJOIN(35) 373.32
NLJOIN(37) 75.02
HSJOIN(43) 298.3
IXSCAN(41) 50.01
NLJOIN(45) 149.15
IDX_GUIDE_PRE
XMARK_1.DOC
IXSCAN(53) 50.01
*)
IDX_GUIDE_PRE_VALUE
XMARK_1.DOC
DB2 Version 9
for Linux, UNIX, and Windows
NLJOIN(47) 75.02
NLJOIN(
IXSCAN(51) 50.01
IDX_GUIDE_PRE_PRE_PLUS_SIZE
XMARK_1.DOC
Prof. Dr. Torsten Grust
NLJOIN(
IXSCAN(49) 50.01
*)
IDX_VALUE_KIND_PRE_SIZE
XMARK_1.DOC
18
IXSCAN(63) 50.0
*)
IDX_GUIDE_PRE_VAL
XMARK_1.DOC
Technische Universität München
Order Indifference
let $warning := <p>Do <em>not</em> press button,
computer will <em>explode!</em></p>
return fn:distinct-doc-order( for $x in $warning//*
return $x/text()
)
Prof. Dr. Torsten Grust
19
Technische Universität München
Order Indifference
•
Order in XML and XQuery processing is essential:
let $warning := <p>Do <em>not</em> press button,
computer will <em>explode!</em></p>
return fn:distinct-doc-order( for $x in $warning//*
return $x/text()
)
Prof. Dr. Torsten Grust
19
Technische Universität München
Order Indifference
•
Order in XML and XQuery processing is essential:
let $warning := <p>Do <em>not</em> press button,
computer will <em>explode!</em></p>
return fn:distinct-doc-order( for $x in $warning//*
return $x/text()
)
Do not press button, computer will explode!
Prof. Dr. Torsten Grust
19
Technische Universität München
Order Indifference
•
Order in XML and XQuery processing is essential:
let $warning := <p>Do <em>not</em> press button,
computer will <em>explode!</em></p>
return for $x in $warning//*
return $x/text()
Do press button, computer will not explode!
Prof. Dr. Torsten Grust
19
Technische Universität München
Order Indifference
•
Order in XML and XQuery processing is essential:
let $warning := <p>Do <em>not</em> press button,
computer will <em>explode!</em></p>
return for $x in $warning//*
return $x/text()
Do press button, computer will not explode!
•
Order-awareness thus is often
deeply wired into XQuery processors.
•
Supporting constructs like
unordered { e1 } is challenging.
Prof. Dr. Torsten Grust
19
Technische Universität München
Order Indifference
•
Order in XML and XQuery processing is essential:
let $warning := <p>Do <em>not</em> press button,
computer will <em>explode!</em></p>
return for $x in $warning//*
return $x/text()
Do press button, computer will not explode!
•
Order-awareness thus is often
deeply wired into XQuery processors.
SERIALIZE
¶ (item, pos)
•
ROW# (pos:<pos1>)
Supporting constructs like
unordered { e1 } is challenging.
¶ (pos1, item)
FRAG_UNION
FRAGs
|X| (iter = iter1)
ROOTS
ELEM (iter1, item:<iter1, item><iter1, pos, item>)
ELEM_TAG
¶ (iter1, item, pos)
@ (item), val: item
ROW# (pos:<pos1>/iter1)
U
@ (pos1), val: #1
Prof. Dr. Torsten Grust
19
@ (pos1), val: #2
Technische Universität München
FRAG_UNION
FRAG_UNION
FRAGs
¶ (iter1, item)
ROOTS
¶ (iter1, item)
FRAGs
ROOTS
TEXT (item:<item2>)
Order Indifference
•
Order in XML and XQuery processing is essential:
let $warning := <p>Do <em>not</em> press button,
computer will <em>explode!</em></p>
return for $x in $warning//*
return $x/text()
Do press button, computer will not explode!
•
Order-awareness thus is often
deeply wired into XQuery processors.
π iter,item
SERIALIZE
¶ (item, pos)
•
ROW# (pos:<pos1>)
Supporting constructs like
unordered { e1 } is challenging.
¶ (pos1, item)
FRAG_UNION
FRAGs
|X| (iter = iter1)
ROOTS
ELEM (iter1, item:<iter1, item><iter1, pos, item>)
ELEM_TAG
¶ (iter1, item, pos)
@ (item), val: item
ROW# (pos:<pos1>/iter1)
U
@ (pos1), val: #1
Prof. Dr. Torsten Grust
19
@ (pos1), val: #2
Technische Universität München
FRAG_UNION
FRAG_UNION
FRAGs
¶ (iter1, item)
ROOTS
¶ (iter1, item)
FRAGs
ROOTS
TEXT (item:<item2>)
More Purely Relational XQuery
Prof. Dr. Torsten Grust
20
Technische Universität München
More Purely Relational XQuery
•
Declarative XQuery debugging with “time travel”.
•
Users observe expressions
and may traverse iterations
forward/backward.
Prof. Dr. Torsten Grust
20
Technische Universität München
More Purely Relational XQuery
•
Declarative XQuery debugging with “time travel”.
•
Users observe expressions
and may traverse iterations
forward/backward.
Prof. Dr. Torsten Grust
20
iter pos item
1
2
3
4
…
Technische Universität München
More Purely Relational XQuery
•
•
Declarative XQuery debugging with “time travel”.
•
Users observe expressions
and may traverse iterations
forward/backward.
iter pos item
1
2
3
4
…
Dependable cardinality estimates at XQuery
subexpression level (# items per iteration and
overall contribution).
for $x in $doc//x
return if (e1) then e2 else e3
Prof. Dr. Torsten Grust
20
Technische Universität München
More Purely Relational XQuery
•
•
Declarative XQuery debugging with “time travel”.
•
iter pos item
Users observe expressions
and may traverse iterations
forward/backward.
1
2
3
4
…
Dependable cardinality estimates at XQuery
subexpression level (# items per iteration and
overall contribution).
for $x in $doc//x
4.3
return if (e1) then e2 else e3
8
Prof. Dr. Torsten Grust
20
2
Technische Universität München
Atomization Indexes
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
Prof. Dr. Torsten Grust
21
Technische Universität München
Atomization Indexes
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
Prof. Dr. Torsten Grust
21
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
b
day
c
Same
d
different
shirt
Prof. Dr. Torsten Grust
21
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
node
a
b
c
atom1
atom2
atom3
b
day
c
Same
atom4
d
different
shirt
d
Prof. Dr. Torsten Grust
21
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
node
a
b
c
atom1
atom2
atom3
b
day
c
Same
atom4
d
different
shirt
d
Prof. Dr. Torsten Grust
21
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
node
a
b
c
atom1
atom2
atom3
b
day
c
Same
atom4
d
different
shirt
Same
d
shirt
different
day
Prof. Dr. Torsten Grust
21
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
node
a
b
c
d
atom1
atom2
atom3
b
day
c
Same
atom4
d
different
shirt
Same
shirt
shirt
different
day
Prof. Dr. Torsten Grust
21
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
node atom1
atom2
a Same
shirt
b Same
shirt
c Same
shirt
Same
d shirt
shirt
different
day
Prof. Dr. Torsten Grust
b
day
c
Same
atom3
atom4
different day
different
21
d
different
shirt
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
node atom1
atom2
a Same
shirt
b Same
shirt
c Same
shirt
Same
d shirt
shirt
different
day
Prof. Dr. Torsten Grust
b
day
c
Same
atom3
atom4
different day
different
21
d
different
shirt
dict
atom
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
node atom1
atom2
a Sameδ1
shirt
b Sameδ1
shirt
c Sameδ1
shirt
Sameδ1
d shirt
shirt
different
day
Prof. Dr. Torsten Grust
b
day
c
Same
atom3
atom4
different day
different
21
d
different
shirt
dict
atom
δ1 Same
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
b
c
Same
d
different
shirt
node atom1 atom2 atom3 atom4
a
δ1
δ2
δ3
δ4
b
δ1
δ2
δ3
c
δ1
δ2
δ1
d
δ2
δ2
δ3
δ4
Prof. Dr. Torsten Grust
day
dict
δ1
δ2
δ3
δ4
21
atom
Same
shirt
different
day
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
b
c
Same
d
different
shirt
node atom1 atom2 atom3 atom4
a
δ51
δ2
δ3
δ4
b
δ51
δ2
δ3
c
δ51
δ2
δ1
d
δ2
δ2
δ3
δ4
Prof. Dr. Torsten Grust
day
dict
δ1
δ2
δ3
δ4
δ5
21
atom
Same
shirt
different
day
Same
shirt
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
b
c
Same
d
different
shirt
node atom1 atom2 atom3 atom4
a
δ651
δ2
δ3
δ4
b
δ651
δ2
δ3
c
δ51
δ2
δ1
d
δ2
δ2
δ3
δ4
Prof. Dr. Torsten Grust
day
dict
δ1
δ2
δ3
δ4
δ5
δ6
21
atom
Same
shirt
different
day
δ1
δ2
δ5
δ3
Technische Universität München
Atomization Indexes
a
<a>
<b>
<c>Same<d>shirt</d></c>
different
</b>
day
</a>
b
c
Same
d
different
shirt
node atom1 atom2
a
δ6
δ4
b
δ6
c
δ5
δ1
d
δ2
δ2
δ3
δ4
Prof. Dr. Torsten Grust
day
dict
δ1
δ2
δ3
δ4
δ5
δ6
21
atom
Same
shirt
different
day
δ1
δ2
δ5
δ3
Technische Universität München
Relational Encodings for XML
•
XML documents represent ordered, unranked trees
(nodes of 7 kinds).
•
Relational XQuery processing calls for an adequate
relational encoding of such trees:
1. Schema-oblivious
(XQuery constructs arbitrary XML fragments),
2. accessible for the query processor and index support
( 107 nodes in typical GB-range XML instances).
Prof. Dr. Torsten Grust
22
Technische Universität München
Relational Encodings for XML
•
XML documents represent ordered, unranked trees
(nodes of 7 kinds).
•
Relational XQuery processing calls for an adequate
relational encoding of such trees:
1. Schema-oblivious
(XQuery constructs arbitrary XML fragments),
2. accessible for the query processor and index support
( 107 nodes in typical GB-range XML instances).
doc
Prof. Dr. Torsten Grust
✘
X
1
“<a>b<c/>d<e/><f>g</f></a>”
2
…
22
Technische Universität München
Pre/Postorder Encoding
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
Prof. Dr. Torsten Grust
23
Technische Universität München
Pre/Postorder Encoding
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
Prof. Dr. Torsten Grust
b
f
h
c
d
23
g
e
i
j
Technische Universität München
Pre/Postorder Encoding
0a9
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
Prof. Dr. Torsten Grust
1b3
b
5f8
f
7h7
h
2c2
c
3d0
d
23
6g4
g
4e1
e
8i5
i
j
9j6
Technische Universität München
Pre/Postorder Encoding
0a9
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
1b3
b
5f8
f
7h7
h
2c2
c
3d0
d
6g4
g
4e1
e
8i5
i
j
9j6
pre post node
a
0
9
1
3
b
2
2
c
3
0
d
4
1
e
5
8
f
6
4
g
7
7
h
8
5
i
9
6
j
Prof. Dr. Torsten Grust
23
Technische Universität München
Pre/Postorder Encoding
0a9
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
1b3
b
7h7
h
2c2
c
23
6g4
g
4e1
e
3d0
d
document order
Prof. Dr. Torsten Grust
5f8
f
8i5
i
j
9j6
pre post node
a
0
9
1
3
b
2
2
c
3
0
d
4
1
e
5
8
f
6
4
g
7
7
h
8
5
i
9
6
j
Technische Universität München
Pre/Postorder Encoding
0a9
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
1b3
b
7h7
h
2c2
c
23
6g4
g
4e1
e
3d0
d
document order
Prof. Dr. Torsten Grust
5f8
f
8i5
i
j
9j6
pre post node kind
a
0
9
elem
1
3
b
elem
2
2
c
elem
3
0
d
elem
4
1
e
text
5
8
f
elem
6
4
g comm
7
7
h
elem
8
5
i
elem
9
6
j
elem
Technische Universität München
Pre/Postorder Encoding
0a9
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
1b3
b
7h7
h
2c2
c
23
6g4
g
4e1
e
3d0
d
document order
Prof. Dr. Torsten Grust
5f8
f
8i5
i
j
9j6
pre post node kind size level
a
0
0
9
elem
9
1
3
b
elem
3
1
2
2
c
elem
2
2
3
0
d
elem
0
3
4
1
e
text
0
3
5
8
f
elem
4
1
6
4
g comm 0
2
7
7
h
elem
2
2
8
5
i
elem
0
3
9
6
j
elem
0
3
Technische Universität München
Pre/Postorder Encoding
0a9
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
1b3
b
7h7
h
2c2
c
6g4
g
4e1
e
3d0
d
document order
Prof. Dr. Torsten Grust
5f8
f
23
pre
0
1
2
3
4
5
6
7
8
9
8i5
i
node
a
b
c
d
e
f
g
h
i
j
j
9j6
kind size level
0
elem
9
elem
3
1
elem
2
2
elem
0
3
text
0
3
elem
4
1
comm 0
2
elem
2
2
elem
0
3
elem
0
3
Technische Universität München
Pre/Postorder Encoding
0a9
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
1b3
b
6g4
g
4e1
e
3d0
d
document order
5
Prof. Dr. Torsten Grust
7h7
h
2c2
c
post
5
5f8
f
pre
23
pre
0
1
2
3
4
5
6
7
8
9
8i5
i
node
a
b
c
d
e
f
g
h
i
j
j
9j6
kind size level
0
elem
9
elem
3
1
elem
2
2
elem
0
3
text
0
3
elem
4
1
comm 0
2
elem
2
2
elem
0
3
elem
0
3
Technische Universität München
Pre/Postorder Encoding
0a9
a
<a>
<b><c><d/>e</c></b>
<f><!--g-->
<h><i/><j/></h>
</f>
</a>
1b3
b
f
4e1
e
document order
j
b c
d
Prof. Dr. Torsten Grust
g
e
5
6g4
g
3d0
d
h
5
7h7
h
2c2
c
post
a
5f8
f
i
pre
23
pre
0
1
2
3
4
5
6
7
8
9
8i5
i
node
a
b
c
d
e
f
g
h
i
j
j
9j6
kind size level
0
elem
9
elem
3
1
elem
2
2
elem
0
3
text
0
3
elem
4
1
comm 0
2
elem
2
2
elem
0
3
elem
0
3
Technische Universität München
B-Trees Accelerate
XPath Location Steps
•
The XPath axes ancestor, descendant, preceding,
following partition any XML document:
Prof. Dr. Torsten Grust
24
Technische Universität München
B-Trees Accelerate
XPath Location Steps
•
The XPath axes ancestor, descendant, preceding,
following partition any XML document:
a
f
h
j
5
b
Prof. Dr. Torsten Grust
g
c
i
e
d
5
24
Technische Universität München
B-Trees Accelerate
XPath Location Steps
•
The XPath axes ancestor, descendant, preceding,
following partition any XML document:
a
f
h
j
5
b
Prof. Dr. Torsten Grust
g
c
i
e
d
5
24
Technische Universität München
B-Trees Accelerate
XPath Location Steps
•
The XPath axes ancestor, descendant, preceding,
following partition any XML document:
a
f
h
j
5
b
Prof. Dr. Torsten Grust
g
c
i
e
d
5
24
Technische Universität München
B-Trees Accelerate
XPath Location Steps
•
The XPath axes ancestor, descendant, preceding,
following partition any XML document:
a
f
h
j
5
b
Prof. Dr. Torsten Grust
g
c
i
e
d
5
24
Technische Universität München
B-Trees Accelerate
XPath Location Steps
•
The XPath axes ancestor, descendant, preceding,
following partition any XML document:
a
f
h
j
5
b
Prof. Dr. Torsten Grust
g
c
i
e
d
5
24
Technische Universität München
B-Trees Accelerate
XPath Location Steps
•
The XPath axes ancestor, descendant, preceding,
following partition any XML document:
B-Tree
a
f
h
j
5
b
Prof. Dr. Torsten Grust
g
c
i
e
d
5
24
pre post node
a
0
9
1
3
b
2
2
c
3
0
d
4
1
e
5
8
f
6
4
g
7
7
h
8
5
i
9
6
j
Technische Universität München
B-Trees Accelerate
XPath Location Steps
•
The XPath axes ancestor, descendant, preceding,
following partition any XML document:
B-Tree
a
f
h
j
5
b
Prof. Dr. Torsten Grust
g
c
i
e
d
5
24
pre post node
a
0
9
1
3
b
2
2
c
3
0
d
4
1
e
5
8
f
6
4
g
7
7
h
8
5
i
9
6
j
Technische Universität München
Partitioned B-Trees
Prof. Dr. Torsten Grust
25
Technische Universität München
Partitioned B-Trees
•
Use low selectivity key prefixes to partition the
XML nodes in a B-Tree according to various criteria:
Prof. Dr. Torsten Grust
25
Technische Universität München
Partitioned B-Trees
•
Use low selectivity key prefixes to partition the
XML nodes in a B-Tree according to various criteria:
1. level (support XPath child axis)
Prof. Dr. Torsten Grust
25
B-Tree
Technische Universität München
Partitioned B-Trees
•
Use low selectivity key prefixes to partition the
XML nodes in a B-Tree according to various criteria:
1. level (support XPath child axis)
B-Tree
level = 1 level = 2
Prof. Dr. Torsten Grust
25
level = 3
Technische Universität München
Partitioned B-Trees
•
Use low selectivity key prefixes to partition the
XML nodes in a B-Tree according to various criteria:
1. level (support XPath child axis)
B-Tree
2. element tag name
3. path to node (“Node GPS”)
level = 1 level = 2
•
level = 3
Given a Pathfinder-generated workload,
the index advisor of IBM DB2 V9 proposes
such partitionings automatically.
Prof. Dr. Torsten Grust
25
Technische Universität München
Injecting More Tree Awareness
•
In XQuery, an XPath location step originates in a
sequence of context nodes.
Duplicate nodes, out of order results
(XPath semantics!), wasted work.
Prof. Dr. Torsten Grust
26
Technische Universität München
Injecting More Tree Awareness
•
In XQuery, an XPath location step originates in a
sequence of context nodes.
Duplicate nodes, out of order results
(XPath semantics!), wasted work.
c3
c4
c1
c2
Prof. Dr. Torsten Grust
26
Technische Universität München
Injecting More Tree Awareness
•
In XQuery, an XPath location step originates in a
sequence of context nodes.
Duplicate nodes, out of order results
(XPath semantics!), wasted work.
c3
c4
c1
c2
Prof. Dr. Torsten Grust
26
Technische Universität München
Injecting More Tree Awareness
•
In XQuery, an XPath location step originates in a
sequence of context nodes.
Duplicate nodes, out of order results
(XPath semantics!), wasted work.
c3
c4
c1
c2
Prof. Dr. Torsten Grust
26
Technische Universität München
Injecting More Tree Awareness
•
In XQuery, an XPath location step originates in a
sequence of context nodes.
Duplicate nodes, out of order results
(XPath semantics!), wasted work.
c3
c4
c1
c2
Prof. Dr. Torsten Grust
26
Technische Universität München
Injecting More Tree Awareness
•
In XQuery, an XPath location step originates in a
sequence of context nodes.
Duplicate nodes, out of order results
(XPath semantics!), wasted work.
c3
c4
c1
c2
Prof. Dr. Torsten Grust
26
Technische Universität München
Injecting More Tree Awareness
•
In XQuery, an XPath location step originates in a
sequence of context nodes.
Duplicate nodes, out of order results
(XPath semantics!), wasted work.
Pathfinder XQuery Compiler
c3
c4
XPath/XQuery Semantics
XML Encoding
c1
c2
Prof. Dr. Torsten Grust
26
Technische Universität München
Injecting More Tree Awareness
•
In XQuery, an XPath location step originates in a
sequence of context nodes.
Duplicate nodes, out of order results
(XPath semantics!), wasted work.
Pathfinder XQuery Compiler
c3
c4
staircase join
XPath/XQuery Semantics
XML Encoding
c1
c2
Prof. Dr. Torsten Grust
26
Technische Universität München
Staircase Join: Context Pruning
•
XPath following axis:
c3
c4
c1
c2
Prof. Dr. Torsten Grust
27
Technische Universität München
Staircase Join: Context Pruning
•
XPath following axis:
c1
c2
Prof. Dr. Torsten Grust
27
Technische Universität München
Staircase Join: Context Pruning
•
XPath following axis:
c1
c1
c2
c2
Prof. Dr. Torsten Grust
27
Technische Universität München
Staircase Join: Context Pruning
•
XPath following axis:
c1
c2
c2
•
Index scan yields duplicate free result in document order.
Prof. Dr. Torsten Grust
27
Technische Universität München
•
Staircase Join: Skipping
XPath descendant axis:
c3
c4
c1
c2
Prof. Dr. Torsten Grust
28
Technische Universität München
•
Staircase Join: Skipping
XPath descendant axis:
c3
c1
Prof. Dr. Torsten Grust
28
Technische Universität München
•
Staircase Join: Skipping
XPath descendant axis:
c3
c1
scan
Prof. Dr. Torsten Grust
28
Technische Universität München
•
Staircase Join: Skipping
XPath descendant axis:
c3
c1
v
v
c1
scan
Prof. Dr. Torsten Grust
28
Technische Universität München
•
Staircase Join: Skipping
XPath descendant axis:
c3
c1
v
v
c1
scan
•
skip
scan
Index scan yields duplicate free result in document order.
Prof. Dr. Torsten Grust
28
Technische Universität München
MonetDB/XQuery
Prof. Dr. Torsten Grust
29
Technische Universität München
MonetDB/XQuery
•
MonetDB: Extensible relational database kernel,
optimized for query processing close to the CPU.
•
Full vertical fragmentation
(column store).
Prof. Dr. Torsten Grust
29
pre post node
a
0
9
1
3
b
2
2
c
3
0
d
4
1
e
5
8
f
6
4
g
7
7
h
8
5
i
9
6
j
Technische Universität München
MonetDB/XQuery
•
MonetDB: Extensible relational database kernel,
optimized for query processing close to the CPU.
•
Full vertical fragmentation
(column store).
Prof. Dr. Torsten Grust
29
pre post
0
9
1
3
2
2
3
0
4
1
5
8
6
4
7
7
8
5
9
6
pre node
a
0
1
b
2
c
3
d
4
e
5
f
6
g
7
h
8
i
9
j
Technische Universität München
MonetDB/XQuery
•
MonetDB: Extensible relational database kernel,
optimized for query processing close to the CPU.
•
Full vertical fragmentation
(column store).
Prof. Dr. Torsten Grust
29
pre post
0
9
1
3
2
2
3
0
4
1
5
8
6
4
7
7
8
5
9
6
pre node
a
0
1
b
2
c
3
d
4
e
5
f
6
g
7
h
8
i
9
j
Technische Universität München
MonetDB/XQuery
•
MonetDB: Extensible relational database kernel,
optimized for query processing close to the CPU.
•
•
Full vertical fragmentation
(column store).
Pathfinder + MonetDB = MonetDB/XQuery.
•
•
Implementation of XQuery 1.0.
Evaluates queries against GB-range
XML instances in interactive time.
pre post
0
9
1
3
2
2
3
0
4
1
5
8
6
4
7
7
8
5
9
6
pre node
a
0
1
b
2
c
3
d
4
e
5
f
6
g
7
h
8
i
9
j
+ XQuery Update, XQuery RPC ( execute at uri { e } ),
support for multi-dimensional XML, ...
Prof. Dr. Torsten Grust
29
Technische Universität München
“DB2’s XML support
doesn’t beat its relational self”.
Prof. Dr. Torsten Grust
30
Technische Universität München
“DB2’s XML support
doesn’t beat its relational self”.
•
Pathfinder builds on 30+ years of development of
relational database technology.
Prof. Dr. Torsten Grust
30
Technische Universität München
“DB2’s XML support
doesn’t beat its relational self”.
•
Pathfinder builds on 30+ years of development of
relational database technology.
•
Outperforms the built-in DB2 V9 pureXML® engine,
especially for large XML input instances.
Prof. Dr. Torsten Grust
30
Technische Universität München
“DB2’s XML support
doesn’t beat its relational self”.
•
Pathfinder builds on 30+ years of development of
relational database technology.
•
•
Outperforms the built-in DB2 V9 pureXML® engine,
especially for large XML input instances.
Covers other data-intensive languages:
1. SQL/XML
Prof. Dr. Torsten Grust
30
Technische Universität München
“DB2’s XML support
doesn’t beat its relational self”.
•
Pathfinder builds on 30+ years of development of
relational database technology.
•
•
Outperforms the built-in DB2 V9 pureXML® engine,
especially for large XML input instances.
Covers other data-intensive languages:
1. SQL/XML
2. LINQ + other “Nested Loop” languages.
Prof. Dr. Torsten Grust
30
Technische Universität München
[email protected]
http://www-db.in.tum.de/~grust/
Prof. Dr. Torsten Grust
31
Technische Universität München
Prof. Dr. Torsten Grust
32
Technische Universität München
Herunterladen