presentation slides

Werbung
Relational Algebra: Mother Tongue
XQuery: Fluent
How to Compile XQuery Expressions into a Relational Algebra
Torsten Grust
Jens Teubner
University of Konstanz
Dept. of Computer and Information Science
Konstanz, Germany
June 21, 2004
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Motivation
Value Representation
Relational FLWORs
Relational XPath Back-Ends
XQuery is More than XPath
XQuery FLWOR Expressions
Relational XPath Back-Ends
Relational databases can efficiently back XPath evaluation.
Encode XML structure using a numbering scheme.
“XPath accelerator” with pre/post tuples.
Re-use existing database technology.
Storage management, index structures, query optimization.
Support XPath evaluation through tailor-made operators.
“Staircase join” exploits properties of the pre/post encoding.
“Holistic” join algorithms to process entire XPath expressions.
➠ Compile XQuery expressions into Relational Algebra.
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Motivation
Value Representation
Relational FLWORs
Relational XPath Back-Ends
XQuery is More than XPath
XQuery FLWOR Expressions
XQuery is More than XPath
Important XQuery features are not yet covered.
Iteration in XQuery FLOWR expressions.
Contrary to set-oriented processing in relational databases?
Construction of transient XML tree nodes.
Document table must be “extended” during query processing.
This talk addresses the first issue.
(The paper additionally covers the second.)
The ideas are orthogonal to efficient XPath evaluation.
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Motivation
Value Representation
Relational FLWORs
Relational XPath Back-Ends
XQuery is More than XPath
XQuery FLWOR Expressions
XQuery FLWOR Expressions
XQuery is built around a looping primitive, the for construct.
for $v in (x1 , x2 ,..., xn ) return e
≡
x
x
( e[ 1/$v ], e[ 2/$v ], . . . , e[xn/$v ] )
$v is successively bound to the values of xi .
The return expression e is evaluated for each binding.
XQuery is a functional-style language.
➠ It is sound to evaluate e for all bindings in parallel.
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Motivation
Value Representation
Relational FLWORs
The XQuery Data Model
Loop Lifting for Constant Subexpressions
The XQuery Data Model
XQuery’s basic data type is the sequence.
Any expression evaluates to an ordered sequence of items.
Sequences are always flat.
We may represent a sequence using a two-column table.
pos
1
2
3
item
"a"
"b"
"c"
Encode order in pos, the atom value in item.
(For now, we assume a polymorphic item type.)
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Motivation
Value Representation
Relational FLWORs
The XQuery Data Model
Loop Lifting for Constant Subexpressions
Loop Lifting for Constant Subexpressions
We extend our sequence encoding by the column iter that
accounts for the independent iterations.
iter
1
1
2
2
3
3
pos
1
2
1
2
1
2
item
10
20
10
20
10
20
Example:
for $v0 in (1,2,3) return 10
→ 10 appears in 3 iterations.
for $v0 in (1,2,3) return (10, 20)
→ (10, 20) appears in 3 iterations.
We refer to this as the loop lifted representation of a sequence.
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Motivation
Value Representation
Relational FLWORs
Deriving a Loop Lifted Value Representation
Nested XQuery Expressions
Mapping Between Scopes
Deriving a Loop Lifted Value Representation
We derive a compilation procedure that solely operates on
loop lifted sequences.
Example: Body of for $v in (10, 20, 30) return $v
iter
1
2
3
pos
1
1
1
item
10
20
30
Start with representation of (10, 20, 30).
Generate a new iteration for each value.
Each value forms a singleton sequence.
(pos = 1 for each tuple)
The row number operator % in our algebra generates unique iter s.
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Motivation
Value Representation
Relational FLWORs
Deriving a Loop Lifted Value Representation
Nested XQuery Expressions
Mapping Between Scopes
Nested XQuery Expressions
XQuery allows arbitrary expression nesting.
Example:

 for$v0 in (1,2) return
for $v0·0 in (10,20) return
s
 s0
s0·0 { ($v0 ,$v0·0 )
We need to map value representations between scopes.
Variables defined in surrounding scopes.
The expression result of the for expression.
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Motivation
Value Representation
Relational FLWORs
Deriving a Loop Lifted Value Representation
Nested XQuery Expressions
Mapping Between Scopes
Mapping Between Scopes
We capture expression nesting with help of a relation map.
Example: (previous slide)

 for$v0 in (1,2) return
for $v0·0 in (10,20) return
s
 s0
s0·0 { ($v0 ,$v0·0 )
outer
1
1
2
2
inner
1
2
3
4
map(0,0·0)
The relation lists corresponding iter values
of a for body and its surrounding scope.
Value mapping then renders into a join
with this relation.
We handle arbitrary nesting this way.
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Features and Requirements
Evaluation & Summary
Translatable Subset of XQuery Core
A Relational Algebra to Evaluate XQuery
Translatable Subset of XQuery Core
We support an XQuery subset that suffices to handle the
XMark benchmark set.
literals
arithmetics
builtin functions
variable bindings
iteration
conditionals
sequence construction
function calls
element construction
XPath steps
42, "foo", (), . . .
e1 + e2 , e1 - e2 , . . .
fn:sum(e), fn:count(e), fn:doc(uri)
let $v := e1 return e2
for $v at $p in e1 return e2
if p then e1 else e2
e1 , e2
f (e1 , e2 , ..., en )
element e1 { e2 }
e/α::ν
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Features and Requirements
Evaluation & Summary
Translatable Subset of XQuery Core
A Relational Algebra to Evaluate XQuery
A Relational Algebra to Evaluate XQuery
We generate a (almost) standard Relational Algebra.
πa1 :b1 ,...,an :bn
σa
.
∪
×
−
o
na=b
%b:ha1 ,...,an i/p
α,ν
ε
~b:ha1 ,...,an i
ab
count, . . .
projection/renaming
selection
disjoint union
Cartesian product
difference
equi-join
row numbering
XPath axis join
element construction
n-ary arithmetic/comparison operator ∗
literal table
count and other aggregation functions
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
Features and Requirements
Evaluation & Summary
XMark on DB2
Summary
XMark on DB2
We implemented our translation for XMark queries in SQL.
1.1 MB
110 MB
1.1 GB
XMark 1
XMark 2
XMark 6
XMark 7
0.001
0.01
0.1
1.0
execution time [s]
T. Grust, J. Teubner
10
100
Relational Algebra: Mother Tongue—XQuery: Fluent
Features and Requirements
Evaluation & Summary
XMark on DB2
Summary
Summary
We propose a fully relational evaluation for XQuery.
A compilation procedure translates XQuery expressions into
Relational Algebra.
We can handle FLWOR expression, element construction and
others.
We can deal with arbitrary expression nesting.
Experiments with an SQL based DBMS are promising.
This work is part of our ongoing project “Pathfinder.”
Resulting algebra expressions, however, can get large. . .
T. Grust, J. Teubner
Relational Algebra: Mother Tongue—XQuery: Fluent
¶ (iter:outer,pos:pos1,item)
ROW# (pos1:<iter,pos>/outer)
|X| (iter=inner)
¶ (iter,item:pre)
BOX (item)
¶ (outer:iter,inner)
×
ROW# (inner:<iter,pos>)
TBL <("pos",int)>
SEL (res)
BOX (item)
1
ROW# (pos:<item>/iter)
= res:(level,zero)
/|child::open_auction
×
ELEM
doc
U
UNBOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
×
¶ (iter)
×
TBL <("pos",int)>
¶ (iter1:iter)
/|child::open_auction
UNBOX (item)
¶ (iter,item)
SEL (item)
U
doc
¶ (iter,pos,item:res)
BOX (item)
= res:(item,item1)
/|child::open_auctions
UNBOX (item)
¶ (iter,item)
U
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
BOX (item)
×
ROW# (pos:<item>/iter)
TBL <("pos",int)>
/|child::site
UNBOX (item)
¶ (iter,item)
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
1,@14
¶ (iter,item)
doc
UNBOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (iter,item)
ROW# (inner:<iter,pos>)
BOX (item)
BOX (item)
ROW# (pos:<item>/iter)
ROW# (pos:<item>/iter)
/|child::open_auction
UNBOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
U
×
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (iter,item)
BOX (item)
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (iter,item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (iter,item)
doc
×
1,@14
TBL <("iter",int)>
0
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
TBL <("iter",int)>
1,@14
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
U
¶ (iter,item)
U
1
ROW# (inner:<iter,pos>)
¶ (iter,item)
0
doc
BOX (item)
/|child::open_auctions
U
UNBOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (iter,item)
BOX (item)
BOX (item)
ROW# (pos:<item>/iter)
ROW# (pos:<item>/iter)
/|child::site
/|child::site
U
doc
UNBOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (iter,item)
×
TBL <("iter",int)>
TBL <("pos",int),("item",node)>
1,@14
TBL <("iter",int)>
0
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
U
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
U
doc
×
BOX (item)
1,@14
U
¶ (iter,item)
ROW# (pos:<item>/iter)
UNBOX (item)
1,@14
UNBOX (item)
BOX (item)
doc
TBL <("pos",int),("item",node)>
/|child::open_auction
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
/|child::open_auctions
¶ (iter,item)
BOX (item)
0
BOX (item)
ROW# (pos:<item>/iter)
U
doc
BOX (item)
TBL <("pos",int),("item",node)>
1,@14
U
doc
×
TBL <("iter",int)>
BOX (item)
ROW# (pos:<item>/iter)
UNBOX (item)
TBL <("pos",int),("item",node)>
UNBOX (item)
¶ (iter,item)
/|child::open_auction
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (iter:inner,item)
ROW# (pos:<item>/iter)
¶ (iter,item)
U
/|child::site
doc
TBL <("pos",int)>
ROW# (inner:<iter,pos>)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
BOX (item)
BOX (item)
×
0
BOX (item)
TBL <("pos",int),("item",node)>
U
¶ (iter,item)
ROW# (pos:<item>/iter)
×
1
/|child::site
U
UNBOX (item)
BOX (item)
ROW# (pos:<item>/iter)
¶ (iter:inner,item)
UNBOX (item)
doc
ROW# (inner:<iter,pos>)
×
/|child::open_auctions
doc
/|child::open_auctions
UNBOX (item)
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
ROW# (pos:<item>/iter)
1,1
U
U
doc
TBL <("pos",int),("item",int)>
/|child::bidder
TBL <("pos",int)>
ROW# (pos:<item>/iter)
UNBOX (item)
UNBOX (item)
¶ (iter,item)
doc
BOX (item)
¶ (iter:inner,item)
ROW# (pos:<item>/iter)
UNBOX (item)
ROW# (inner:<iter,pos>)
U
¶ (iter,item)
/|child::bidder
ROW# (pos:<item>/iter)
UNBOX (item)
U
¶ (iter,item)
¶ (iter:inner,item)
¶ (iter,item)
×
UNBOX (item)
BOX (item)
BOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
BOX (item)
/|child::site
doc
1
TBL <("pos",int)>
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
BOX (item)
BOX (item)
ROW# (inner:<iter,pos>)
U
doc
UNBOX (item)
ROW# (pos:<item>/iter)
×
TBL <("pos",int)>
BOX (item)
/|child::site
TBL <("pos",int),("item",node)>
¶ (outer:iter,inner)
ROW# (pos:<item>/iter)
BOX (item)
¶ (iter,item)
ROW# (item:<inner>/outer)
/|child::bidder
¶ (iter,item)
/|child::open_auctions
UNBOX (item)
U
BOX (item)
ROW# (pos:<item>/iter)
UNBOX (item)
ROW# (pos:<item>/iter)
UNBOX (item)
×
¶ (iter)
U
ROW# (pos:<item>/iter)
/|child::open_auctions
¶ (iter1:iter,item1:item)
BOX (item)
BOX (item)
ROW# (pos:<item>/iter)
doc
×
¶ (iter:inner,item)
/|child::open_auction
doc
/|child::open_auction
1
/|child::bidder
1,@14
ROW# (pos:<item>/iter)
|X| (iter=iter1)
TBL <("pos",int)>
1,1
BOX (item)
TBL <("pos",int),("item",node)>
BOX (item)
¶ (iter:inner,item)
ROW# (inner:<iter,pos>)
×
0
ROW# (inner:<iter,pos>)
ROW# (inner:<iter,pos>)
U
¶ (iter,item)
1
1
/|child::open_auction
U
¶ (iter:inner,item)
UNBOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
1
¶ (iter:inner,item)
1
¶ (iter,item)
¶ (iter:inner,item)
TBL <("pos",int)>
TBL <("pos",int)>
×
TBL <("pos",int)>
BOX (item)
0
doc
¶ (iter:inner,item)
doc
U
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
TBL <("iter",int)>
UNBOX (item1)
×
BOX (item)
TBL <("pos",int),("item",int)>
ROW# (pos:<item>/iter)
U
ROW# (inner:<iter,pos>)
UNBOX (item)
TBL <("pos",int)>
×
= res:(item,item1)
BOX (item)
ROW# (pos:<item>/iter)
/|child::bidder
1
¶ (iter,item)
×
1
×
TBL <("iter",int)>
¶ (outer:iter,inner)
TBL <("pos",int)>
UNBOX (item)
×
¶ (iter)
BOX (item)
TBL <("pos",int),("item",node)>
UNBOX (item)
¶ (iter,item)
BOX (item)
ROW# (inner:<iter,pos>)
BOX (item)
0
UNBOX (item)
ROW# (item:<inner>/outer)
×
TBL <("iter",int)>
|X| (iter=iter1)
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (iter,pos,item:res)
/|child::increase
U
¶ (iter,item)
doc
BOX (res)
¶ (iter1:iter,item1:item)
¶ (iter:inner,item)
1
U
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
ROW# (pos:<item>/iter)
UNBOX (item)
UNBOX (item)
U
¶ (iter,item)
BOX (item)
UNBOX (item1)
/|child::site
/|child::bidder
UNBOX (item)
UNBOX (item)
doc
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
ROW# (pos:<item>/iter)
ROW# (pos:<item>/iter)
¬ res:(item)
U
¶ (iter,item)
TBL <("iter",int),("pos",int),("item",*)>
doc
BOX (item)
BOX (item)
SEL (res)
/|child::text()
UNBOX (item)
BOX (res)
ROW# (pos:<item>/iter)
¶ (iter1:iter)
U
¶ (iter,item)
ROW# (inner:<iter,pos>)
|X| (iter=iter1)
ROW# (pos:<item>/iter)
UNBOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
¶ (outer:iter,inner)
¶ (iter,pos,item)
BOX (item)
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
/|child::open_auctions
UNBOX (item)
U
¶ (iter,pos,item)
|X| (iter=iter1)
ROW# (pos:<item>/iter)
doc
BOX (item)
ROW# (pos:<item>/iter)
|X| (iter=inner)
1,"increase"
ROW# (inner:<iter,pos>)
BOX (item)
U
¶ (iter,item)
0
ROW# (pos1:<iter,pos>/outer)
TBL <("pos",int),("item",str)>
¶ (iter:inner,item)
1
UNBOX (item)
¶ (iter:outer,pos:pos1,item)
BOX (item)
UNBOX (item)
TBL <("zero",int)>
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
TBL <("pre",node),("size",int),("level",int),("kind",int),("prop",str),("frag",int)>
Herunterladen