Decorrelation of Nested SQL Queries in the Presence of Disjunction

Werbung
Decorrelation of Nested SQL Queries
in the Presence of Disjunction
Matthias Brantner & Norman May
University Mannheim
msb|[email protected]
Matthias Brantner, 14. Januar 2005
Decorrelation of Nested SQL Queries
Motivation
Motivation & Outline
Conjunctive Nested Query:
● Motivation
● Outline
Preliminaries
Decorrelation (N & J)
SELECT
FROM
WHERE
Decorrelation (JA)
*
R
R.A1 IN (SELECT S.A4
FROM S
WHERE R.A2 = S.A3);
Evaluation
SF2
SF1
1
5
10
1
0.01
0.04
0.06
5
0.04
0.06
0.09
10
0.07
0.09
0.50
Conclusions & Future Work
Disjunctive Nested Query:
SELECT
FROM
WHERE
Matthias Brantner, 14. Januar 2005
*
R
R.A1 IN (SELECT S.A4
FROM S
WHERE R.A2 = S.A3)
or R.A4 > 1500
SF2
SF1
1
5
10
1
10.1
51.3
102
5
50.6
260
521
10
100
523
1044
Decorrelation of Nested SQL Queries
Outline
Motivation & Outline
● Motivation
■
Nesting in SQL
■
Decorrelation (Conjunctive and Disjunctive)
■
Evaluation
■
Conclusion & Future Work
● Outline
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
Conclusions & Future Work
Matthias Brantner, 14. Januar 2005
Decorrelation of Nested SQL Queries
Nesting in SQL
Motivation & Outline
Preliminaries
● Nesting in SQL
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
Conclusions & Future Work
Matthias Brantner, 14. Januar 2005
■
Kim’s Classification
◆ Uncorrelated subqueries
■ Aggregate (A), or not (N)
◆ Correlated subqueries
■ Aggregate (JA), or not (J)
Type: A
SELECT *
FROM
R
WHERE
R.A1 θ (SELECT f(S.A4)
FROM S)
Decorrelation of Nested SQL Queries
Nesting in SQL
Motivation & Outline
Preliminaries
● Nesting in SQL
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
Conclusions & Future Work
Matthias Brantner, 14. Januar 2005
■
Kim’s Classification
◆ Uncorrelated subqueries
■ Aggregate (A), or not (N)
◆ Correlated subqueries
■ Aggregate (JA), or not (J)
Type: JA
SELECT *
FROM
R
WHERE
R.A1 θ (SELECT f(S.A4)
FROM S
WHERE R.A2 = S.A3)
Decorrelation of Nested SQL Queries
Nesting in SQL
Motivation & Outline
Preliminaries
● Nesting in SQL
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
Conclusions & Future Work
Matthias Brantner, 14. Januar 2005
■
Kim’s Classification
◆ Uncorrelated subqueries
■ Aggregate (A), or not (N)
◆ Correlated subqueries
■ Aggregate (JA), or not (J)
Type: N
SELECT *
FROM
R
WHERE
R.A1 IN (SELECT S.A4
FROM S)
Decorrelation of Nested SQL Queries
Nesting in SQL
Motivation & Outline
■
Preliminaries
● Nesting in SQL
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
Conclusions & Future Work
Type: J
SELECT
FROM
WHERE
■
Matthias Brantner, 14. Januar 2005
Kim’s Classification
◆ Uncorrelated subqueries
■ Aggregate (A), or not (N)
◆ Correlated subqueries
■ Aggregate (JA), or not (J)
*
R
R.A1 IN (SELECT S.A4
FROM S
WHERE R.A2 = S.A3)
Muralikrishna: Simple, Linear, and Tree Query
Decorrelation of Nested SQL Queries
Decorrelation of J queries (1)
Motivation & Outline
■
Simple J query (Conjunction)
SELECT
FROM
WHERE
Preliminaries
Decorrelation (N & J)
● Decorrelation of J queries (1)
● Decorrelation of J queries (2)
● Linear and Tree Queries of N
&J
*
R
R.A1 IN (SELECT S.B1
FROM S
WHERE R.A2 = S.B2)
Decorrelation (JA)
Evaluation
Conclusions & Future Work
■
Evaluation Strategies
Correlated:
σ
R
∃A1 =B1
σA2 =B2
S
Matthias Brantner, 14. Januar 2005
Decorrelated:
(A1=B1∧A2=B2)
R
ΠB1 ,B2
S
Decorrelation of Nested SQL Queries
Decorrelation of J queries (2)
Motivation & Outline
Preliminaries
Decorrelation (N & J)
● Decorrelation of J queries (1)
■
Simple type J query (Disjunction)
SELECT
FROM
WHERE
● Decorrelation of J queries (2)
● Linear and Tree Queries of N
&J
Decorrelation (JA)
Evaluation
*
R
R.A1 IN (SELECT S.B1
FROM S
WHERE R.A2 = S.B2) or c
Decorrelated:
Correlated:
σ
c∨(A1=B1∧A2=B2)
Conclusions & Future Work
R
or
∃A1 =B1
σA2 =B2
S
Matthias Brantner, 14. Januar 2005
Bypass:
c
R
.
∪
σ
ΠB1 ,B2
σc±
∃A1 =B1
S
R
σA2 =B2
S
Decorrelation of Nested SQL Queries
Decorrelation of J queries (2)
Motivation & Outline
Preliminaries
Decorrelation (N & J)
● Decorrelation of J queries (1)
■
Simple type J query (Disjunction)
SELECT
FROM
WHERE
● Decorrelation of J queries (2)
● Linear and Tree Queries of N
&J
Decorrelation (JA)
Evaluation
*
R
R.A1 IN (SELECT S.B1
FROM S
WHERE R.A2 = S.B2) or c
Decorrelated:
Correlated:
σ
or
∃A1 =B1
σA2 =B2
S
Matthias Brantner, 14. Januar 2005
.
c∨(A1=B1∧A2=B2)
Conclusions & Future Work
R
Bypass:
c
R
∪
A1=B1∧A2=B2
ΠB1 ,B2
S
σc±
ΠB1 ,B2
R
S
Decorrelation of Nested SQL Queries
Linear and Tree Queries of N & J
Motivation & Outline
■
Linear Queries
◆ Analog
◆ Strategy: bottom-up or top-down
■
Tree Queries
Preliminaries
Decorrelation (N & J)
● Decorrelation of J queries (1)
● Decorrelation of J queries (2)
● Linear and Tree Queries of N
&J
Decorrelation (JA)
Evaluation
Conclusions & Future Work
SELECT
FROM
WHERE
A1
R
A1 IN
OR
A3 IN
ΠA1
(SELECT B1
FROM
S
WHERE
A2 = B2 )
.
∪
A =C ∧A =C
3
(SELECT C1
FROM
T
WHERE
A4 = C2 )
±A =B ∧A =B
1
R
1
2
2
ΠB1 ,B2
1
4
ΠC1 ,C2
T
S
Matthias Brantner, 14. Januar 2005
Decorrelation of Nested SQL Queries
2
Decorrelation of JA queries (1)
Motivation & Outline
■
Simple JA query (Conjunction)
SELECT
FROM
WHERE
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
*
R
R.A1 θ
● Decorrelation of JA queries
(1)
● Decorrelation of JA queries
(2)
● JA Tree Queries
● JA Subquery contains
disjunction
● Multisets and Quantifiers
(SELECT f(S.B1)
FROM S
WHERE R.A2 = S.B2)
Evaluation Strategies
Correlated:
■
σ
Decorrelated:
σA1 θ g
Evaluation
Conclusions & Future Work
θ
R
)
Af (ε=B
2
Af
A1
R
σ
S
Matthias Brantner, 14. Januar 2005
2
Γg;=B2 ; f
S
A2 = B2
Decorrelation of Nested SQL Queries
Decorrelation of JA queries (2)
Motivation & Outline
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
■
Simple JA query (Disjunction)
SELECT
FROM
WHERE
*
R
R.A1 θ
● Decorrelation of JA queries
(1)
● Decorrelation of JA queries
(2)
● JA Tree Queries
● JA Subquery contains
(SELECT f(S.B1)
FROM S
WHERE R.A2 = S.B2) or c
Decorrelated:
Correlated:
σ
.
∪
disjunction
● Multisets and Quantifiers
R
Evaluation
or
θ
Conclusions & Future Work
ΠA (R)
c
σA1 θ g
Af
A1
Af (ε=B)
2
σ
S
Matthias Brantner, 14. Januar 2005
A2 = B2
2
σc±
Γg;=B2 ;f
R
S
Decorrelation of Nested SQL Queries
JA Tree Queries
Motivation & Outline
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
● Decorrelation of JA queries
(1)
● Decorrelation of JA queries
(2)
● JA Tree Queries
● JA Subquery contains
disjunction
● Multisets and Quantifiers
■
ΠA1
Reorder join an bypass selection
SELECT
FROM
WHERE
.
A1
R
A1 θ (SELECT f(B1 )
FROM
S
WHERE
A2 = B2 )
or
A3 θ (SELECT f(C1 )
FROM T
WHERE A4 = C2 )
∪
ΠA (R)
σA3 θ g2
)
Af (ε=C
4
Evaluation
Conclusions & Future Work
σA±1 θ g1
Γg2 ;=C2 ;f
)
Af (ε=B
T
2
R
2
2
Γg1 ;=B2 ;f
S
Matthias Brantner, 14. Januar 2005
Decorrelation of Nested SQL Queries
JA Subquery contains disjunction
Motivation & Outline
■
Correlation predicate occurs disjunctively
σA1 =g
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
● Decorrelation of JA queries
(1)
● Decorrelation of JA queries
(2)
● JA Tree Queries
● JA Subquery contains
disjunction
● Multisets and Quantifiers
■
Linear query is a special case
SELECT
FROM
WHERE
*
R
A1 =
SELECT
FROM
WHERE
Γg;t1=t1′ ;count
Πt1′ :t1
count(B1 )
S
A2 = B2
or p
.
∪
σp
Evaluation
Conclusions & Future Work
±
A2 =B2
tidt1
S
R
Matthias Brantner, 14. Januar 2005
Decorrelation of Nested SQL Queries
Multisets and Quantifiers
Motivation & Outline
■
Multisets
◆ Works for N & J queries
◆ Works for JA queries using tid as key
■
EXISTS, NOT EXISTS, ANY and SOME with
(in)equality as comparator
◆ reduce to N or J
■
SOME, ANY, and ALL with arbitrary comparator
◆ reduce to A or JA
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
● Decorrelation of JA queries
(1)
● Decorrelation of JA queries
(2)
● JA Tree Queries
● JA Subquery contains
disjunction
● Multisets and Quantifiers
Evaluation
Conclusions & Future Work
Matthias Brantner, 14. Januar 2005
Decorrelation of Nested SQL Queries
Evaluation
Motivation & Outline
■
Synthetic
◆ 4 integer columns
◆ 10.000, 50.000 and 100.000 rows (180kB,
1.1MB, and 2.1MB)
■
TPC-H
◆ Scaling factors between 0.01 and 10 (11MB 11GB)
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
● Evaluation
● Evaluation of J queries (1)
● Evaluation of J queries (2)
● Evaluation of JA queries (1)
Conclusions & Future Work
Matthias Brantner, 14. Januar 2005
Decorrelation of Nested SQL Queries
Evaluation of J queries (1)
Motivation & Outline
Preliminaries
SELECT
FROM
WHERE
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
● Evaluation
*
R
R.A1 IN (SELECT S.A4
FROM S
WHERE R.A2 = S.A3)
or R.A4 > 1500;
System A
● Evaluation of J queries (1)
Natix Semi-join
SF2
SF2
● Evaluation of J queries (2)
● Evaluation of JA queries (1)
Conclusions & Future Work
SF1
1
5
10
SF1
1
5
10
1
10.1
51.3
102
1
4.17
4.06
4.15
5
50.6
260
521
5
4.13
4.07
4.1
10
100
523
1044
10
5.14
4.08
4.07
Natix Corr.
Natix Decorr.
SF2
Matthias Brantner, 14. Januar 2005
SF2
SF1
1
5
10
SF1
1
5
10
1
10.9
53.1
104
1
0.24
0.2
0.21
5
45.5
228
452
5
0.21
0.2
0.21
10
86.3
431
852
10
0.21
0.2
0.29
Decorrelation of Nested SQL Queries
Evaluation of J queries (2)
Motivation & Outline
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
● Evaluation
● Evaluation of J queries (1)
● Evaluation of J queries (2)
● Evaluation of JA queries (1)
SELECT
FROM
WHERE
o orderpriority, count(*) as order count
orders
o orderpriority = ’1-URGENT’
OR EXISTS (
SELECT *
FROM
lineitem
WHERE
l commitdate < l receiptdate
AND
o orderkey = l orderkey)
GROUP BY o orderpriority
ORDER BY o orderpriority
Factor
Conclusions & Future Work
Evaluator
0.01
0.05
0.1
0.5
1
2
5
10
A
84.1
3716
n/a
n/a
n/a
n/a
n/a
n/a
Postgres
388
11026
n/a
n/a
n/a
n/a
n/a
n/a
Natix
Matthias Brantner, 14. Januar 2005
•
canonical
79.7
3631
n/a
n/a
n/a
n/a
n/a
n/a
•
join
17.7
470.8
2033
n/a
n/a
n/a
n/a
n/a
•
decorrelated
0.19
0.48
0.87
3.67
15.6
34.5
79.3
189
Decorrelation of Nested SQL Queries
Evaluation of JA queries (1)
Motivation & Outline
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
SELECT
FROM
WHERE
*
R
R.A1 = (SELECT count(S.A3)
FROM S
WHERE R.A2 = S.A4)
or R.A4 > 1500;
● Evaluation
System A
● Evaluation of J queries (1)
SF2
● Evaluation of J queries (2)
● Evaluation of JA queries (1)
Conclusions & Future Work
Natix Corr.
SF2
SF1
1
5
10
SF1
1
5
10
1
10.6
55.8
111
1
10.9
54.9
109
5
49.5
259
520
5
46.8
235
474
10
98.4
515
1029
10
88.6
450
899
Natix Decorr.
SF2
Matthias Brantner, 14. Januar 2005
SF1
1
5
10
1
0.2
0.24
0.3
5
0.78
0.87
0.98
10
1.6
1.65
1.74
Decorrelation of Nested SQL Queries
Conclusion & Future Work
Motivation & Outline
■
Conclusions
◆ Decorrelation of all kinds of nested queries
■ N, J and JA
■ Simple, Tree and Linear
■ Sets and Multisets
◆ No matter if connection or correlation predicate
occurs disjunctively
◆ Up to 4 orders of magnitude faster
■
Future Work
◆ Examples for alternating types of nested
queries
◆ Evaluate more TPC-H queries
◆ Proofs
Preliminaries
Decorrelation (N & J)
Decorrelation (JA)
Evaluation
Conclusions & Future Work
● Conclusion & Future Work
Matthias Brantner, 14. Januar 2005
Decorrelation of Nested SQL Queries
Herunterladen