Decorrelation of Nested SQL Queries in the Presence of Disjunction Matthias Brantner & Norman May University Mannheim msb|[email protected] Matthias Brantner, 14. Januar 2005 Decorrelation of Nested SQL Queries Motivation Motivation & Outline Conjunctive Nested Query: ● Motivation ● Outline Preliminaries Decorrelation (N & J) SELECT FROM WHERE Decorrelation (JA) * R R.A1 IN (SELECT S.A4 FROM S WHERE R.A2 = S.A3); Evaluation SF2 SF1 1 5 10 1 0.01 0.04 0.06 5 0.04 0.06 0.09 10 0.07 0.09 0.50 Conclusions & Future Work Disjunctive Nested Query: SELECT FROM WHERE Matthias Brantner, 14. Januar 2005 * R R.A1 IN (SELECT S.A4 FROM S WHERE R.A2 = S.A3) or R.A4 > 1500 SF2 SF1 1 5 10 1 10.1 51.3 102 5 50.6 260 521 10 100 523 1044 Decorrelation of Nested SQL Queries Outline Motivation & Outline ● Motivation ■ Nesting in SQL ■ Decorrelation (Conjunctive and Disjunctive) ■ Evaluation ■ Conclusion & Future Work ● Outline Preliminaries Decorrelation (N & J) Decorrelation (JA) Evaluation Conclusions & Future Work Matthias Brantner, 14. Januar 2005 Decorrelation of Nested SQL Queries Nesting in SQL Motivation & Outline Preliminaries ● Nesting in SQL Decorrelation (N & J) Decorrelation (JA) Evaluation Conclusions & Future Work Matthias Brantner, 14. Januar 2005 ■ Kim’s Classification ◆ Uncorrelated subqueries ■ Aggregate (A), or not (N) ◆ Correlated subqueries ■ Aggregate (JA), or not (J) Type: A SELECT * FROM R WHERE R.A1 θ (SELECT f(S.A4) FROM S) Decorrelation of Nested SQL Queries Nesting in SQL Motivation & Outline Preliminaries ● Nesting in SQL Decorrelation (N & J) Decorrelation (JA) Evaluation Conclusions & Future Work Matthias Brantner, 14. Januar 2005 ■ Kim’s Classification ◆ Uncorrelated subqueries ■ Aggregate (A), or not (N) ◆ Correlated subqueries ■ Aggregate (JA), or not (J) Type: JA SELECT * FROM R WHERE R.A1 θ (SELECT f(S.A4) FROM S WHERE R.A2 = S.A3) Decorrelation of Nested SQL Queries Nesting in SQL Motivation & Outline Preliminaries ● Nesting in SQL Decorrelation (N & J) Decorrelation (JA) Evaluation Conclusions & Future Work Matthias Brantner, 14. Januar 2005 ■ Kim’s Classification ◆ Uncorrelated subqueries ■ Aggregate (A), or not (N) ◆ Correlated subqueries ■ Aggregate (JA), or not (J) Type: N SELECT * FROM R WHERE R.A1 IN (SELECT S.A4 FROM S) Decorrelation of Nested SQL Queries Nesting in SQL Motivation & Outline ■ Preliminaries ● Nesting in SQL Decorrelation (N & J) Decorrelation (JA) Evaluation Conclusions & Future Work Type: J SELECT FROM WHERE ■ Matthias Brantner, 14. Januar 2005 Kim’s Classification ◆ Uncorrelated subqueries ■ Aggregate (A), or not (N) ◆ Correlated subqueries ■ Aggregate (JA), or not (J) * R R.A1 IN (SELECT S.A4 FROM S WHERE R.A2 = S.A3) Muralikrishna: Simple, Linear, and Tree Query Decorrelation of Nested SQL Queries Decorrelation of J queries (1) Motivation & Outline ■ Simple J query (Conjunction) SELECT FROM WHERE Preliminaries Decorrelation (N & J) ● Decorrelation of J queries (1) ● Decorrelation of J queries (2) ● Linear and Tree Queries of N &J * R R.A1 IN (SELECT S.B1 FROM S WHERE R.A2 = S.B2) Decorrelation (JA) Evaluation Conclusions & Future Work ■ Evaluation Strategies Correlated: σ R ∃A1 =B1 σA2 =B2 S Matthias Brantner, 14. Januar 2005 Decorrelated: (A1=B1∧A2=B2) R ΠB1 ,B2 S Decorrelation of Nested SQL Queries Decorrelation of J queries (2) Motivation & Outline Preliminaries Decorrelation (N & J) ● Decorrelation of J queries (1) ■ Simple type J query (Disjunction) SELECT FROM WHERE ● Decorrelation of J queries (2) ● Linear and Tree Queries of N &J Decorrelation (JA) Evaluation * R R.A1 IN (SELECT S.B1 FROM S WHERE R.A2 = S.B2) or c Decorrelated: Correlated: σ c∨(A1=B1∧A2=B2) Conclusions & Future Work R or ∃A1 =B1 σA2 =B2 S Matthias Brantner, 14. Januar 2005 Bypass: c R . ∪ σ ΠB1 ,B2 σc± ∃A1 =B1 S R σA2 =B2 S Decorrelation of Nested SQL Queries Decorrelation of J queries (2) Motivation & Outline Preliminaries Decorrelation (N & J) ● Decorrelation of J queries (1) ■ Simple type J query (Disjunction) SELECT FROM WHERE ● Decorrelation of J queries (2) ● Linear and Tree Queries of N &J Decorrelation (JA) Evaluation * R R.A1 IN (SELECT S.B1 FROM S WHERE R.A2 = S.B2) or c Decorrelated: Correlated: σ or ∃A1 =B1 σA2 =B2 S Matthias Brantner, 14. Januar 2005 . c∨(A1=B1∧A2=B2) Conclusions & Future Work R Bypass: c R ∪ A1=B1∧A2=B2 ΠB1 ,B2 S σc± ΠB1 ,B2 R S Decorrelation of Nested SQL Queries Linear and Tree Queries of N & J Motivation & Outline ■ Linear Queries ◆ Analog ◆ Strategy: bottom-up or top-down ■ Tree Queries Preliminaries Decorrelation (N & J) ● Decorrelation of J queries (1) ● Decorrelation of J queries (2) ● Linear and Tree Queries of N &J Decorrelation (JA) Evaluation Conclusions & Future Work SELECT FROM WHERE A1 R A1 IN OR A3 IN ΠA1 (SELECT B1 FROM S WHERE A2 = B2 ) . ∪ A =C ∧A =C 3 (SELECT C1 FROM T WHERE A4 = C2 ) ±A =B ∧A =B 1 R 1 2 2 ΠB1 ,B2 1 4 ΠC1 ,C2 T S Matthias Brantner, 14. Januar 2005 Decorrelation of Nested SQL Queries 2 Decorrelation of JA queries (1) Motivation & Outline ■ Simple JA query (Conjunction) SELECT FROM WHERE Preliminaries Decorrelation (N & J) Decorrelation (JA) * R R.A1 θ ● Decorrelation of JA queries (1) ● Decorrelation of JA queries (2) ● JA Tree Queries ● JA Subquery contains disjunction ● Multisets and Quantifiers (SELECT f(S.B1) FROM S WHERE R.A2 = S.B2) Evaluation Strategies Correlated: ■ σ Decorrelated: σA1 θ g Evaluation Conclusions & Future Work θ R ) Af (ε=B 2 Af A1 R σ S Matthias Brantner, 14. Januar 2005 2 Γg;=B2 ; f S A2 = B2 Decorrelation of Nested SQL Queries Decorrelation of JA queries (2) Motivation & Outline Preliminaries Decorrelation (N & J) Decorrelation (JA) ■ Simple JA query (Disjunction) SELECT FROM WHERE * R R.A1 θ ● Decorrelation of JA queries (1) ● Decorrelation of JA queries (2) ● JA Tree Queries ● JA Subquery contains (SELECT f(S.B1) FROM S WHERE R.A2 = S.B2) or c Decorrelated: Correlated: σ . ∪ disjunction ● Multisets and Quantifiers R Evaluation or θ Conclusions & Future Work ΠA (R) c σA1 θ g Af A1 Af (ε=B) 2 σ S Matthias Brantner, 14. Januar 2005 A2 = B2 2 σc± Γg;=B2 ;f R S Decorrelation of Nested SQL Queries JA Tree Queries Motivation & Outline Preliminaries Decorrelation (N & J) Decorrelation (JA) ● Decorrelation of JA queries (1) ● Decorrelation of JA queries (2) ● JA Tree Queries ● JA Subquery contains disjunction ● Multisets and Quantifiers ■ ΠA1 Reorder join an bypass selection SELECT FROM WHERE . A1 R A1 θ (SELECT f(B1 ) FROM S WHERE A2 = B2 ) or A3 θ (SELECT f(C1 ) FROM T WHERE A4 = C2 ) ∪ ΠA (R) σA3 θ g2 ) Af (ε=C 4 Evaluation Conclusions & Future Work σA±1 θ g1 Γg2 ;=C2 ;f ) Af (ε=B T 2 R 2 2 Γg1 ;=B2 ;f S Matthias Brantner, 14. Januar 2005 Decorrelation of Nested SQL Queries JA Subquery contains disjunction Motivation & Outline ■ Correlation predicate occurs disjunctively σA1 =g Preliminaries Decorrelation (N & J) Decorrelation (JA) ● Decorrelation of JA queries (1) ● Decorrelation of JA queries (2) ● JA Tree Queries ● JA Subquery contains disjunction ● Multisets and Quantifiers ■ Linear query is a special case SELECT FROM WHERE * R A1 = SELECT FROM WHERE Γg;t1=t1′ ;count Πt1′ :t1 count(B1 ) S A2 = B2 or p . ∪ σp Evaluation Conclusions & Future Work ± A2 =B2 tidt1 S R Matthias Brantner, 14. Januar 2005 Decorrelation of Nested SQL Queries Multisets and Quantifiers Motivation & Outline ■ Multisets ◆ Works for N & J queries ◆ Works for JA queries using tid as key ■ EXISTS, NOT EXISTS, ANY and SOME with (in)equality as comparator ◆ reduce to N or J ■ SOME, ANY, and ALL with arbitrary comparator ◆ reduce to A or JA Preliminaries Decorrelation (N & J) Decorrelation (JA) ● Decorrelation of JA queries (1) ● Decorrelation of JA queries (2) ● JA Tree Queries ● JA Subquery contains disjunction ● Multisets and Quantifiers Evaluation Conclusions & Future Work Matthias Brantner, 14. Januar 2005 Decorrelation of Nested SQL Queries Evaluation Motivation & Outline ■ Synthetic ◆ 4 integer columns ◆ 10.000, 50.000 and 100.000 rows (180kB, 1.1MB, and 2.1MB) ■ TPC-H ◆ Scaling factors between 0.01 and 10 (11MB 11GB) Preliminaries Decorrelation (N & J) Decorrelation (JA) Evaluation ● Evaluation ● Evaluation of J queries (1) ● Evaluation of J queries (2) ● Evaluation of JA queries (1) Conclusions & Future Work Matthias Brantner, 14. Januar 2005 Decorrelation of Nested SQL Queries Evaluation of J queries (1) Motivation & Outline Preliminaries SELECT FROM WHERE Decorrelation (N & J) Decorrelation (JA) Evaluation ● Evaluation * R R.A1 IN (SELECT S.A4 FROM S WHERE R.A2 = S.A3) or R.A4 > 1500; System A ● Evaluation of J queries (1) Natix Semi-join SF2 SF2 ● Evaluation of J queries (2) ● Evaluation of JA queries (1) Conclusions & Future Work SF1 1 5 10 SF1 1 5 10 1 10.1 51.3 102 1 4.17 4.06 4.15 5 50.6 260 521 5 4.13 4.07 4.1 10 100 523 1044 10 5.14 4.08 4.07 Natix Corr. Natix Decorr. SF2 Matthias Brantner, 14. Januar 2005 SF2 SF1 1 5 10 SF1 1 5 10 1 10.9 53.1 104 1 0.24 0.2 0.21 5 45.5 228 452 5 0.21 0.2 0.21 10 86.3 431 852 10 0.21 0.2 0.29 Decorrelation of Nested SQL Queries Evaluation of J queries (2) Motivation & Outline Preliminaries Decorrelation (N & J) Decorrelation (JA) Evaluation ● Evaluation ● Evaluation of J queries (1) ● Evaluation of J queries (2) ● Evaluation of JA queries (1) SELECT FROM WHERE o orderpriority, count(*) as order count orders o orderpriority = ’1-URGENT’ OR EXISTS ( SELECT * FROM lineitem WHERE l commitdate < l receiptdate AND o orderkey = l orderkey) GROUP BY o orderpriority ORDER BY o orderpriority Factor Conclusions & Future Work Evaluator 0.01 0.05 0.1 0.5 1 2 5 10 A 84.1 3716 n/a n/a n/a n/a n/a n/a Postgres 388 11026 n/a n/a n/a n/a n/a n/a Natix Matthias Brantner, 14. Januar 2005 • canonical 79.7 3631 n/a n/a n/a n/a n/a n/a • join 17.7 470.8 2033 n/a n/a n/a n/a n/a • decorrelated 0.19 0.48 0.87 3.67 15.6 34.5 79.3 189 Decorrelation of Nested SQL Queries Evaluation of JA queries (1) Motivation & Outline Preliminaries Decorrelation (N & J) Decorrelation (JA) Evaluation SELECT FROM WHERE * R R.A1 = (SELECT count(S.A3) FROM S WHERE R.A2 = S.A4) or R.A4 > 1500; ● Evaluation System A ● Evaluation of J queries (1) SF2 ● Evaluation of J queries (2) ● Evaluation of JA queries (1) Conclusions & Future Work Natix Corr. SF2 SF1 1 5 10 SF1 1 5 10 1 10.6 55.8 111 1 10.9 54.9 109 5 49.5 259 520 5 46.8 235 474 10 98.4 515 1029 10 88.6 450 899 Natix Decorr. SF2 Matthias Brantner, 14. Januar 2005 SF1 1 5 10 1 0.2 0.24 0.3 5 0.78 0.87 0.98 10 1.6 1.65 1.74 Decorrelation of Nested SQL Queries Conclusion & Future Work Motivation & Outline ■ Conclusions ◆ Decorrelation of all kinds of nested queries ■ N, J and JA ■ Simple, Tree and Linear ■ Sets and Multisets ◆ No matter if connection or correlation predicate occurs disjunctively ◆ Up to 4 orders of magnitude faster ■ Future Work ◆ Examples for alternating types of nested queries ◆ Evaluate more TPC-H queries ◆ Proofs Preliminaries Decorrelation (N & J) Decorrelation (JA) Evaluation Conclusions & Future Work ● Conclusion & Future Work Matthias Brantner, 14. Januar 2005 Decorrelation of Nested SQL Queries