Haskell boards the Ferry
A Database Coprocessor for Haskell

George Giorgidze, Torsten Grust, Tom Schreiber, Jeroen Weijers
www-db.informatik.uni-tuebingen.de
IFL 2010, Utrecht University
Jeroen Weijers · Tuesday, 7 September 2010 · Universität Tübingen


Haskell turns into SQL

[Diagram: Haskell Program, Runtime Heap, Result, Database System, Input Data. In the later builds, SQL Queries take the place of the Haskell Program.]

Database-supported program execution


State of the Art

    Facility    Category
    SQL         QLA
    LINQ        LIN
    Links       LIN
    HaskellDB   LIB
    Ferry       LIB

Features compared:
• Respects list order
• Supports data nesting
• Avoids query avalanches
• Is statically type-checked
• Guarantees translation to SQL
• Has compositional syntax and semantics

Where list order is not respected: take n e ++ drop n e ≠ e
[Figure: # Queries (1 to 10,000) plotted against database size (10 to 10,000).]

Stick with Comprehensions

What features can a facility have in a category?

Plain Haskell:

    hasFeatures :: String -> [String]
    hasFeatures f = [ feat | (fac, feat) <- features, fac == f ]

    means :: String -> String
    means f = head [ mean | (feat, mean) <- meanings, feat == f ]

    query :: [(String, [String])]
    query = [ (the cat, nub $ concat $ map (map means . hasFeatures) fac)
            | (fac, cat) <- facilities, then group by cat ]

The same program as a Ferry query:

    hasFeatures :: Q String -> Q [String]
    hasFeatures f = [$qc| feat | (fac, feat) <- table features, fac == f |]

    means :: Q String -> Q String
    means f = head [$qc| mean | (feat, mean) <- table meanings, feat == f |]

    query :: IO [(String, [String])]
    query = fromQ connection
              [$qc| (the cat, nub $ concat $ map (map means . hasFeatures) fac)
                  | (fac, cat) <- table facilities, then group by cat |]
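
For readers who want to run the plain-Haskell version, here is a minimal sketch with in-memory lists standing in for the three tables. The sample rows and the short feature keys ("aval", "type", ...) are illustrative assumptions, not the data used in the talk. Current GHC releases also expect an explicit grouping function, so the sketch writes `then group by cat using groupWith` where the slide writes `then group by cat`.

```haskell
{-# LANGUAGE TransformListComp #-}
import Data.List (nub)
import GHC.Exts (groupWith, the)

-- Illustrative in-memory stand-ins for the database tables.
-- The rows and the feature keys are assumptions made for this sketch.
facilities :: [(String, String)]   -- (facility, category)
facilities = [("SQL", "QLA"), ("Ferry", "LIB")]

features :: [(String, String)]     -- (facility, feature key)
features =
  [ ("SQL", "aval"), ("SQL", "type")
  , ("Ferry", "list"), ("Ferry", "nest"), ("Ferry", "aval") ]

meanings :: [(String, String)]     -- (feature key, description)
meanings =
  [ ("list", "respects list order")
  , ("nest", "supports data nesting")
  , ("aval", "avoids query avalanches")
  , ("type", "is statically type-checked") ]

hasFeatures :: String -> [String]
hasFeatures f = [ feat | (fac, feat) <- features, fac == f ]

-- Partial, as on the slide: fails if a feature key has no meaning.
means :: String -> String
means f = head [ mean | (feat, mean) <- meanings, feat == f ]

-- After the group clause, 'fac' and 'cat' are rebound to lists,
-- one list per category.
query :: [(String, [String])]
query = [ (the cat, nub $ concat $ map (map means . hasFeatures) fac)
        | (fac, cat) <- facilities
        , then group by cat using groupWith ]

main :: IO ()
main = mapM_ print query
```

`groupWith` sorts the groups by key, so with these sample rows the "LIB" group is printed before the "QLA" group.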

Stick with Combinators

map, filter, head, tail, length, zip, sortWith, the, ...

Map from Prelude:

    map :: (a -> b) -> [a] -> [b]

Map from Ferry:

    map :: (QA a, QA b) => (Q a -> Q b) -> Q [a] -> Q [b]

• QA a, QA b: restrict to supported queryable types
• Q datatype: builds the query and represents the query

A few combinators are not supported (yet), e.g. foldr, foldl.


A Haskell View of the Relational Data Model

    CREATE TABLE "Facilities" (facility varchar(100) NOT NULL,
                               category varchar(100) NOT NULL);

    data Facility = Facility {facility :: String, category :: String}
    type Facilities = [Facility]

Supported types: Int, Bool, String, Double, [a], (), (a1,...,an), {x1::a1,..., xn::an}


Turning Haskell into SQL

Compile time: List Comprehensions → Haskell Combinators
Run time: Haskell Combinators → Table Algebra → SQL Queries → DB → Tabular result → Value (heap)
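
To make the difference between the two `map` signatures concrete, the following is a minimal sketch of a deep embedding in which a `Q`-typed `map` builds an expression tree instead of computing a list. Everything here is a hypothetical reduction for illustration: the `Exp` constructors, the helper names `int`, `list`, `add`, `qmap`, and the in-memory `eval` (which stands in for the database) are assumptions, and the `QA` constraints of the real signature are omitted.

```haskell
-- Hypothetical, much-reduced expression type (not Ferry's actual one).
data Exp
  = IntE Int
  | ListE [Exp]
  | AddE Exp Exp
  | MapE (Exp -> Exp) Exp   -- mapped function in higher-order abstract syntax

-- Phantom type: 'a' records which Haskell type the expression denotes.
newtype Q a = Q Exp

int :: Int -> Q Int
int = Q . IntE

list :: [Q a] -> Q [a]
list qs = Q (ListE [e | Q e <- qs])

add :: Q Int -> Q Int -> Q Int
add (Q a) (Q b) = Q (AddE a b)

-- Builds a MapE node; no element is touched at this point.
qmap :: (Q a -> Q b) -> Q [a] -> Q [b]
qmap f (Q xs) = Q (MapE (\e -> unQ (f (Q e))) xs)
  where unQ (Q e) = e

-- Tiny in-memory evaluator standing in for the database back end.
eval :: Exp -> Exp
eval e@(IntE _) = e
eval (ListE es) = ListE (map eval es)
eval (AddE a b) = case (eval a, eval b) of
  (IntE x, IntE y) -> IntE (x + y)
  _                -> error "ill-typed expression"
eval (MapE f xs) = case eval xs of
  ListE es -> ListE (map (eval . f) es)
  _        -> error "ill-typed expression"

runInts :: Q [Int] -> [Int]
runInts (Q e) = case eval e of
  ListE es -> [n | IntE n <- es]
  _        -> []

main :: IO ()
main = print (runInts (qmap (add (int 15)) (list (map int [1, 90, 4]))))
```

The phantom parameter is what lets the Haskell type checker reject ill-typed queries even though every query has the same runtime representation.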

Avalanche safety

[Figure: # Queries (1 to 10,000) plotted against database size (10 to 10,000), with one curve for LINQ/HaskellDB and one for Ferry.]

Program's result type determines # of queries: the two list constructors [[ ]] in [(String,[String])] yield two SQL queries.

Queries in a bundle are independent: concurrent / on-demand execution OK.


Future work

• Functions as result of subquery
• User-defined data types
• Admit (limited) recursive functions
• Even tighter integration with Haskell
• Explore the many similarities with Data Parallel Haskell (DPH)
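
The claim on the avalanche-safety slide, that the result type alone fixes the number of queries, can be sketched as a count of list constructors in a type. This is a hedged illustration: the `Ty` mirror of the queryable types and the names `bundleSize` and `naiveQueries` are assumptions made here, and the "one query per outer row" model for a non-avalanche-safe system is a simplification, not a measurement from the talk.

```haskell
-- Hypothetical mirror of the queryable result types.
data Ty
  = IntT | BoolT | StringT | DoubleT | UnitT
  | TupleT [Ty]
  | ListT Ty

-- Assumed rule: one SQL query per list type constructor.
bundleSize :: Ty -> Int
bundleSize (ListT t)   = 1 + bundleSize t
bundleSize (TupleT ts) = sum (map bundleSize ts)
bundleSize _           = 0

-- Simplified model of a query avalanche: one outer query plus one
-- inner query for every row the outer query returns.
naiveQueries :: Int -> Int
naiveQueries outerRows = 1 + outerRows

-- [(String, [String])], the result type of 'query'.
resultTy :: Ty
resultTy = ListT (TupleT [StringT, ListT StringT])

main :: IO ()
main = do
  print (bundleSize resultTy)
  print (map naiveQueries [10, 100, 1000, 10000])
```

Under these assumptions the bundle size stays at 2 however many rows the tables hold, while the naive count grows with the data.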

Haskell on a Ferry

• The database as coprocessor
• Familiar Haskell syntax and semantics
• Support for nested data and ordered lists
• Guaranteed avalanche safety

www.ferry-lang.org


Bonus Slides

The following slides were not part of the original presentation.


Similarities with DPH

Ferry: fields of a tuple live in separate, adjacent columns.

    pos  val1  val2
    1    "A"   10
    2    "B"   20
    3    "C"   30
    ⋮    ⋮     ⋮

DPH: [:(a,b):] is represented by ([:a:],[:b:]) where array length is synchronized.

    [:"A", "B", "C", ... :]
    [:10, 20, 30, ... :]


Similarities with DPH

Example value: [[1,2],[],[3],[4,5,6]]

Ferry: foreign keys are descriptors for nested lists.

    iter  pos  val          box  pos  val
    1     1    ▢            ▢    1    1
    1     2    ▥            ▢    2    2
    1     3    ▤            ▤    1    3
    1     4    ▨            ▨    1    4
                            ▨    2    5
                            ▨    3    6

DPH: nested arrays lead to offset/length descriptors.

    [: [:1,2:], [::], [:3:], [:4,5,6:] :]
    offsets: [:0, 2, 2, 3 :]
    lengths: [:2, 0, 1, 3 :]
    data:    [:1, 2, 3, 4, 5, 6 :]


Similarities with DPH

To evaluate iteration in parallel, expressions are lifted (in Ferry and in DPH alike).

Ferry: [$qc| x+15 | x <- toQ [1,90,4] |]

    iter: 1, 2, 3
    × val: 15   (the constant 15 is paired with every iter row)

DPH: [: x+15 | x <- [:1,90,4:] :]

    replicateP (lengthP [:1,90,4:]) 15


Similarities with DPH

Ferry: compilation targets machines with data-parallel primitives.

    σ  π  ⋈

DPH: compilation targets vector primitives of modern CPU architectures.

    fst^  snd^  *^  sumP  bpermuteP


Similarities with DPH

Ferry, sample relational plan:

    svMul sv v = [$qc| f*(v!i) | (i,f) <- sv |]

    svMul (toQ [(1,10),...]) (toQ [1,90,4,...])

    π pos1, val:val2*val3
      ⋈ val1 = pos2

        pos1  val1  val2        pos2  val3
        1     1     10          1     1
        2     3     30          2     90
                                3     4

DPH, sample DPH Core:

    svMul sv v = [: f*(v!i) | (i,f) <- sv :]

    svMul [:(1,10),...:] [:1,90,4,...:]

    snd^ sv *^ bpermuteP v (fst^ sv)


Embedding internals

    Feature              Implementation technique
    List comprehension   Quasi-quoting
    Type correctness     ADTs and phantom types
    Restricting types    Type classes
    Pattern matching     View patterns
    Algebraic code       Combinators
    Boilerplate code     Template Haskell


List comprehensions

    hasFeatures f = [$qc| feat | (fac, feat) <- table features, fac == f |]

The quasi-quoter turns this into:

    hasFeatures f = map (\(fac, feat) -> feat)
                  $ filter (\(fac, feat) -> fac == f) (table features)


Internal datatype

    data Exp = VarE String
             | UnitE
             | BoolE Bool
             | CharE Char
             | IntE Int
             | TupleE Exp Exp [Exp]
             | ListE [Exp]
             | FuncE (Exp -> Exp)
             | AppE Exp Exp
             | TableE String Type

    data Q a = Q Exp


Restricting types

    class QA a where
      toQ   :: a -> Q a
      fromQ :: Conn -> Q a -> IO a

    instance QA Int where
      toQ i = Q (IntE i)
      fromQ c (Q (IntE i)) = ...

    class QA a => TA a where
      table :: String -> Q [a]
      table = ...

    instance TA Int where
    instance TA Bool where


Pattern matching

Pattern matching on QA data:

    Tuples:         (\(view -> (a, b)) -> ...)
    Nested tuples:  (\(view -> (a, view -> (b, c))) -> ...)
    Records:        (\(view -> UserV {name, id}) -> ...)


Targeting SQL:99 DBMSs

    π_a    Projection
    σ_p    Selection
    ⋈      Join
    ×      Cross product
    δ      Duplicate elimination
    @_a    Constant column attachment
           Literal table construction
           Row ranking
    agg    Aggregation
    R      Database table access


Algebraic code

    Q1:  item1, order by item1
           δ
             π item1:cat
               facilities

    Q2:  nest:item1, item1:item6, order by nest, item1
           δ
             π item1, item6
               ⋈ item5 = item4
                 ⋈ item2 = item3
                   π item1:cat, item2:fac          (facilities)
                   π item3:fac, item4:feature      (features)
                 π item5:feature, item6:meaning    (meanings)

10. Optimized algebraic plan bundle Q1,2 implementing Program P.


SQL code

    SELECT DISTINCT t0000.cat AS item1
    FROM facilities AS t0000
    ORDER BY t0000.cat ASC;

    SELECT DISTINCT t0001.cat AS nest, t0000.meaning AS item1
    FROM meanings AS t0000, facilities AS t0001, features AS t0002
    WHERE t0000.feature = t0002.feature
      AND t0001.fac = t0002.fac
    ORDER BY t0001.cat ASC, t0000.meaning ASC;


Boilerplate code

QA instances for tuples are generated with Template Haskell:

    $(deriveTupleQA 3)

generates

    instance (QA a, QA b, QA c) => QA (a,b,c) where ...

QA instances are automatically derived for:
• Tuples
• Database tables
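
As a rough idea of what a generated instance for triples might contain, here is a hand-written sketch over a toy expression type. It is not the code that `deriveTupleQA` emits: the reduced `Exp`, the method names `toExp` and `fromExp`, and the pure `Maybe`-based decoding (in place of `fromQ`, which takes a connection and runs in `IO`) are all assumptions chosen so the example runs without a database.

```haskell
-- Toy expression type; the n-ary TupleE is a simplification made here.
data Exp
  = IntE Int
  | BoolE Bool
  | TupleE [Exp]
  deriving (Eq, Show)

-- Hypothetical stand-in for the QA class: encode to and decode from Exp.
class QA a where
  toExp   :: a -> Exp
  fromExp :: Exp -> Maybe a

instance QA Int where
  toExp = IntE
  fromExp (IntE n) = Just n
  fromExp _        = Nothing

instance QA Bool where
  toExp = BoolE
  fromExp (BoolE b) = Just b
  fromExp _         = Nothing

-- The kind of instance a generator could produce for arity 3:
-- each component is handled by its own QA instance.
instance (QA a, QA b, QA c) => QA (a, b, c) where
  toExp (a, b, c) = TupleE [toExp a, toExp b, toExp c]
  fromExp (TupleE [a, b, c]) = (,,) <$> fromExp a <*> fromExp b <*> fromExp c
  fromExp _                  = Nothing

main :: IO ()
main = do
  let e = toExp (1 :: Int, True, 2 :: Int)
  print e
  print (fromExp e :: Maybe (Int, Bool, Int))
```

The instance is purely structural and looks the same at every arity, which is why generating it (rather than writing it by hand for each tuple size) is attractive.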

Example results

Features facilities can have in a category:

    [ ("QLA", [ "avoids query avalanches"
              , "guarantees translation to SQL"
              , "is statically type-checked" ])
    , ("LIN", [ "guarantees translation to SQL"
              , "has compositional syntax and semantics"
              , "is statically type-checked"
              , "supports data nesting" ])
    , ("LIB", [ "Respects list order"
              , "Supports data nesting"
              , "Avoids query avalanches"
              , "is statically type-checked"
              , "guarantees translation to SQL"
              , "has compositional syntax and semantics" ])
    ]