Slides - Universität Tübingen

Haskell boards the Ferry
A Database Coprocessor for Haskell
George Giorgidze
Torsten Grust
Tom Schreiber
Jeroen Weijers
www-db.informatik.uni-tuebingen.de
IFL 2010, Utrecht University
Jeroen Weijers
Tuesday, 7 September 2010
Haskell turns into SQL
[Diagram: Runtime Heap, Haskell Program, Result, Database System, Input Data. In the later animation steps, SQL Queries appears in place of Haskell Program.]
Database-supported program execution
State of the Art
Facility    Category
SQL         QLA
LINQ        LIN
Links       LIN
HaskellDB   LIB
Ferry       LIB
Features compared:
Respects list order
Supports data nesting
Avoids query avalanches
Is statically type-checked
Guarantees translation to SQL
Has compositional syntax and semantics
Without a guaranteed list order: take n e ++ drop n e ≠ e
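The inequality above describes what can go wrong when a facility does not respect list order. On ordinary Haskell lists the corresponding law holds. A minimal check, with a sample list chosen here purely for illustration (`splitLaw` is a hypothetical helper name):

```haskell
-- On ordered Haskell lists, splitting at n and re-appending restores the list.
splitLaw :: Eq a => Int -> [a] -> Bool
splitLaw n e = take n e ++ drop n e == e

main :: IO ()
main = print (all (\n -> splitLaw n "ferry") [-1 .. 6])
```

A row set without an order guarantee gives no such assurance, since take and drop may each see the rows in a different order.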
[Plot: #Queries against database size; both axes run in powers of ten up to 10,000]
Stick with Comprehensions
What features can a facility have in a category?

hasFeatures :: String -> [String]
hasFeatures f = [ feat | (fac, feat) <- features, fac == f ]

means :: String -> String
means f = head [ mean | (feat, mean) <- meanings, feat == f ]

query :: [(String, [String])]
query =
  [ (the cat, nub $ concat $ map (map means . hasFeatures) fac)
  | (fac, cat) <- facilities, then group by cat ]
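The list-based program above can be run as ordinary Haskell once the three lists are defined. The following is a sketch under assumptions: the sample rows are invented for illustration and are not the tables from the talk, and the grouping qualifier is written `then group by cat using groupWith`, the form current GHC versions expect.

```haskell
{-# LANGUAGE TransformListComp #-}
module Main where

import Data.List (nub)
import GHC.Exts (groupWith, the)

-- Sample data, invented for illustration (not the tables from the talk).
facilities :: [(String, String)]   -- (facility, category)
facilities = [("SQL", "QLA"), ("LINQ", "LIN"), ("Ferry", "LIB")]

features :: [(String, String)]     -- (facility, feature)
features = [("SQL", "aval"), ("LINQ", "nest"), ("Ferry", "list"), ("Ferry", "nest")]

meanings :: [(String, String)]     -- (feature, meaning)
meanings =
  [ ("aval", "avoids query avalanches")
  , ("list", "respects list order")
  , ("nest", "supports data nesting") ]

hasFeatures :: String -> [String]
hasFeatures f = [ feat | (fac, feat) <- features, fac == f ]

means :: String -> String
means f = head [ mean | (feat, mean) <- meanings, feat == f ]

-- After the group qualifier, fac and cat are rebound to lists (one per group).
query :: [(String, [String])]
query =
  [ (the cat, nub $ concat $ map (map means . hasFeatures) fac)
  | (fac, cat) <- facilities
  , then group by cat using groupWith ]

main :: IO ()
main = mapM_ print query
```

`groupWith` sorts the groups by key, so the categories come back in ascending order.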
Stick with Comprehensions

hasFeatures :: Q String -> Q [String]
hasFeatures f = [$qc| feat | (fac, feat) <- table features, fac == f |]

means :: Q String -> Q String
means f = head [$qc| mean | (feat, mean) <- table meanings, feat == f |]

query :: IO [(String, [String])]
query = fromQ connection
  [$qc| (the cat, nub $ concat $ map (map means . hasFeatures) fac)
      | (fac, cat) <- table facilities, then group by cat |]
Stick with Combinators
map, filter, head, tail, length, zip, sortWith, the, ...
map from the Prelude:
map :: (a -> b) -> [a] -> [b]
map from Ferry:
map :: (QA a, QA b) => (Q a -> Q b) -> Q [a] -> Q [b]
The QA constraints restrict a and b to supported, queryable types.
The Q datatype builds and represents the query.
A few combinators are not supported (yet), e.g. foldr and foldl.
A Haskell View of the Relational Data Model
CREATE TABLE "Facilities" (facility varchar(100) NOT NULL,
                           category varchar(100) NOT NULL);
data Facility = Facility { facility :: String, category :: String }
type Facilities = [Facility]
Supported types: Int, Bool, String, Double, [a], (), (a1, ..., an), {x1 :: a1, ..., xn :: an}
Turning Haskell into SQL
[Pipeline diagram. Compile time: List Comprehensions → Haskell Combinators. Run time: Combinators → Table Algebra → SQL Queries → DB → Tabular result → Value on the Heap.]
Avalanche safety
[Plot: # Queries against database size for LINQ/HaskellDB and Ferry; both axes run in powers of ten up to 10,000]
Program's result type determines # of queries
[(String,[String])]
[[ ]]: one SQL query for each list type constructor
Queries in a bundle are independent: concurrent / on-demand execution OK
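The rule suggested by the slide, one SQL query per list type constructor in the result type, can be sketched as a small function over a toy model of types. The `Ty` type and `queryCount` are hypothetical names introduced only for this illustration, and `String` is treated as an atomic type, as it is on the slides.

```haskell
-- A tiny model of result types (hypothetical, for illustration only).
data Ty = TInt | TString | TList Ty | TTuple [Ty]

-- One query per list type constructor occurring in the result type.
queryCount :: Ty -> Int
queryCount TInt        = 0
queryCount TString     = 0
queryCount (TList t)   = 1 + queryCount t
queryCount (TTuple ts) = sum (map queryCount ts)

main :: IO ()
main = do
  -- [(String, [String])]
  print (queryCount (TList (TTuple [TString, TList TString])))
  -- [[Int]]
  print (queryCount (TList (TList TInt)))
```

Under this model the count depends only on the type, never on how many rows the database holds, which is the point of avalanche safety.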
Future work
• Functions as result of subquery
• User-defined data types
• Admit (limited) recursive functions
• Even tighter integration with Haskell
• Explore the many similarities with Data Parallel Haskell (DPH)
Haskell on a Ferry
• The database as coprocessor
• Familiar Haskell syntax and semantics
• Support nested data and ordered lists
• Guaranteed avalanche safety
Haskell turns into SQL
www.ferry-lang.org
Bonus Slides
The following slides were not part of the original presentation.
Similarities with DPH
Ferry: Fields of a tuple live in separate, adjacent columns
pos  val1  val2
1    "A"   10
2    "B"   20
3    "C"   30
⋮    ⋮
DPH: [:(a,b):] is represented by ([:a:],[:b:]) where array length is synchronized
[:"A","B","C",...:]   [:10,20,30,...:]
Similarities with DPH
Example: [[1,2],[],[3],[4,5,6]]
Ferry: Foreign keys are descriptors for nested lists
iter  pos  val
1     1    ▢
1     2    ▥
1     3    ▤
1     4    ▨
box  pos  val
▢    1    1
▢    2    2
▤    1    3
▨    1    4
▨    2    5
▨    3    6
DPH: Nested arrays lead to offset/length descriptors
[: [:1,2:],[::],[:3:],[:4,5,6:] :]
offsets: [:0,2,2,3:]
lengths: [:2,0,1,3:]
data:    [:1,2,3,4,5,6:]
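The offset/length representation on the DPH side can be reproduced with plain Haskell lists. This is a sketch only: `flatten` and `unflatten` are hypothetical helper names, and ordinary lists stand in for DPH's parallel arrays.

```haskell
-- Flatten a nested list into (offsets, lengths, flat data).
flatten :: [[a]] -> ([Int], [Int], [a])
flatten xss = (offsets, lens, concat xss)
  where
    lens    = map length xss
    offsets = init (scanl (+) 0 lens)  -- running sum of the preceding lengths

-- Rebuild the nested list from the descriptors.
unflatten :: ([Int], [Int], [a]) -> [[a]]
unflatten (offsets, lens, xs) =
  [ take l (drop o xs) | (o, l) <- zip offsets lens ]

main :: IO ()
main = do
  let nested = [[1, 2], [], [3], [4, 5, 6]] :: [[Int]]
  print (flatten nested)
  print (unflatten (flatten nested) == nested)
```

In the Ferry tables the box column plays a comparable role: it links each inner row back to the outer row that owns it.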
Similarities with DPH
Ferry: To evaluate iteration in parallel, expressions are lifted
[$qc| x+15 | x <- toQ [1,90,4] |]
The constant 15 is lifted by a cross product: iter (1, 2, 3) × val (15)
DPH: To evaluate iteration in parallel, expressions are lifted
[: x+15 | x <- [:1,90,4:] :]
The constant 15 is lifted to replicateP (lengthP [:1,90,4:]) 15
Similarities with DPH
Ferry: Compilation targets machines with data-parallel primitives: σ π ⋈
DPH: Compilation targets vector primitives of modern CPU architectures: fst^ snd^ *^ sumP bpermuteP
Similarities with DPH
Ferry: Sample relational plan
svMul sv v = [$qc| f*(v!i) | (i,f) <- sv |]
svMul (toQ [(1,10),...]) (toQ [1,90,4,...])
Plan: π pos1,val:val2*val3 over the join ⋈ val1 = pos2 of the two tables
pos1  val1  val2
1     1     10
2     3     30
pos2  val3
1     1
2     90
3     4
DPH: Sample DPH Core
svMul sv v = [: f*(v!i) | (i,f) <- sv :]
svMul [:(1,10),...:] [:1,90,4,...:]
Core: snd^ sv *^ bpermuteP v (fst^ sv)
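The sparse-vector multiplication above can be tried with plain lists, both in comprehension form and in a lifted form that mirrors the shape of the DPH Core expression. This is a sketch under assumptions: `bpermute` is a list-based stand-in written here for DPH's bpermuteP, indexing is 0-based via `!!` (the relational plan joins on 1-based positions instead), and the fourth element of the sample vector is made up so that index 3 is in range.

```haskell
-- List-based stand-in for DPH's bpermuteP: pick the elements of v at the given indices.
bpermute :: [a] -> [Int] -> [a]
bpermute v is = map (v !!) is

-- Comprehension form (0-based indexing with !!).
svMul :: [(Int, Int)] -> [Int] -> [Int]
svMul sv v = [ f * (v !! i) | (i, f) <- sv ]

-- Lifted form: operate on whole columns instead of iterating element by element.
svMulLifted :: [(Int, Int)] -> [Int] -> [Int]
svMulLifted sv v = zipWith (*) (map snd sv) (bpermute v (map fst sv))

main :: IO ()
main = do
  let sv = [(1, 10), (3, 30)]
      v  = [1, 90, 4, 7]
  print (svMul sv v)
  print (svMulLifted sv v == svMul sv v)
```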
Embedding internals
Feature              Implementation technique
List Comprehension   Quasi Quoting
Type Correctness     ADTs and Phantom Types
Restricting types    Type classes
Pattern matching     View patterns
Algebraic code       Combinators
Boilerplate code     Template Haskell
List comprehensions
hasFeatures f = [$qc| feat | (fac, feat) <- table features, fac == f |]
The quasi-quoter turns this into:
hasFeatures f = map (\(fac, feat) -> feat) $ filter (\(fac, feat) -> fac == f) (table features)
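The same rewriting can be checked on ordinary lists, where a comprehension with a filter and its map/filter form agree. The sample rows below are invented for illustration, and `hasFeatures'` is a name introduced here for the combinator version.

```haskell
-- Sample data invented for illustration.
features :: [(String, String)]
features = [("Ferry", "list"), ("SQL", "aval"), ("Ferry", "nest")]

-- Comprehension form.
hasFeatures :: String -> [String]
hasFeatures f = [ feat | (fac, feat) <- features, fac == f ]

-- Combinator form, mirroring the desugared code on the slide.
hasFeatures' :: String -> [String]
hasFeatures' f = map (\(_, feat) -> feat) $ filter (\(fac, _) -> fac == f) features

main :: IO ()
main = print (hasFeatures "Ferry", hasFeatures' "Ferry")
```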
Internal datatype
data Exp = VarE String
         | UnitE
         | BoolE Bool
         | CharE Char
         | IntE Int
         | TupleE Exp Exp [Exp]
         | ListE [Exp]
         | FuncE (Exp -> Exp)
         | AppE Exp Exp
         | TableE String Type
data Q a = Q Exp
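How a phantom type parameter keeps an untyped expression tree type-safe can be sketched in a few lines. This is a simplified illustration, not the library's code: the `Exp` type is cut down so that `Show` and `Eq` can be derived, and `PrimE`, `unQ`, `lengthQ`, and `addQ` are hypothetical names.

```haskell
-- Simplified, untyped expression tree (a reduced variant of the slide's Exp).
data Exp
  = IntE Int
  | BoolE Bool
  | ListE [Exp]
  | PrimE String [Exp]   -- hypothetical: a named primitive applied to arguments
  deriving (Show, Eq)

-- Phantom type: 'a' records the Haskell type the query stands for,
-- although no value of type 'a' is stored.
newtype Q a = Q Exp deriving (Show, Eq)

unQ :: Q a -> Exp
unQ (Q e) = e

class QA a where
  toQ :: a -> Q a

instance QA Int where
  toQ = Q . IntE

instance QA Bool where
  toQ = Q . BoolE

instance QA a => QA [a] where
  toQ xs = Q (ListE (map (unQ . toQ) xs))

-- Typed combinators that only build expression trees.
lengthQ :: QA a => Q [a] -> Q Int
lengthQ (Q e) = Q (PrimE "length" [e])

addQ :: Q Int -> Q Int -> Q Int
addQ (Q a) (Q b) = Q (PrimE "add" [a, b])

main :: IO ()
main = print (addQ (lengthQ (toQ [1, 2, 3 :: Int])) (toQ 4))
```

An ill-typed query such as `addQ (toQ True) (toQ 4)` is rejected by the Haskell type checker, even though both arguments are plain `Exp` values underneath.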
Restricting types
class QA a where
  toQ   :: a -> Q a
  fromQ :: Conn -> Q a -> IO a

instance QA Int where
  toQ i = Q (IntE i)
  fromQ c (Q (IntE i)) = ...

class QA a => TA a where
  table :: String -> Q [a]
  table = ...

instance TA Int where
instance TA Bool where
Pattern matching
Pattern matching on QA data:
Tuples: (\(view -> (a, b)) -> ...)
Nested tuples: (\(view -> (a, view -> (b, c))) -> ...)
Records: (\(view -> UserV {name, id}) -> ...)
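A view function lets a pattern take apart a value that is really an expression tree. The sketch below assumes a toy expression type; `view`, `pair`, `intQ`, and `swapQ` are written here for illustration and do not claim to match the library's definitions.

```haskell
{-# LANGUAGE ViewPatterns #-}
module Main where

-- Toy expression tree with pair construction and projections (illustrative only).
data Exp = IntE Int | TupleE Exp Exp | FstE Exp | SndE Exp
  deriving (Show, Eq)

newtype Q a = Q Exp deriving (Show, Eq)

intQ :: Int -> Q Int
intQ = Q . IntE

pair :: Q a -> Q b -> Q (a, b)
pair (Q a) (Q b) = Q (TupleE a b)

-- Turn a query for a pair into a pair of queries for its components.
view :: Q (a, b) -> (Q a, Q b)
view (Q e) = (Q (FstE e), Q (SndE e))

-- The view pattern applies 'view' and then matches on the resulting Haskell tuple.
swapQ :: Q (a, b) -> Q (b, a)
swapQ (view -> (a, b)) = pair b a

main :: IO ()
main = print (swapQ (pair (intQ 1) (intQ 2)))
```

The pattern never inspects database values; it only wraps the argument in projection nodes, so the function still builds a query.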
Targeting SQL:99 DBMSs
π_a  Projection
σ_p  Selection
⋈  Join
×  Cross Product
δ  Duplicate Elimination
@_a  Constant Column Attachment
Literal Table Construction (columns a, b; row c1, c2)
g operator �
Row Ranking
agg  Aggregation
R  Database Table Access
Algebraic code
Q1 (read bottom-up):
  item1, order by item1
  δ
  π item1:cat
  facilities
Q2 (read bottom-up):
  nest:item1, item1:item6, order by nest, item1
  δ
  π item1, item6
  ⋈ item5 = item4, joining in π item5:feature, item6:meaning over meanings
  ⋈ item2 = item3, joining π item1:cat, item2:fac over facilities with π item3:fac, item4:feature over features
Optimized algebraic plan bundle Q1,2 implementing Program P.
SQL code
SELECT DISTINCT t0000.cat AS item1
FROM facilities AS t0000
ORDER BY t0000.cat ASC;

SELECT DISTINCT t0001.cat AS nest, t0000.meaning AS item1
FROM meanings AS t0000,
     facilities AS t0001,
     features AS t0002
WHERE t0000.feature = t0002.feature
  AND t0001.fac = t0002.fac
ORDER BY t0001.cat ASC, t0000.meaning ASC;
Boilerplate code
QA instances for tuples are generated with Template Haskell:
$(deriveTupleQA 3)
expands to
instance (QA a, QA b, QA c) => QA (a, b, c) where
  ...
QA instances are automatically derived for:
• Tuples
• Database tables
Example results
Features facilities can have in a category:
[ ("QLA", [ "avoids query avalanches"
          , "guarantees translation to SQL"
          , "is statically type-checked" ])
, ("LIN", [ "guarantees translation to SQL"
          , "has compositional syntax and semantics"
          , "is statically type-checked"
          , "supports data nesting" ])
, ("LIB", [ "Respects list order"
          , "Supports data nesting"
          , "Avoids query avalanches"
          , "is statically type-checked"
          , "guarantees translation to SQL"
          , "has compositional syntax and semantics" ])
]