LARA: A Language of Linear and Relational Algebra
for PolystoresDylan Hutchison
advised by Bill Howe, Dan Suciu
- Work in Progress -
Polystores
Table Store
Graph Engine
Array Store
Key-Value Store
MatlabSQL Spark Streaming DataFrames
Polystores connect backend systems with frontend languages through a unifying "narrow API," using each
system where it performs best.
How to choose an algebra?
Goal: Implement algorithms!
Algorithms
Data CubeMatrix Inverse
Max Flow PageRank
How to choose an algebra?
Ops:
Objects:
Algorithms
Data CubeMatrix Inverse
Max Flow PageRank
Goal: Implement algorithms!
Algebras
Relations Matrices Graphs Files
BLAS/Linear Algebra
Node/Edge Updates
File AccessRelational
Algebra
Many candidate algebras…
Algebra := Objects + (closed) Operations on Objects
Ops:
Objects:
How to choose an algebra?
Algorithms
Data CubeMatrix Inverse
Max Flow PageRank
Goal: Implement algorithms!
Algebras
Relations Matrices Graphs Files
BLAS/Linear Algebra
Node/Edge Updates
File AccessRelational
Algebra
Many candidate algebras…
Execution Engines
PostgreSQLNeo4J, Allegro
CSV, HDF5
Algebra := Objects + (closed) Operations on Objects
Many algebras have optimized execution engines
ScaLAPACK
LARA
Ops:
Objects:
How to choose an algebra?Algorithms
Data CubeMatrix Inverse
Max Flow PageRank
Algebras
Relations Matrices Graphs Files
BLAS/Linear Algebra
Node/Edge Updates
File AccessRelational
Algebra
Execution Engines
PostgreSQLNeo4J, Allegro
CSV, HDF5ScaLAPACK
Associative Tables
⋈⊗ mapf promoteV
⋈
⊕
Answer: No choice necessary. Use Lara!1. Write algorithm in any/all algebras2. Translate to/from Lara common algebra3. Use any/all execution engines
Goal: Implement algorithms!
Operations of Lara
• ⋈⊗ – Join: horizontally merge columns, select equal colliding keys, multiply colliding values
• ⊕ – Union: vertically merge columns, group by colliding keys, sum colliding values
• mapf – Map keys and old values to new values
• promoteV – Promote values to keys
⋈
Example: Ranking a SearchSuppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W
Q
word score
delicious 1
green 1
(others) 0
D
site word score
pizzanow.com pizza 6
pizzanow.com delicious 5
allrecipes.com delicious 2
allrecipes.com green 2
allrecipes.com potatoes 5
recycle.org green 2
(others) 0
W
word score
delicious 1
pizza 1
potatoes 3
green 2
(others) 0
Desired Output
site score
pizzanow.com 1*5*1 = 5
allrecipes.com 1*2*1+1*2*2 = 6
recycle.org 1*2*2 = 4
(others) 0
Example: Ranking a Search
Q
word score
delicious 1
green 1
(others) 0
D
site word score
pizzanow.com pizza 6
pizzanow.com delicious 5
allrecipes.com delicious 2
allrecipes.com green 2
allrecipes.com potatoes 5
recycle.org green 2
(others) 0
W
word score
delicious 1
pizza 1
potatoes 3
green 2
(others) 0
Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W
RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))
LA: diag(Q) +.* D +.* W
Desired Output
site score
pizzanow.com 1*5*1 = 5
allrecipes.com 1*2*1+1*2*2 = 6
recycle.org 1*2*2 = 4
(others) 0
Example: Ranking a Search
Q
word score
delicious 1
green 1
(others) 0
D
site word score
pizzanow.com pizza 6
pizzanow.com delicious 5
allrecipes.com delicious 2
allrecipes.com green 2
allrecipes.com potatoes 5
recycle.org green 2
(others) 0
W
word score
delicious 1
pizza 1
potatoes 3
green 2
(others) 0
Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W
RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))
LA: diag(Q) +.* D +.* W
Desired Output
site score
pizzanow.com 1*5*1 = 5
allrecipes.com 1*2*1+1*2*2 = 6
recycle.org 1*2*2 = 4
(others) 0
(Matlab)
Example: Ranking a Search
Q
word score
delicious 1
green 1
(others) 0
D
site word score
pizzanow.com pizza 6
pizzanow.com delicious 5
allrecipes.com delicious 2
allrecipes.com green 2
allrecipes.com potatoes 5
recycle.org green 2
(others) 0
W
word score
delicious 1
pizza 1
potatoes 3
green 2
(others) 0
Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W
RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))
LA: diag(Q) +.* D +.* W
Hybrid: πword(Q) ⋈ D +.* W
Desired Output
site score
pizzanow.com 1*5*1 = 5
allrecipes.com 1*2*1+1*2*2 = 6
recycle.org 1*2*2 = 4
(others) 0
(Matlab)
Example: Ranking a Search
Q
word score
delicious 1
green 1
(others) 0
D
site word score
pizzanow.com pizza 6
pizzanow.com delicious 5
allrecipes.com delicious 2
allrecipes.com green 2
allrecipes.com potatoes 5
recycle.org green 2
(others) 0
W
word score
delicious 1
pizza 1
potatoes 3
green 2
(others) 0
Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W
RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))
LA: diag(Q) +.* D +.* W
Hybrid: πword(Q) ⋈ D +.* W
LARA: (Q ⋈* D ⋈* W) Esite
Desired Output
site score
pizzanow.com 1*5*1 = 5
allrecipes.com 1*2*1+1*2*2 = 6
recycle.org 1*2*2 = 4
(others) 0
⋈
+
(Matlab)
Example: Ranking a Search
Q
word score
delicious 1
green 1
(others) 0
D
site word score
pizzanow.com pizza 6
pizzanow.com delicious 5
allrecipes.com delicious 2
allrecipes.com green 2
allrecipes.com potatoes 5
recycle.org green 2
(others) 0
W
word score
delicious 1
pizza 1
potatoes 3
green 2
(others) 0
Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W
RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))
LA: diag(Q) +.* D +.* W
Hybrid: πword(Q) ⋈ D +.* W
LARA: (Q ⋈* D ⋈* W) Esite
Desired Output
site score
pizzanow.com 1*5*1 = 5
allrecipes.com 1*2*1+1*2*2 = 6
recycle.org 1*2*2 = 4
(others) 0
⋈
+
Executes on both RDBMS and BLAS, depending on cost model
Many ways to express algorithms. Lara presents an economical algebra preserving • LA's familiar math, numerical prowess• RA's flexibility, scale-out optimization
(Matlab)
LARA: A Unifying Algebra
Do you have an application more easily expressed in several algebras?
Do you seek multi-system optimizations?
Let's discuss!
☺
Vision for Polystore Systems
ScriptSQLSQLSQLMatlabMatlabMatlabSQLSQL…
∪ × πC
σf ρ
∖ γ
⊕⊗ f⊕.⊗ T
⋈⊗
mapf
promoteV
LA
RA
RA
RDBMS
BLAS
Optimize &
Schedule
⋈
⊕
LARA
APIs of RA and LA
Relational Algebra
Object: Relation
• ∪ – Union
• × – Cartesian Product
• πC – (Extended) Projection
• σf – Select
• ρ – Rename
• ∖ – Difference
• γ – Aggregate
Linear Algebra
Object: N-D Matrix
•⊕ – Element-wise add
•⊗ – Element-wise multiply
•⊕.⊗ – Matrix multiply
• Reduce – Sum along a dimension
• Apply function to each element
• T – Transpose
• (Construction & De-construction)
Objects of Lara
Associative Tables. Several interpretations:
• Relational table with key columns & value columns with default values
• Total function from key-space to value-space
• Sparse tensor
Lara -> RA & LA
Lara RA LA
⋈⊗ ⋈, π⊗, ρ Tensor product
γ⊕, ∪ Reduce, e-wise sum
mapf πf Apply
promoteV Re-index Re-key
⋈
⊕
Example derived operation: Outer Join
InnerJoinP ⋈ S
P ⟗ S
(formulas out of date)
Top Related