Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates...

21
LARA: A Language of Linear and Relational Algebra for Polystores Dylan Hutchison advised by Bill Howe, Dan Suciu - Work in Progress -

Transcript of Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates...

Page 1: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

LARA: A Language of Linear and Relational Algebra

for PolystoresDylan Hutchison

advised by Bill Howe, Dan Suciu

- Work in Progress -

Page 2: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Polystores

Table Store

Graph Engine

Array Store

Key-Value Store

MatlabSQL Spark Streaming DataFrames

Polystores connect backend systems with frontend languages through a unifying "narrow API," using each

system where it performs best.

Page 3: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

How to choose an algebra?

Goal: Implement algorithms!

Algorithms

Data CubeMatrix Inverse

Max Flow PageRank

Page 4: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

How to choose an algebra?

Ops:

Objects:

Algorithms

Data CubeMatrix Inverse

Max Flow PageRank

Goal: Implement algorithms!

Algebras

Relations Matrices Graphs Files

BLAS/Linear Algebra

Node/Edge Updates

File AccessRelational

Algebra

Many candidate algebras…

Algebra := Objects + (closed) Operations on Objects

Page 5: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Ops:

Objects:

How to choose an algebra?

Algorithms

Data CubeMatrix Inverse

Max Flow PageRank

Goal: Implement algorithms!

Algebras

Relations Matrices Graphs Files

BLAS/Linear Algebra

Node/Edge Updates

File AccessRelational

Algebra

Many candidate algebras…

Execution Engines

PostgreSQLNeo4J, Allegro

CSV, HDF5

Algebra := Objects + (closed) Operations on Objects

Many algebras have optimized execution engines

ScaLAPACK

Page 6: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

LARA

Ops:

Objects:

How to choose an algebra?Algorithms

Data CubeMatrix Inverse

Max Flow PageRank

Algebras

Relations Matrices Graphs Files

BLAS/Linear Algebra

Node/Edge Updates

File AccessRelational

Algebra

Execution Engines

PostgreSQLNeo4J, Allegro

CSV, HDF5ScaLAPACK

Associative Tables

⋈⊗ mapf promoteV

Answer: No choice necessary. Use Lara!1. Write algorithm in any/all algebras2. Translate to/from Lara common algebra3. Use any/all execution engines

Goal: Implement algorithms!

Page 7: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Operations of Lara

• ⋈⊗ – Join: horizontally merge columns, select equal colliding keys, multiply colliding values

• ⊕ – Union: vertically merge columns, group by colliding keys, sum colliding values

• mapf – Map keys and old values to new values

• promoteV – Promote values to keys

Page 8: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Example: Ranking a SearchSuppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W

Q

word score

delicious 1

green 1

(others) 0

D

site word score

pizzanow.com pizza 6

pizzanow.com delicious 5

allrecipes.com delicious 2

allrecipes.com green 2

allrecipes.com potatoes 5

recycle.org green 2

(others) 0

W

word score

delicious 1

pizza 1

potatoes 3

green 2

(others) 0

Desired Output

site score

pizzanow.com 1*5*1 = 5

allrecipes.com 1*2*1+1*2*2 = 6

recycle.org 1*2*2 = 4

(others) 0

Page 9: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Example: Ranking a Search

Q

word score

delicious 1

green 1

(others) 0

D

site word score

pizzanow.com pizza 6

pizzanow.com delicious 5

allrecipes.com delicious 2

allrecipes.com green 2

allrecipes.com potatoes 5

recycle.org green 2

(others) 0

W

word score

delicious 1

pizza 1

potatoes 3

green 2

(others) 0

Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W

RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))

LA: diag(Q) +.* D +.* W

Desired Output

site score

pizzanow.com 1*5*1 = 5

allrecipes.com 1*2*1+1*2*2 = 6

recycle.org 1*2*2 = 4

(others) 0

Page 10: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Example: Ranking a Search

Q

word score

delicious 1

green 1

(others) 0

D

site word score

pizzanow.com pizza 6

pizzanow.com delicious 5

allrecipes.com delicious 2

allrecipes.com green 2

allrecipes.com potatoes 5

recycle.org green 2

(others) 0

W

word score

delicious 1

pizza 1

potatoes 3

green 2

(others) 0

Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W

RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))

LA: diag(Q) +.* D +.* W

Desired Output

site score

pizzanow.com 1*5*1 = 5

allrecipes.com 1*2*1+1*2*2 = 6

recycle.org 1*2*2 = 4

(others) 0

(Matlab)

Page 11: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Example: Ranking a Search

Q

word score

delicious 1

green 1

(others) 0

D

site word score

pizzanow.com pizza 6

pizzanow.com delicious 5

allrecipes.com delicious 2

allrecipes.com green 2

allrecipes.com potatoes 5

recycle.org green 2

(others) 0

W

word score

delicious 1

pizza 1

potatoes 3

green 2

(others) 0

Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W

RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))

LA: diag(Q) +.* D +.* W

Hybrid: πword(Q) ⋈ D +.* W

Desired Output

site score

pizzanow.com 1*5*1 = 5

allrecipes.com 1*2*1+1*2*2 = 6

recycle.org 1*2*2 = 4

(others) 0

(Matlab)

Page 12: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Example: Ranking a Search

Q

word score

delicious 1

green 1

(others) 0

D

site word score

pizzanow.com pizza 6

pizzanow.com delicious 5

allrecipes.com delicious 2

allrecipes.com green 2

allrecipes.com potatoes 5

recycle.org green 2

(others) 0

W

word score

delicious 1

pizza 1

potatoes 3

green 2

(others) 0

Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W

RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))

LA: diag(Q) +.* D +.* W

Hybrid: πword(Q) ⋈ D +.* W

LARA: (Q ⋈* D ⋈* W) Esite

Desired Output

site score

pizzanow.com 1*5*1 = 5

allrecipes.com 1*2*1+1*2*2 = 6

recycle.org 1*2*2 = 4

(others) 0

+

(Matlab)

Page 13: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Example: Ranking a Search

Q

word score

delicious 1

green 1

(others) 0

D

site word score

pizzanow.com pizza 6

pizzanow.com delicious 5

allrecipes.com delicious 2

allrecipes.com green 2

allrecipes.com potatoes 5

recycle.org green 2

(others) 0

W

word score

delicious 1

pizza 1

potatoes 3

green 2

(others) 0

Suppose a user enters the search term "green delicious", as in input Q. Database D scoring sites with search term relevance. Table W weighs words by importance.Goal: Compute ranks of sites in D for search query Q, weighing by W

RA: γsite, +(score)(πsite, word, (score*score') as score(πword(Q) ⋈ D ⋈ ρscorescore'(W)))

LA: diag(Q) +.* D +.* W

Hybrid: πword(Q) ⋈ D +.* W

LARA: (Q ⋈* D ⋈* W) Esite

Desired Output

site score

pizzanow.com 1*5*1 = 5

allrecipes.com 1*2*1+1*2*2 = 6

recycle.org 1*2*2 = 4

(others) 0

+

Executes on both RDBMS and BLAS, depending on cost model

Many ways to express algorithms. Lara presents an economical algebra preserving • LA's familiar math, numerical prowess• RA's flexibility, scale-out optimization

(Matlab)

Page 14: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

LARA: A Unifying Algebra

Do you have an application more easily expressed in several algebras?

Do you seek multi-system optimizations?

Let's discuss!

Page 15: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK
Page 16: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK
Page 17: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Vision for Polystore Systems

ScriptSQLSQLSQLMatlabMatlabMatlabSQLSQL…

∪ × πC

σf ρ

∖ γ

⊕⊗ f⊕.⊗ T

⋈⊗

mapf

promoteV

LA

RA

RA

RDBMS

BLAS

Optimize &

Schedule

LARA

Page 18: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

APIs of RA and LA

Relational Algebra

Object: Relation

• ∪ – Union

• × – Cartesian Product

• πC – (Extended) Projection

• σf – Select

• ρ – Rename

• ∖ – Difference

• γ – Aggregate

Linear Algebra

Object: N-D Matrix

•⊕ – Element-wise add

•⊗ – Element-wise multiply

•⊕.⊗ – Matrix multiply

• Reduce – Sum along a dimension

• Apply function to each element

• T – Transpose

• (Construction & De-construction)

Page 19: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Objects of Lara

Associative Tables. Several interpretations:

• Relational table with key columns & value columns with default values

• Total function from key-space to value-space

• Sparse tensor

Page 20: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Lara -> RA & LA

Lara RA LA

⋈⊗ ⋈, π⊗, ρ Tensor product

γ⊕, ∪ Reduce, e-wise sum

mapf πf Apply

promoteV Re-index Re-key

Page 21: Lara: A Language of Linear and Relational Algebra for ... · BLAS/Linear Algebra Node/Edge Updates File Access Relational Algebra Execution Engines PostgreSQL Neo4J, Allegro ScaLAPACK

Example derived operation: Outer Join

InnerJoinP ⋈ S

P ⟗ S

(formulas out of date)