Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL...

12
1 © 2015 Teradata Presto XLDB Lightning Talk – 5/24/2016 [email protected]

Transcript of Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL...

Page 1: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

1 © 2015 Teradata

Presto XLDB Lightning Talk – 5/24/2016

[email protected]

Page 2: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

2

Page 3: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

3

What is Presto?

• Open source distributed SQL query engine

• Designed and written from the ground up for interactive analytical queries

• Scales to the sizes needed by organizations like Facebook

• Query data where it lives

• Hadoop Distribution agnostic

• Extensible

Page 4: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

4

• Horizontal scale out

• Query execution is pipelined throughout memory

• Vectorized columnar processing

• Optimized data source readers (e.g. ORC)

• Presto is written in highly tuned Java

– Efficient in-memory data structures

– Very careful coding of inner loops

– Bytecode generation

Presto = Performance

Page 5: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

5

Presto Architecture

Data stream API

Worker

Data stream API

Worker

Coordinator

Metadata

API

Parser/

analyzer Planner Scheduler

Worker

Client

Data location

API

Pluggable

Page 6: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

6

Presto Extensibility – Connectors

Parser/

analyzer Planner

Worker

Data location API

Hiv

e

Ca

ssa

nd

ra

Ka

fka

MySQ

L

Metadata API

Hiv

e

Ca

ssa

nd

ra

Ka

fka

MySQ

L

Data stream API

Hiv

e

Ca

ssa

nd

ra

Ka

fka

MySQ

L

Scheduler

Coordinator

Page 7: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

7

Presto Connectors

Teradata QueryGrid™

Targets

Entry Points

TERADATA DATABASE

ASTER ANALYTICS

PRESTO HADOOP

HIVE / HDFS

HADOOP

OTHER DATABASE

S

NOSQL DATABASE

S

TERADATA DATABASE

ASTER ANALYTICS

PRESTO HADOOP

Non-Relational DBs Multi-Genre Advanced Analytics™

Integrated Data Warehouses

3rd Party Relational DBs

Multiple Hadoop SQL Query Engines and Distributions

APACHE KAFKA

APACHE CASSANDRA

MYSQL POSTGRES PRESTO API AMAZON S3 AMAZON S3

Page 8: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

8

Presto-to-Teradata & Teradata-to-Presto

Page 9: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

9

• 100% open source contributions to Presto to increase adoption in the enterprise

• A multi-year roadmap commitment to enhancements of the open source code

• The first ever commercial support offering for Presto

• Providing ODBC / JDBC drivers to the community

• Driving Business Intelligence tool integrations

• Query Grid connectivity

Teradata & Presto

Page 10: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

10

• Github, Presto users group, IRC, Twitter, Facebook

• https://prestodb.io/community.html

• https://github.com/prestodb

• https://github.com/Teradata

Contributing!

Page 11: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

11

• Let’s make a connector for scientific data! e.g. https://root.cern.ch/doc/master/classTFile.html

Presto to Query Scientific Data?

Page 12: Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL query engine •Designed and written from the ground up for interactive analytical

12 12