Presto - SLAC Conferences, Workshops and … · 3 What is Presto? •Open source distributed SQL...
3
What is Presto?
• Open source distributed SQL query engine
• Designed and written from the ground up for interactive analytical queries
• Scales to the sizes needed by organizations like Facebook
• Query data where it lives
• Hadoop Distribution agnostic
• Extensible
4
Presto = Performance
• Horizontal scale out
• Query execution is pipelined throughout memory
• Vectorized columnar processing
• Optimized data source readers (e.g. ORC)
• Presto is written in highly tuned Java
– Efficient in-memory data structures
– Very careful coding of inner loops
– Bytecode generation
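The "pipelined" and "columnar" bullets can be illustrated with a small sketch. This is not Presto code (Presto is written in Java): the `Page` layout, the function names, and the sample data are all assumptions made up for the illustration. It only shows the shape of the idea, which is that data moves between operators in small column-oriented batches instead of being fully materialized, and pure Python gains none of the actual speed.

```python
from array import array
from typing import Iterator, NamedTuple


class Page(NamedTuple):
    """A batch of rows stored column by column (hypothetical layout)."""
    price: array      # one typed array per column
    quantity: array


def scan(prices, quantities, page_size: int) -> Iterator[Page]:
    """Yield the table one page at a time instead of materializing it."""
    for start in range(0, len(prices), page_size):
        end = start + page_size
        yield Page(array("d", prices[start:end]), array("q", quantities[start:end]))


def filter_pages(pages: Iterator[Page], min_quantity: int) -> Iterator[Page]:
    """Keep rows with quantity >= min_quantity, one column batch at a time."""
    for page in pages:
        keep = [i for i, q in enumerate(page.quantity) if q >= min_quantity]
        yield Page(
            array("d", (page.price[i] for i in keep)),
            array("q", (page.quantity[i] for i in keep)),
        )


def total_revenue(pages: Iterator[Page]) -> float:
    """Consume pages as they arrive and fold them into one running sum."""
    return sum(p * q for page in pages for p, q in zip(page.price, page.quantity))


# Made-up sample data; the three stages are chained as generators, so each
# page flows through scan -> filter -> aggregate before the next is read.
prices = [2.0, 3.5, 1.0, 4.0, 10.0]
quantities = [1, 4, 2, 5, 3]
revenue = total_revenue(
    filter_pages(scan(prices, quantities, page_size=2), min_quantity=3)
)
```

Because every stage is a generator, no stage holds more than one page at a time, which is the property the pipelining bullet refers to.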
5
Presto Architecture
[Diagram: a Client talks to the Coordinator, which contains the Parser/analyzer, Planner, and Scheduler and drives several Workers.]
• Pluggable APIs shown: Metadata API, Data location API, Data stream API (one per Worker)
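The coordinator/worker split in the architecture slide can be sketched as a toy model. Everything here is an assumption for illustration: the function names are hypothetical, threads stand in for worker nodes, and a sum stands in for a real query plan. The only point is the flow of control, where the coordinator cuts the input into splits, schedules them on workers, and merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def plan_splits(rows: Sequence[int], num_workers: int) -> List[Sequence[int]]:
    """Coordinator side: cut the input into roughly equal splits."""
    size = max(1, -(-len(rows) // num_workers))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def worker_partial_sum(split: Sequence[int]) -> int:
    """Worker side: compute a partial aggregate over one split."""
    return sum(split)


def run_query(rows: Sequence[int], num_workers: int = 3) -> int:
    """Schedule the splits on workers, then merge the partial results."""
    splits = plan_splits(rows, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(worker_partial_sum, splits))
    return sum(partials)


# Stand-in for something like SELECT sum(x) over 100 rows.
result = run_query(list(range(1, 101)))
```

Adding workers changes only `num_workers`, which is a rough analogue of the horizontal scale-out the slides mention.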
6
Presto Extensibility – Connectors
[Diagram: the Coordinator (Parser/analyzer, Planner, Scheduler) and a Worker, with each pluggable API backed by connectors.]
• Metadata API: Hive, Cassandra, Kafka, MySQL, …
• Data location API: Hive, Cassandra, Kafka, MySQL, …
• Data stream API: Hive, Cassandra, Kafka, MySQL, …
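To make the three connector-facing APIs concrete, here is a minimal sketch of what a connector might have to provide. It is not Presto's actual connector interface, which is a Java SPI; the `Connector` protocol, its method names, the `Split` shape, and the `runs` sample table are all hypothetical. Each method is labelled with the API from the slide that it loosely corresponds to.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Protocol, Tuple

Row = Tuple[object, ...]


@dataclass(frozen=True)
class Split:
    """A unit of work: a slice of one table (hypothetical shape)."""
    table: str
    start: int
    end: int


class Connector(Protocol):
    """Hypothetical connector contract, one method per API on the slide."""

    def list_columns(self, table: str) -> List[str]: ...  # metadata API

    def get_splits(self, table: str, split_size: int) -> List[Split]: ...  # data location API

    def read_split(self, split: Split) -> Iterator[Row]: ...  # data stream API


class InMemoryConnector:
    """Toy connector backed by Python lists instead of Hive, Kafka, etc."""

    def __init__(self, tables: Dict[str, Tuple[List[str], List[Row]]]) -> None:
        self._tables = tables  # table name -> (column names, rows)

    def list_columns(self, table: str) -> List[str]:
        return list(self._tables[table][0])

    def get_splits(self, table: str, split_size: int) -> List[Split]:
        n = len(self._tables[table][1])
        return [Split(table, i, min(i + split_size, n)) for i in range(0, n, split_size)]

    def read_split(self, split: Split) -> Iterator[Row]:
        yield from self._tables[split.table][1][split.start:split.end]


def scan_table(connector: Connector, table: str, split_size: int = 2) -> List[Row]:
    """Engine side: ask where the data is, then stream every split."""
    rows: List[Row] = []
    for split in connector.get_splits(table, split_size):
        rows.extend(connector.read_split(split))
    return rows


# Made-up sample table for the illustration.
runs = InMemoryConnector(
    {"runs": (["run_id", "energy_gev"], [(1, 13.0), (2, 13.6), (3, 7.0)])}
)
all_rows = scan_table(runs, "runs")
```

The engine-side `scan_table` never touches the storage directly, so swapping `InMemoryConnector` for another class with the same three methods would leave it unchanged; that separation is what the slide's "pluggable" label describes.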
7
Presto Connectors
[Diagram: Teradata QueryGrid™, showing entry points and targets.]
• Systems shown: Teradata Database, Aster Analytics, Presto, Hadoop, Hive / HDFS, other databases, NoSQL databases
• Target categories: Non-Relational DBs; Multi-Genre Advanced Analytics™; Integrated Data Warehouses; 3rd Party Relational DBs; Multiple Hadoop SQL Query Engines and Distributions
• Also shown: Apache Kafka, Apache Cassandra, MySQL, Postgres, Presto API, Amazon S3
8
Presto-to-Teradata & Teradata-to-Presto
9
Teradata & Presto
• 100% open source contributions to Presto to increase adoption in the enterprise
• A multi-year roadmap commitment to enhancements of the open source code
• The first ever commercial support offering for Presto
• Providing ODBC / JDBC drivers to the community
• Driving Business Intelligence tool integrations
• QueryGrid connectivity
10
Contributing!
• GitHub, Presto users group, IRC, Twitter, Facebook
• https://prestodb.io/community.html
• https://github.com/prestodb
• https://github.com/Teradata
11
Presto to Query Scientific Data?
• Let’s make a connector for scientific data! e.g. https://root.cern.ch/doc/master/classTFile.html