Presto Meetup @ Facebook (2014-05-14)

Presto: Past, Present, and Future

Dain Sundstrom

By The Numbers
▪ 6 months
▪ 15 releases
▪ 30 contributors
▪ 662 commits
▪ 1406 files changed
▪ 130,305 insertions(+), 43,699 deletions(-)

New SQL Features
▪ Create table
▪ Distinct aggregations
▪ Cross joins
▪ Custom functions

Optimizations
▪ Range predicate push down
▪ Distributed aggregations
▪ Distributed window functions
▪ Distinct-limit optimization
▪ Approximate queries
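
The slide only names the distinct-limit optimization. As a rough illustration of the general idea (not Presto's implementation), the Python sketch below answers a "distinct values with a limit" query by stopping the scan as soon as enough distinct values have been seen. The names `distinct_limit` and `counting_scan` are hypothetical and exist only for this example.

```python
from typing import Hashable, Iterable, Iterator, List


def distinct_limit(rows: Iterable[Hashable], limit: int) -> List[Hashable]:
    """Return up to `limit` distinct values, stopping the scan early."""
    out: List[Hashable] = []
    if limit <= 0:
        return out
    seen = set()
    for value in rows:
        if value not in seen:
            seen.add(value)
            out.append(value)
            if len(out) == limit:
                break  # enough distinct values found: stop reading input
    return out


def counting_scan(values: Iterable[Hashable], counter: List[int]) -> Iterator[Hashable]:
    """Hypothetical table scan that records how many rows were read."""
    for value in values:
        counter[0] += 1
        yield value


rows_read = [0]
data = ["a", "b", "a", "c", "d", "e"] * 1000
result = distinct_limit(counting_scan(data, rows_read), 3)
print(result, rows_read[0])
```

In this toy run the scan reads only a handful of the 6000 input rows, which is the point of combining the distinct and the limit into one step.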

Type System
▪ Plugins can add new scalar types
▪ Extensible operators
▪ DATE, TIME, TIMESTAMP and INTERVAL
▪ Time zones with DST rules
▪ Localized parse and format
▪ HyperLogLog type

New Connectors
▪ Hadoop 1.x
▪ Hadoop 2.x
▪ CDH 5
▪ Custom S3 integration for Hadoop
▪ Cassandra
▪ TPC-H

SELECT now()

Hive 0.13 Support
▪ New file formats
  ▪ ORC
  ▪ Parquet
  ▪ DWRF
▪ Vectorized ORC (2-3x more efficient)
▪ ORC stripe skipping
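
To make "stripe skipping" concrete, here is a minimal Python sketch of the general technique: each stripe carries min/max statistics for a column, and a range predicate lets the reader skip any stripe whose range cannot contain a match. The `Stripe` class and the function names are hypothetical; this is neither the ORC file format nor Presto's reader.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Toy stand-in for a stripe: column values plus min/max statistics."""
    min_value: int
    max_value: int
    rows: List[int]


def make_stripe(rows: List[int]) -> Stripe:
    return Stripe(min(rows), max(rows), rows)


def scan_with_skipping(stripes: List[Stripe], low: int, high: int) -> Tuple[List[int], int]:
    """Return the values in [low, high] and the number of stripes actually read."""
    matched: List[int] = []
    stripes_read = 0
    for stripe in stripes:
        # The statistics prove no row can match: skip without reading the rows.
        if stripe.max_value < low or stripe.min_value > high:
            continue
        stripes_read += 1
        matched.extend(v for v in stripe.rows if low <= v <= high)
    return matched, stripes_read


stripes = [
    make_stripe([1, 5, 9]),
    make_stripe([10, 14, 19]),
    make_stripe([20, 25, 30]),
]
matched, stripes_read = scan_with_skipping(stripes, 12, 22)
print(matched, stripes_read)
```

The first stripe is never read because its maximum is below the lower bound of the predicate.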

Index Joins
▪ Targeting low cardinality joins
▪ Lazy hash build
▪ Predicate push down
▪ Aggregation push down
▪ Initial version is already checked in
▪ Currently supported in HBase and MySQL
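
The slide does not describe how index joins work internally. One plausible reading, sketched below in Python, is that the engine asks the connector's index for only the keys that appear on the probe side, instead of scanning the whole indexed table, and fills its lookup table lazily as new keys show up. All names here (`index_join`, `lookup`, `users_index`) are hypothetical, and treating the cache as the "lazy hash build" is an assumption of this sketch.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple


def index_join(
    probe_rows: Iterable[Any],
    key_of: Callable[[Any], Any],
    lookup: Callable[[Any], List[Any]],
) -> Iterator[Tuple[Any, Any]]:
    """Join probe rows against an external index, one lookup per distinct key."""
    cache: Dict[Any, List[Any]] = {}  # filled lazily, only for keys actually probed
    for row in probe_rows:
        key = key_of(row)
        if key not in cache:
            cache[key] = lookup(key)
        for match in cache[key]:
            yield row, match


# Hypothetical indexed table, standing in for a connector such as MySQL.
users_index = {
    1: [{"id": 1, "name": "ann"}],
    2: [{"id": 2, "name": "bob"}],
    3: [{"id": 3, "name": "cat"}],
}
lookups: List[int] = []


def lookup(key: int) -> List[dict]:
    lookups.append(key)  # record each index access
    return users_index.get(key, [])


orders = [(100, 1), (101, 2), (102, 1), (103, 9)]  # (order_id, user_id)
joined = [(o[0], u["name"]) for o, u in index_join(orders, lambda o: o[1], lookup)]
print(joined, lookups)
```

User 3 is never fetched because no order refers to it, and user 1 is fetched once even though two orders refer to it.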

Connectors
▪ HBase
  ▪ Requires features in Facebook HBase
  ▪ Index joins
▪ JDBC (MySQL)
  ▪ Sharding
  ▪ Index joins

Views
▪ Create/drop views
▪ View definition stored in connector
▪ Fully optimized by Presto
▪ Views stored in Presto syntax
▪ Not compatible with existing Hive views

Machine Learning
▪ Supports classification and regression
▪ Multiple algorithms (currently only SVM)
▪ Feature extraction and normalization
▪ New functions and types
▪ Possibly extend SQL grammar
▪ Highly experimental

Continuous Integration
▪ Continuous correctness testing
▪ Run queries against prod and trunk
▪ Continuous benchmark
▪ Run full test suite with every connector
▪ Faster release cycle

SQL Features
▪ Structs, Maps and Lists
▪ Table generating functions
▪ Scalar subqueries
▪ Features required to run all TPC-DS
▪ Create table with partitioning
▪ Possibly: Insert, delete, drop partition

Execution Engine
▪ Huge joins and aggregations
▪ Hash distributed
▪ Co-distributed and co-partitioned
▪ Spill to disk (flash)
▪ Work stealing
▪ Basic task recovery
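
As a rough picture of what a hash-distributed join means, the Python sketch below routes rows from both inputs to workers by hashing the join key, so rows with equal keys land on the same worker and each worker can join its share independently. The worker count, function names, and the use of CRC32 as the hash are assumptions for illustration, not details from the talk.

```python
import zlib
from collections import defaultdict
from typing import Any, Callable, List, Tuple


def partition_for(key: str, num_workers: int) -> int:
    """Pick a worker for a key using a stable hash (CRC32 is an arbitrary choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def hash_distribute(rows: List[Any], key_of: Callable[[Any], str], num_workers: int) -> List[List[Any]]:
    """Split rows into one bucket per worker based on the join key."""
    parts: List[List[Any]] = [[] for _ in range(num_workers)]
    for row in rows:
        parts[partition_for(key_of(row), num_workers)].append(row)
    return parts


def distributed_join(
    left: List[Any],
    right: List[Any],
    key_of_left: Callable[[Any], str],
    key_of_right: Callable[[Any], str],
    num_workers: int,
) -> List[Tuple[Any, Any]]:
    """Join two inputs by partitioning both on the key, then joining per worker."""
    left_parts = hash_distribute(left, key_of_left, num_workers)
    right_parts = hash_distribute(right, key_of_right, num_workers)
    out: List[Tuple[Any, Any]] = []
    for left_part, right_part in zip(left_parts, right_parts):
        # Each "worker" builds a hash table over only its share of the right side.
        table = defaultdict(list)
        for r in right_part:
            table[key_of_right(r)].append(r)
        for l in left_part:
            for r in table.get(key_of_left(l), []):
                out.append((l, r))
    return out


visits = [("us", 1), ("de", 2), ("us", 3)]
countries = [("us", "United States"), ("de", "Germany"), ("fr", "France")]
result = sorted(distributed_join(visits, countries, lambda v: v[0], lambda c: c[0], 4))
print(result)
```

Because no single worker has to hold the whole build side, this layout is what allows joins larger than one machine's memory.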

Native Store
▪ Stores data directly on worker nodes
▪ Uses custom data format
▪ Initial use cases
  ▪ Store for ‘hot’ data
  ▪ Store for ‘live’ data
  ▪ Support co-distributed data

Security
▪ Authentication
  ▪ Username/password, Kerberos, SSL cert
▪ Authorization
  ▪ Integration with plugins
  ▪ Grant permissions from SQL

New REST API
▪ Prepared statements
▪ Bound parameters
▪ Server managed sessions
▪ Explicit support for non-query (DML/DDL)
▪ Split query submission, stats, and data fetching

ODBC Driver
▪ Targeting major BI tools
  ▪ Tableau, MicroStrategy and Excel
▪ Support for Windows, Mac and Linux
▪ Will require new REST API
▪ Written in D
▪ Entirely open source (ASL2)

Plugins
▪ Plugin repository
▪ Manage plugins from CLI
▪ Function catalogs
▪ Push down joins and aggregations
▪ Custom optimizers

SELECT question
FROM audience
WHERE isAwesome(question)

(c) 2007 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved.