Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Enis Soztutar ([email protected])
Ankit Singhal ([email protected])
About Me
Enis Soztutar
Committer and PMC member in Apache HBase, Phoenix, and Hadoop
HBase/Phoenix team @Hortonworks
Twitter @enissoz
Disclaimer: Not a SQL expert!
Outline
PART I – The Past (a.k.a. all the existing stuff)
• Phoenix: the basics
• Architecture
• Overview of existing Phoenix features
PART II – The Present (a.k.a. all the recent stuff)
• A look at recent releases
• Transactions
• Phoenix Query Server
• Other features
PART III – The Future (a.k.a. all the upcoming stuff)
• Calcite integration
• Phoenix – Hive
Part I – The Past
All the existing stuff!
Obligatory Slide - Who uses Phoenix
Phoenix – The Basics
• Hopefully everybody is familiar with HBase
  • Otherwise you are in the wrong talk!
• What is wrong with pure HBase?
  • HBase is a powerful, flexible and extensible “engine”
  • Too low level
  • You have to write Java code to do anything!
• Phoenix is a relational layer over HBase
  • Also described as a SQL skin
  • Looking more and more like a generic SQL engine
• Why not Hive / Spark SQL / other SQL-over-Hadoop?
  • OLTP versus OLAP
  • As fast as HBase: 1 ms queries, 10K-1M qps
Why SQL?
From CDK Global slides: https://phoenix.apache.org/presentations/StrataHadoopWorld.pdf
HBase Architecture
[Diagram: an application with an HBase client talks to RegionServers 1 to 3, each co-located with a DataNode and serving regions of tables foo and bar (for example T:foo, region:a and T:bar, region:54); a ZooKeeper quorum sits alongside.]
Phoenix Architecture
[Diagram: the same cluster with Phoenix added. The application uses the Phoenix client / JDBC driver on top of the HBase client, each RegionServer hosts a Phoenix RPC endpoint (boxes labeled px), and the SYSTEM.CATALOG table is served by RegionServer 3.]
Phoenix Goodies
SQL data types
Schemas / DDL / HBase table properties
Composite types (composite primary key)
Map existing HBase tables
Write from HBase, read from Phoenix
Salting
Parallel scan
Skip scan
Filter push down
Statistics collection / guideposts
DDL Example
CREATE TABLE IF NOT EXISTS METRIC_RECORD (
  METRIC_NAME VARCHAR,
  HOSTNAME VARCHAR,
  SERVER_TIME UNSIGNED_LONG NOT NULL,
  METRIC_VALUE DOUBLE,
  …
  CONSTRAINT pk PRIMARY KEY (METRIC_NAME, HOSTNAME, SERVER_TIME))
  DATA_BLOCK_ENCODING='FAST_DIFF', TTL=604800, COMPRESSION='SNAPPY'
  SPLIT ON ('a', 'k', 'm');
HBASE ROW KEY: METRIC_NAME, HOSTNAME, SERVER_TIME (rows are stored in this sort order). OTHER COLUMNS: METRIC_VALUE.
METRIC_NAME                    HOSTNAME               SERVER_TIME  METRIC_VALUE
Regionserver.readRequestCount  cn011.hortonworks.com  1396743589   92045759
Regionserver.readRequestCount  cn011.hortonworks.com  1396767589   93051916
Regionserver.readRequestCount  cn011.hortonworks.com  ….           …
Regionserver.readRequestCount  cn012.hortonworks.com  1396743589
…..                            …                      …            …
Regionserver.wal.bytesWritten  cn011.hortonworks.com
Regionserver.wal.bytesWritten  ….                     ….           …
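The sort order of the composite row key can be imitated with plain tuples. This is a minimal sketch and not Phoenix code: it assumes the row key behaves like the tuple (METRIC_NAME, HOSTNAME, SERVER_TIME) compared column by column, whereas the real row key is a byte encoding. The sample rows, the shortened host names, and the helper `prefix_scan` are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical sample rows: (METRIC_NAME, HOSTNAME, SERVER_TIME) -> METRIC_VALUE
rows = {
    ("Regionserver.wal.bytesWritten", "cn011", 1396743589): 10.0,
    ("Regionserver.readRequestCount", "cn012", 1396743589): 7.0,
    ("Regionserver.readRequestCount", "cn011", 1396767589): 93051916.0,
    ("Regionserver.readRequestCount", "cn011", 1396743589): 92045759.0,
}

# Rows are kept sorted by the composite key, compared column by column.
sorted_keys = sorted(rows)


def prefix_scan(metric_name, hostname):
    """Return the keys for one (metric, host) pair by seeking to the prefix."""
    start = bisect_left(sorted_keys, (metric_name, hostname))
    result = []
    for key in sorted_keys[start:]:
        if key[:2] != (metric_name, hostname):
            break  # left the contiguous key range, stop scanning
        result.append(key)
    return result
```

Because all rows for one metric and host sit next to each other, a query on the leading key columns only touches one contiguous slice of the table.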
Parallel Scan
SELECT * FROM METRIC_RECORD;
CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD
[Diagram: the client issues four scans in parallel, one per region (Region1 to Region4), across RegionServers RS 1 to RS 3.]
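A parallel full scan can be pictured as one scan task per chunk, with the client stitching the results back together. The sketch below is only an illustration under simple assumptions: one chunk per region, regions modeled as in-memory lists, and `scan_region` standing in for a scan RPC. None of these names come from Phoenix.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical regions: each holds a sorted, disjoint slice of the key space.
regions = [
    [("a", 1), ("c", 2)],
    [("k", 3), ("l", 4)],
    [("m", 5)],
    [("t", 6), ("z", 7)],
]


def scan_region(region):
    """Stand-in for one scan RPC: return every row of one region."""
    return list(region)


def parallel_full_scan(all_regions):
    """Scan all regions concurrently and concatenate results in region order."""
    with ThreadPoolExecutor(max_workers=len(all_regions)) as pool:
        chunks = pool.map(scan_region, all_regions)  # map preserves input order
        return [row for chunk in chunks for row in chunk]
```

Since the regions cover disjoint, ordered key ranges, concatenating the chunks in region order gives rows back in row key order without any client-side sort.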
Filter push down
SELECT * FROM METRIC_RECORD WHERE SERVER_TIME > NOW() - 7;
CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD
    SERVER FILTER BY SERVER_TIME > DATE '2016-04-06 09:09:05.978'
[Diagram: the client issues four parallel scans over Region1 to Region4 on RS 1 to RS 3; the filter is applied server-side.]
Skip Scan
SELECT * FROM METRIC_RECORD WHERE METRIC_NAME LIKE 'abc%' AND HOSTNAME IN ('host1', 'host2');
CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 RANGES OVER METRIC_RECORD ['abc','host1'] - ['abd','host2']
[Diagram: the client runs a skip scan over Region1 to Region4 on RS 1 to RS 3.]
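The idea behind a skip scan is to seek straight to the next key that can match instead of reading every row in between. The following is a rough sketch of that idea, not the Phoenix implementation: keys are modeled as sorted (metric, host, time) tuples, and the function name `skip_scan` and its parameters are assumptions made for this example.

```python
from bisect import bisect_left


def skip_scan(sorted_keys, prefix, hosts):
    """Return keys whose metric starts with `prefix` and whose host is in `hosts`.

    Instead of reading every row, the scan seeks to each (metric, host)
    position and jumps over everything in between.
    """
    hosts = sorted(hosts)
    result = []
    pos = bisect_left(sorted_keys, (prefix,))  # first key at or after the prefix
    while pos < len(sorted_keys) and sorted_keys[pos][0].startswith(prefix):
        metric = sorted_keys[pos][0]
        for host in hosts:
            # Seek to the first row for this exact (metric, host) pair.
            lo = bisect_left(sorted_keys, (metric, host), pos)
            hi = lo
            while hi < len(sorted_keys) and sorted_keys[hi][:2] == (metric, host):
                hi += 1
            result.extend(sorted_keys[lo:hi])
            pos = hi
        # Jump past the remaining rows of this metric: metric + "\0" is the
        # smallest string that sorts after `metric`.
        pos = bisect_left(sorted_keys, (metric + "\0",), pos)
    return result
```

The binary-search seeks play the role of the server jumping between the key ranges listed in the plan; rows for other hosts are never visited.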
TopN
SELECT * FROM METRIC_RECORD WHERE SERVER_TIME > NOW() - 7 ORDER BY HOSTNAME LIMIT 5;
CLIENT 4-CHUNK PARALLEL 4-WAY FULL SCAN OVER METRIC_RECORD
    SERVER FILTER BY SERVER_TIME > …
    SERVER TOP 5 ROWS SORTED BY [HOSTNAME]
CLIENT MERGE SORT
[Diagram: the client issues four parallel scans over Region1 to Region4 on RS 1 to RS 3; each scan sorts by HOSTNAME and returns only 5 rows.]
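The TopN plan splits the work in two: each scan keeps only its own top rows, and the client merges those short sorted lists. Here is a small sketch of that split, assuming rows are dictionaries with `hostname` and `server_time` fields; the function names are invented for the example.

```python
import heapq
from itertools import islice


def server_top_n(region_rows, n, min_time):
    """Per-region work: filter, then keep only the n smallest rows by hostname."""
    kept = (r for r in region_rows if r["server_time"] > min_time)
    return heapq.nsmallest(n, kept, key=lambda r: r["hostname"])


def client_merge_sort(partials, n):
    """Client work: merge the already-sorted partial results and cut at n."""
    merged = heapq.merge(*partials, key=lambda r: r["hostname"])
    return list(islice(merged, n))
```

With four regions and LIMIT 5, the client receives at most 20 rows no matter how large the table is.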
Aggregation
SELECT METRIC_NAME, HOSTNAME, AVG(METRIC_VALUE) FROM METRIC_RECORD WHERE SERVER_TIME > NOW() - 7 GROUP BY METRIC_NAME, HOSTNAME ORDER BY METRIC_NAME, HOSTNAME;
CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD
    SERVER FILTER BY SERVER_TIME > …
    SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [METRIC_NAME, HOSTNAME]
CLIENT MERGE SORT
[Diagram: the client issues four parallel scans over Region1 to Region4 on RS 1 to RS 3; each scan returns only data aggregated by METRIC_NAME, HOSTNAME.]
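Server-side aggregation works because an average can be rebuilt from partial sums and counts. The sketch below illustrates that idea under simple assumptions: rows are (metric, host, value) tuples and each region returns one (sum, count) pair per group. The function names are hypothetical.

```python
from collections import defaultdict


def server_aggregate(region_rows):
    """Per-region work: reduce rows to one (sum, count) pair per group."""
    partial = defaultdict(lambda: [0.0, 0])
    for metric, host, value in region_rows:
        acc = partial[(metric, host)]
        acc[0] += value
        acc[1] += 1
    return dict(partial)


def client_merge(partials):
    """Client work: combine partial (sum, count) pairs, then compute AVG."""
    total = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for group, (s, c) in partial.items():
            total[group][0] += s
            total[group][1] += c
    # Sorted output mirrors ORDER BY METRIC_NAME, HOSTNAME.
    return [(group, s / c) for group, (s, c) in sorted(total.items())]
```

Averaging the per-region averages directly would be wrong when regions hold different numbers of rows for a group, which is why the partial results carry sums and counts.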
Joins and subqueries in Phoenix
Grammar
• Inner, left, right, full outer join, cross join
• Semi-join / anti-join
Algorithms
• Hash join, sort-merge join
• The hash-join table is computed and pushed to each RegionServer from the client
Optimizations
• Predicate push-down
• PK-to-FK join optimization
• Global index with missing columns
• Correlated query rewrite
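The hash join described above can be sketched as a broadcast join: build a hash table from the smaller side once, ship it to every server, and probe it while scanning the bigger side. This is a simplified illustration with rows as dictionaries and inner-join semantics only; `build_hash_table` and `probe_region` are names chosen for the example.

```python
def build_hash_table(small_rows, key):
    """Client side: index the smaller relation by its join key."""
    table = {}
    for row in small_rows:
        table.setdefault(row[key], []).append(row)
    return table


def probe_region(region_rows, hash_table, key):
    """Server side: inner-join one region's rows against the shipped table."""
    joined = []
    for row in region_rows:
        for match in hash_table.get(row[key], []):
            joined.append({**match, **row})
    return joined
```

The shipped table has to fit in memory on every server, which is consistent with the deck's remark that very big joins are not a strength of this approach.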
Joins and subqueries in Phoenix
Phoenix can execute most of the TPC-H queries!
No nested loop join
With Calcite support, more improvements soon
No statistics-guided join selection yet
Not very good at executing very big joins
• No generic YARN / Tez execution layer
• But Hive / Spark support for generic DAG execution
Secondary Indexes
An HBase table is a sorted map
• Everything in HBase is sorted in primary key order
• Full or partial scans in sort order are very efficient in HBase
• Sort data differently with secondary index dimensions
Two types
• Global index
• Local index
Query
• Indexes are “covered”
• Indexes are automatically selected from queries
• Only covered columns are returned from the index, without going back to the data table
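A covered index is essentially a second sorted copy of the data, keyed in a different order and carrying the columns a query needs. The sketch below models that with plain Python structures; the table contents and the helper `values_for_host` are made up, and a real global index is a separate HBase table rather than a list.

```python
from bisect import bisect_left

# Hypothetical data table keyed by (metric, host); the value is the payload.
data = {("m1", "h2"): 5.0, ("m2", "h1"): 7.0, ("m1", "h1"): 3.0}

# A covered index: re-sorted by host first, and it also stores the value, so
# queries by host can be answered from the index alone.
index = sorted((host, metric, value) for (metric, host), value in data.items())


def values_for_host(host):
    """Range-scan the index for one host without touching the data table."""
    start = bisect_left(index, (host,))
    out = []
    for h, metric, value in index[start:]:
        if h != host:
            break
        out.append((metric, value))
    return out
```

Without the index, a lookup by host would have to scan the whole data table, because host is not the leading part of the primary key.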
Global and Local Index
Global Index
• A single instance for all table data, in a different sort order
• A different HBase table per index
• Optimized for read-heavy use cases
• Can be one edit “behind” the actual primary data
• Indexes on transactional tables have ACID guarantees
• Different consistency / durability for mutable / immutable tables
Local Index
• Multiple mini-instances per region
• Uses the same HBase table, different column family
• Optimized for write-heavy use cases
• Atomic commit and visibility (coming soon)
• Queries have to ask all regions for relevant data from the index
Part II – The Present
All the recent stuff!
Release Note Highlights
4.4
• Functional indexes
• UDFs
• Query Server
• UNION ALL
• MR index build
• Spark integration
• Date built-in functions
4.5
• Client-side per-statement metrics
• SELECT without FROM
• ALTER TABLE with VIEWS
• Math and array built-in functions
Release Note Highlights
4.6
• ROW_TIMESTAMP for HBase native timestamps
• Support for correlate variables
• Support for un-nesting arrays
• Web app for visualizing trace info (alpha)
4.7
• Transaction support
• Enhanced secondary index consistency guarantees
• Statistics improvements
• Perf improvements
Row Timestamps
A pseudo-column for HBase native timestamps (versions)
Enables setting and querying cell timestamps
Perfect for time-series use cases
• Combine with FIFO / Date Tiered Compaction policies
• And HBase scan file pruning based on min-max ts for very efficient scans
CREATE TABLE METRICS_TABLE (
  CREATED_DATE DATE NOT NULL,
  METRIC_ID CHAR(15) NOT NULL,
  METRIC_VALUE BIGINT
  CONSTRAINT PK PRIMARY KEY (CREATED_DATE ROW_TIMESTAMP, METRIC_ID))
  SALT_BUCKETS = 8;
Transactions
Uses Tephra
Snapshot isolation semantics
Completely optional
• Can be enabled per table (TRANSACTIONAL=true)
• Transactional and non-transactional tables can live side by side
Transactions see their own uncommitted data
Released in 4.7, will GA in 5.0
Optimistic concurrency control
• No locking for rows
• Transactions have to roll back and undo their writes in case of conflict
• Cost of conflict is higher
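Optimistic concurrency control can be illustrated with a toy in-memory store: nobody takes row locks, and a transaction only finds out at commit time whether someone else wrote the same key first. This is a deliberately simplified sketch and not how Tephra is implemented. It buffers writes in the transaction instead of persisting and undoing them, it does not snapshot reads, and the classes `Store`, `Transaction`, and `ConflictError` are invented for the example.

```python
class ConflictError(Exception):
    """Raised when another transaction committed an overlapping write first."""


class Store:
    def __init__(self):
        self.data = {}         # key -> committed value
        self.last_commit = {}  # key -> commit timestamp of the latest write
        self.clock = 0

    def begin(self):
        return Transaction(self, start_ts=self.clock)


class Transaction:
    def __init__(self, store, start_ts):
        self.store, self.start_ts, self.writes = store, start_ts, {}

    def get(self, key):
        # A transaction sees its own uncommitted writes first.
        return self.writes.get(key, self.store.data.get(key))

    def put(self, key, value):
        self.writes[key] = value  # buffered, no row lock taken

    def commit(self):
        # Conflict check: did anyone commit one of our keys after we started?
        for key in self.writes:
            if self.store.last_commit.get(key, -1) > self.start_ts:
                self.writes.clear()  # roll back
                raise ConflictError(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.last_commit[key] = self.store.clock
```

Two transactions that start together and write the same key both proceed without blocking; the first commit wins and the second raises `ConflictError`, which is the "cost of conflict" the slide refers to.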
Tephra Architecture
[Diagram: components shown are a Tephra / HBase client, RegionServers 1 to 3, a ZooKeeper quorum, and an active and a standby Tephra transaction manager.]
Transaction Lifecycle
From Tephra presentation http://www.slideshare.net/alexbaranau/transactions-over-hbase
Phoenix Query Server
Similar to HBase REST Server / Hive Server 2
Built on top of Calcite’s Avatica Server with Phoenix bindings
Embeds a Phoenix thick client inside
No client-side sorting / join!
Protobuf-3.0 over HTTP protocol
Has a (thin) JDBC driver
Allows an ODBC driver for Phoenix
Phoenix architecture revisited (thick client)
[Diagram: the application embeds the Phoenix client / JDBC driver and the HBase client, and talks directly to the Phoenix RPC endpoints (px) on RegionServers 1 to 3.]
Phoenix Query Server (thin client)
[Diagram: the application uses the Phoenix thin client / JDBC driver to talk to three Phoenix Query Server instances; each Query Server embeds a Phoenix client / JDBC driver and an HBase client, which talk to the Phoenix RPC endpoints (px) on RegionServers 1 to 3.]
Other new features (4.8+)
Shaded client by default. No more library dependency problems!
Phoenix schema mapping to HBase namespace
• Allows using isolation and security features of HBase namespaces
• Standard SQL syntax: CREATE SCHEMA FOO; USE FOO;
LIMIT / OFFSET
• We already had LIMIT. Now we have OFFSET
• Together with row value constructors, covers most cursor use cases
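Paging with a row value constructor means asking for the rows whose key tuple is greater than the last one already seen, while OFFSET simply skips a fixed number of rows. Python tuples compare the same way, column by column, so the two styles can be sketched as below. The function names and sample keys are assumptions for illustration, not a Phoenix API.

```python
from bisect import bisect_right


def next_page(sorted_keys, last_seen, page_size):
    """Return the page that follows `last_seen` (None for the first page).

    Mirrors a query of the shape
    WHERE (k1, k2) > (last_k1, last_k2) ORDER BY k1, k2 LIMIT page_size.
    """
    start = 0 if last_seen is None else bisect_right(sorted_keys, last_seen)
    return sorted_keys[start:start + page_size]


def page_by_offset(sorted_keys, offset, page_size):
    """Mirrors LIMIT page_size OFFSET offset."""
    return sorted_keys[offset:offset + page_size]
```

Both return the same page here; the key-based form lets the scan start at the remembered position, while the offset form is convenient when the caller only knows a page number.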
Part III – The Future
All the upcoming stuff!
Local Index
• Local index re-implemented
• Instead of a different table, local index data is now kept within the same data table
• Local index data goes into a different column family
• Index and data are committed together atomically without external transactions
• A bunch of stability improvements with region splits and merges
Calcite Integration
Calcite is a framework for:
• Query parser
• Compiler
• Planner
• Cost-based optimizer
SQL-92 compliant
Based on relational algebra
Cost-based optimizer with default rules + pluggable rules per backend
Used by Hive / Drill / Kylin / Samza, etc.
Calcite Integration
Phoenix - Hive integration
Hive is a very rich and generic execution engine
Uses Tez + YARN to execute arbitrary DAGs
Hive integration enables big joins and other Hive features
Phoenix DDL with HiveQL
Data insert / update / delete (DML) with HiveQL
Predicate pushdown, salting, partitioning, partition pruning, etc.
Can use secondary indexes as well, since it uses the Phoenix compiler
https://issues.apache.org/jira/browse/PHOENIX-2743
Future<Phoenix>
JSON support
TPC-H / MicroStrategy / Tableau queries
Sqoop integration
Support Omid-based transactions
Dogfooding within the Hadoop ecosystem
• Ambari Metrics Service (AMS) uses Phoenix
• YARN will soon use HBase / Phoenix (ATS)
STRUCT type
Improvements to cost-based optimization
Security and other HBase features used from Phoenix
See https://phoenix.apache.org/roadmap.html
Further Reference
Even more info on https://phoenix.apache.org
New features: https://phoenix.apache.org/recent.html
Roadmap: https://phoenix.apache.org/roadmap.html
Get involved in the mailing lists: [email protected] [email protected]
Thanks
Q & A