HBase and Impala Notes - Munich HUG - 20131017
Transcript of HBase and Impala Notes - Munich HUG - 20131017
1
HBase and Impala Use Cases for fast SQL queries
2
About Me
• EMEA Chief Architect @ Cloudera (3+ years) • Consulting on Hadoop projects (everywhere)
• Apache Committer • HBase and Whirr
• O'Reilly Author • HBase – The Definitive Guide
• Now in Japanese!
• Contact • [email protected] • @larsgeorge
The Japanese edition is out too!
3
Agenda
• "Introduction" to HBase
• Impala Architecture
• Mapping Schemas
• Query Considerations
4
Intro To HBase (Slides 4 to 250)
5
What is HBase?
This is HBase!
HBase
6
What is HBase?
This is HBase!
HBase Really though… RTFM! (there are at least two good books about it)
7
IOPS vs Throughput Mythbusters
It is all physics in the end: you cannot solve an I/O problem without reducing I/O in general. Parallelize access and read/write sequentially.
8
HBase: Strengths & Weaknesses
Strengths:
• Random access to small(ish) key-value pairs
• Rows and columns stored sorted lexicographically
• Adds table and region concepts to group related KVs
• Stores and reads data sequentially
• Parallelizes across all clients
• Non-blocking I/O throughout
9
Using HBase Strengths
10
HBase “Indexes”
• Use primary keys, aka the row keys, as a sorted index
  • One sort direction only
  • Use a "secondary index" to get reverse sorting
    • Lookup table or same table
• Use secondary keys, aka the column qualifiers, as a sorted index within the main record
  • Use prefixes within a column family or separate column families
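The reverse-sorting trick above can be sketched in a few lines. This is an illustration, not from the slides: HBase only scans keys in ascending byte order, so storing `MAX - timestamp` (zero-padded to a fixed width) as the index key makes an ascending scan return the newest entries first.

```python
# Sketch: a "secondary index" key that inverts sort order.
# MAX_TS mirrors Java's Long.MAX_VALUE; the padding width is an
# illustrative assumption so lexicographic order equals numeric order.
MAX_TS = 2**63 - 1

def reverse_key(ts: int) -> bytes:
    return b"%020d" % (MAX_TS - ts)

events = [1000, 3000, 2000]                           # timestamps as written
scan_order = sorted(reverse_key(t) for t in events)   # what a Scan returns
decoded = [MAX_TS - int(k) for k in scan_order]
print(decoded)  # [3000, 2000, 1000] - newest first
```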
11
HBase: Strengths & Weaknesses
Weaknesses:
• Not optimized (yet) for 100% of the possible throughput of the underlying storage layer
  • And HDFS is not fully optimized either
• Single-writer issue with WALs
• Single-server hotspotting with non-distributed keys
12
HBase Dilemma
Although HBase can host many applications, they may require completely opposite features:
Events | Entities
Time Series | Message Store
13
Opposite Use-‐Case
• Entity Store
  • Regular (random) updates and inserts into existing entities
  • Causes entity details to be spread over many files
  • Needs to read a lot of data to reconstitute the "logical" view
  • Writing is often nicely distributed (can be hashed)
• Event Store
  • One-off inserts of events such as log entries
  • Access is often a scan over partitions by time
  • Reads are efficient due to the sequential write pattern
  • Writes need to be taken care of to avoid hotspotting
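The "can be hashed" point above can be sketched as follows; the bucket count and key layout are illustrative assumptions, not from the slides:

```python
# Sketch: salting row keys with a hash-derived bucket prefix so that
# monotonically growing keys spread across region servers.
import hashlib

NUM_BUCKETS = 8  # illustrative; would typically match the region/server count

def salted_key(entity_id: str) -> bytes:
    bucket = int(hashlib.md5(entity_id.encode()).hexdigest(), 16) % NUM_BUCKETS
    return b"%d-%s" % (bucket, entity_id.encode())

keys = [salted_key("user%04d" % i) for i in range(1000)]
used = {k.split(b"-", 1)[0] for k in keys}
print(len(used))  # writes are now spread over NUM_BUCKETS key ranges
```

The trade-off: reads for a key range must now fan out over all buckets, which is why this suits write-heavy entity/event stores rather than scan-heavy reporting.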
14
Impala Architecture
15
Beyond Batch
For some things MapReduce is just too slow.
Apache Hive:
• MapReduce execution engine
• High latency, low throughput
• High runtime overhead
Google realized this early on
• Analysts wanted fast, interactive results
16
Dremel
Google paper (2010): "scalable, interactive ad-hoc query system for analysis of read-only nested data"
Columnar storage format
Distributed, scalable aggregation: "capable of running aggregation queries over trillion-row tables in seconds"
http://research.google.com/pubs/pub36632.html
17
Impala: Goals
• General-purpose SQL query engine for Hadoop
• For analytical and transactional workloads
• Supports queries that take ms to hours
• Runs directly within Hadoop
  • Collocated daemons
  • Same file formats
  • Same storage managers (NN, metastore)
18
Impala: Goals
• High performance
  • C++
  • Runtime code generation (LLVM)
  • Direct access to data (no MapReduce)
• Retains the user experience
  • Easy for Hive users to migrate
• 100% open source
19
Impala: Architecture
• impalad
  • Runs on every node
  • Handles client requests (ODBC, Thrift)
  • Handles query planning & execution
• statestored
  • Provides name service
  • Metadata distribution
  • Used for finding data
20
Impala: Architecture
(Slides 20-23 repeat this title over architecture diagrams; no further text.)
24
Mapping Schemas: HBase to Typed Schema
25
Binary to Types
• HBase only has binary keys and values
• Hive and Impala share the same metastore, which adds types to each column
  • Can use the Hive or Impala shell to change metadata
• The row key of an HBase table is mapped to a column in the metastore, i.e. on the SQL side
  • Impala prefers the "string" type to better support comparisons and sorting
26
Defining the Schema
CREATE TABLE hbase_table_1 (
  key string,
  value string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf1:val"
)
TBLPROPERTIES (
  "hbase.table.name" = "xyz"
);
27
Defining the Schema
CREATE TABLE hbase_table_1 (
  key string,
  value string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf1:val"   <- maps columns to fields
)
TBLPROPERTIES (
  "hbase.table.name" = "xyz"
);
28
Mapping Options
• Can create a new table or map to an existing one
  • CREATE TABLE vs.
  • CREATE EXTERNAL TABLE
• Creating a table through Hive or Impala does not set any table or column family properties
  • Typically not a good idea to rely on defaults
  • Better to specify compression, TTLs, etc. on the HBase side and then map it as an external table
29
Mapping Options
SERDE properties to map columns to fields:
• hbase.columns.mapping
  • Matching count of entries required (on the SQL side only)
  • Spaces are not allowed (as they are valid characters in HBase)
  • The ":key" mapping is a special one for the HBase row key
  • Otherwise: column-family-name:[column-name][#(binary|string)]
• hbase.table.default.storage.type
  • Can be string (the default) or binary
  • Defines the default type
  • Binary means data is treated like the HBase Bytes class does
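The string-vs-binary distinction matters for sort order, since HBase compares raw bytes. A small pure-Python illustration (no HBase needed); `struct.pack(">i", n)` stands in for what the HBase Bytes class produces for ints:

```python
# Sketch: numbers stored as strings sort lexicographically, while
# fixed-width big-endian binary keeps numeric order (for non-negative
# values) under a raw byte comparison.
import struct

nums = [2, 10, 9, 100]

as_strings = sorted(str(n).encode() for n in nums)
as_binary  = sorted(struct.pack(">i", n) for n in nums)

print([int(b) for b in as_strings])                    # [10, 100, 2, 9]
print([struct.unpack(">i", b)[0] for b in as_binary])  # [2, 9, 10, 100]
```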
30
Mapping Limits
• Only one (1) ":key" is allowed
  • But it can be inserted into the SQL schema at will
• Access to HBase KV versions is not supported (yet)
  • Always returns the latest version by default
  • This is very similar to what a database user expects
• HBase columns not mapped are not visible on the SQL side
• Since row keys in HBase are unique, results may vary
  • Inserting duplicate keys updates the row while the count of rows stays the same
  • INSERT OVERWRITE does not delete existing rows but rather updates them (HBase is mutable after all!)
31
Query Considerations
32
HBase Table Scan
$ hbase shell
hbase(main):001:0> list
xyz
1 row(s) in 0.0530 seconds
hbase(main):002:0> describe "xyz"
DESCRIPTION                                                            ENABLED
 {NAME => 'xyz', FAMILIES => [{NAME => 'cf1', COMPRESSION => 'NONE', VE true
 RSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>
 'false', BLOCKCACHE => 'true'}]}
1 row(s) in 0.0220 seconds
hbase(main):003:0> scan "xyz"
ROW                          COLUMN+CELL
0 row(s) in 0.0060 seconds
Table was created
Table empty
33
HBase Table Scan
Insert data from an existing table into the HBase-backed one:
INSERT OVERWRITE TABLE hbase_table_1 \
SELECT * FROM pokes WHERE foo=98;
Verify on HBase side:
hbase(main):009:0> scan "xyz"
ROW                          COLUMN+CELL
 98                          column=cf1:val, timestamp=1267737987733, value=val_98
1 row(s) in 0.0110 seconds
34
Pro Tip: http://gethue.com/
35
HBase Scans under the Hood
Impala uses Scan instances under the hood, just as the native Java API does. This allows for all scan optimizations, e.g. predicate push-down, like:
• Start and stop row
• Server-side filters
• Scanner caching (but not batching yet)
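The effect of start and stop rows can be mimicked without a cluster; `bisect` over a sorted list stands in for HBase's sorted region/HFile lookup (an illustration, not Impala's actual code):

```python
# Sketch: with sorted keys, a scan seeks to the start row and stops at
# the stop row instead of reading the whole table.
import bisect

sorted_keys = [b"row%05d" % i for i in range(100000)]  # already sorted

def scan(start: bytes, stop: bytes):
    """Keys in [start, stop) - the semantics of an HBase Scan range."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]

rows = scan(b"row00010", b"row00013")
print(rows)  # [b'row00010', b'row00011', b'row00012']
```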
36
Configure HBase Scan Details
In impala-shell:
• Same as calling setCacheBlocks(true) or setCacheBlocks(false)
set hbase_cache_blocks=true;
set hbase_cache_blocks=false;
• Same as calling setCaching(rows)
set hbase_caching=1000;
37
HBase Scans under the Hood
Back to physics: a scan can only perform well if as little data as possible is read.
• Need to issue queries that are known not to be full table scans
• This requires careful schema design!
Typical use cases are:
• OLAP cube: read report data from a single row
• Time series: read fine-grained, time-partitioned data
38
OLAP Example
• Facebook Insights uses HBase to keep an OLAP cube live, i.e. fully materialized
  • Each row reflects one tracked page and contains all its data points
  • All dimensions with time-bracket prefixes plus TTLs
  • During report time only one or very few rows are read
• Design favors read over write performance
• Could also think about a hybrid system:
  • CEP + HBase + HDFS (Parquet)
39
Time Series Example
• OpenTSDB writes the metric events bucketed by metric ID and then timestamp
  • Helps using all servers in the cluster equally
• During reporting/dashboarding the data is read for specific metrics within a specific time frame
• Sorted data translates into effective use of Scan with start and stop rows
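The OpenTSDB bucketing above can be sketched roughly as follows; the 3-byte metric ID and hourly bucket mirror OpenTSDB's defaults, but the exact layout here is simplified:

```python
# Sketch: row key = metric ID + hour-aligned base timestamp, so all
# points of one metric in one hour share a row and sort contiguously.
import struct

HOUR = 3600

def row_key(metric_id: int, ts: int) -> bytes:
    base = ts - (ts % HOUR)                      # align to the hour bucket
    return struct.pack(">I", metric_id)[1:] + struct.pack(">I", base)

k1 = row_key(1, 7210)    # 02:00:10
k2 = row_key(1, 10799)   # 02:59:59 - same hour, same row key
k3 = row_key(1, 10800)   # 03:00:00 - next bucket, sorts after
print(k1 == k2, k1 < k3)
```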
40
Final Notes
Since HBase scan performance is mainly influenced by the number of rows scanned, you need to issue queries that are selective, i.e. scan only certain rows and not the entire table.
This requires WHERE clauses with the HBase row key in them:
SELECT f1, f2, f3 FROM mapped_table
WHERE key >= "user1234" AND key < "user1235";
"Scan all rows for user 1234, i.e. that have a row key starting with user1234" - this might be a composite key!
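A prefix query like the one above maps to a [start, stop) scan range; the stop row is the prefix with its last byte incremented (a sketch that ignores the 0xff edge case):

```python
# Sketch: derive the scan range that matches every key starting with a
# given prefix - exactly what 'key >= "user1234" AND key < "user1235"'
# expresses in the query above.
def prefix_range(prefix: bytes):
    assert prefix and prefix[-1] != 0xFF  # simplification for this sketch
    return prefix, prefix[:-1] + bytes([prefix[-1] + 1])

start, stop = prefix_range(b"user1234")
keys = [b"user1233-z", b"user1234-a", b"user1234-b", b"user1235-a"]
matched = [k for k in keys if start <= k < stop]
print(matched)  # [b'user1234-a', b'user1234-b']
```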
41
Example
42
Final Notes
Not using the primary HBase index, aka the row key, results in a full table scan and might take much longer when you have a large table:
SELECT f1, f2, f3 FROM mapped_table
WHERE f1 = "value1" OR f20 < "200";
This will result in a full table scan. Remember: it is all just physics!
43
Final Notes
Impala also uses HBase's SingleColumnValueFilter to reduce transferred data:
• Filters out entire rows by checking a given column value
• Does not skip rows, since no index or Bloom filter is available to help identify the next match
Overall this helps, yet cannot do any magic (physics again!)
44
Final Notes
Some advice on tall-narrow vs. flat-wide table layout: store data in a tall and narrow table, since there is currently no support for scanner batching (i.e. intra-row scanning). Mapping, for example, one million HBase columns into SQL is futile. This is still true for Hive's Map support, since the entire row has to fit into memory!
45
Outlook
Future work:
• Composite keys: map multiple SQL fields onto a single composite HBase row key
• Expose KV versions to the SQL schema
• Better predicate pushdown
  • Advanced filters or indexes?