HBase and Impala Notes - Munich HUG - 20131017
Transcript of HBase and Impala Notes - Munich HUG - 20131017
1
HBase and Impala Use Cases for fast SQL queries
2
About Me
• EMEA Chief Architect @ Cloudera (3+ years) • Consulting on Hadoop projects (everywhere)
• Apache Committer • HBase and Whirr
• O'Reilly Author • HBase – The Definitive Guide
• Now in Japanese!
• Contact • [email protected] • @larsgeorge
The Japanese edition is out too!
3
Agenda
• "Introduction" to HBase
• Impala Architecture
• Mapping Schemas
• Query Considerations
4
Intro To HBase (Slides 4 to 250)
5
What is HBase?
This is HBase!
HBase
6
What is HBase?
This is HBase!
HBase Really though… RTFM! (there are at least two good books about it)
7
IOPS vs Throughput Mythbusters
It is all physics in the end: you cannot solve an I/O problem without reducing I/O in general. Parallelize access and read/write sequentially.
8
HBase: Strengths & Weaknesses
Strengths:
• Random access to small(ish) key-value pairs
• Rows and columns stored sorted lexicographically
• Adds table and region concepts to group related KVs
• Stores and reads data sequentially
• Parallelizes across all clients
• Non-blocking I/O throughout
9
Using HBase Strengths
10
HBase “Indexes”
• Use primary keys, aka the row keys, as a sorted index
  • One sort direction only
  • Use a "secondary index" to get reverse sorting
    • Lookup table or same table
• Use secondary keys, aka the column qualifiers, as a sorted index within the main record
  • Use prefixes within a column family or separate column families
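The reverse-sorting trick above can be sketched in a few lines. This is an illustration, not from the slides: HBase only scans keys in ascending byte order, so storing `MAX - timestamp` (zero-padded to a fixed width) as the index key makes an ascending scan return the newest entries first.

```python
# Sketch: a "secondary index" key that inverts sort order.
# MAX_TS mirrors Java's Long.MAX_VALUE; the padding width is an
# illustrative assumption so lexicographic order equals numeric order.
MAX_TS = 2**63 - 1

def reverse_key(ts: int) -> bytes:
    return b"%020d" % (MAX_TS - ts)

events = [1000, 3000, 2000]                           # timestamps as written
scan_order = sorted(reverse_key(t) for t in events)   # what a Scan returns
decoded = [MAX_TS - int(k) for k in scan_order]
print(decoded)  # [3000, 2000, 1000] - newest first
```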
11
HBase: Strengths & Weaknesses
Weaknesses:
• Not optimized (yet) for 100% of the possible throughput of the underlying storage layer
  • And HDFS is not fully optimized either
• Single-writer issue with WALs
• Single-server hotspotting with non-distributed keys
12
HBase Dilemma
Although HBase can host many applications, they may require completely opposite features:
Events | Entities
Time Series | Message Store
13
Opposite Use-‐Case
• Entity Store
  • Regular (random) updates and inserts into existing entities
  • Causes entity details to be spread over many files
  • Needs to read a lot of data to reconstitute the "logical" view
  • Writing is often nicely distributed (can be hashed)
• Event Store
  • One-off inserts of events such as log entries
  • Access is often a scan over partitions by time
  • Reads are efficient due to the sequential write pattern
  • Writes need to be taken care of to avoid hotspotting
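The "can be hashed" point above can be sketched as follows; the bucket count and key layout are illustrative assumptions, not from the slides:

```python
# Sketch: salting row keys with a hash-derived bucket prefix so that
# monotonically growing keys spread across region servers.
import hashlib

NUM_BUCKETS = 8  # illustrative; would typically match the region/server count

def salted_key(entity_id: str) -> bytes:
    bucket = int(hashlib.md5(entity_id.encode()).hexdigest(), 16) % NUM_BUCKETS
    return b"%d-%s" % (bucket, entity_id.encode())

keys = [salted_key("user%04d" % i) for i in range(1000)]
used = {k.split(b"-", 1)[0] for k in keys}
print(len(used))  # writes are now spread over NUM_BUCKETS key ranges
```

The trade-off: reads for a key range must now fan out over all buckets, which is why this suits write-heavy entity/event stores rather than scan-heavy reporting.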
14
Impala Architecture
15
Beyond Batch
For some things MapReduce is just too slow.
Apache Hive:
• MapReduce execution engine
• High latency, low throughput
• High runtime overhead
Google realized this early on
• Analysts wanted fast, interactive results
16
Dremel
Google paper (2010): "scalable, interactive ad-hoc query system for analysis of read-only nested data"
Columnar storage format
Distributed, scalable aggregation: "capable of running aggregation queries over trillion-row tables in seconds"
http://research.google.com/pubs/pub36632.html
17
Impala: Goals
• General-purpose SQL query engine for Hadoop
• For analytical and transactional workloads
• Supports queries that take ms to hours
• Runs directly within Hadoop
  • Collocated daemons
  • Same file formats
  • Same storage managers (NN, metastore)
18
Impala: Goals
• High performance
  • C++
  • Runtime code generation (LLVM)
  • Direct access to data (no MapReduce)
• Retains the user experience
  • Easy for Hive users to migrate
• 100% open source
19
Impala: Architecture
• impalad
  • Runs on every node
  • Handles client requests (ODBC, Thrift)
  • Handles query planning & execution
• statestored
  • Provides name service
  • Metadata distribution
  • Used for finding data
20
Impala: Architecture
(Slides 20-23 repeat this title over architecture diagrams; no further text.)
24
Mapping Schemas: HBase to Typed Schema
25
Binary to Types
• HBase only has binary keys and values
• Hive and Impala share the same metastore, which adds types to each column
  • Can use the Hive or Impala shell to change metadata
• The row key of an HBase table is mapped to a column in the metastore, i.e. on the SQL side
  • Impala prefers the "string" type to better support comparisons and sorting
26
Defining the Schema
CREATE TABLE hbase_table_1 (
  key string,
  value string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf1:val"
)
TBLPROPERTIES (
  "hbase.table.name" = "xyz"
);
27
Defining the Schema
CREATE TABLE hbase_table_1 (
  key string,
  value string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf1:val"   <- maps columns to fields
)
TBLPROPERTIES (
  "hbase.table.name" = "xyz"
);
28
Mapping Options
• Can create a new table or map to an existing one
  • CREATE TABLE vs.
  • CREATE EXTERNAL TABLE
• Creating a table through Hive or Impala does not set any table or column family properties
  • Typically not a good idea to rely on defaults
  • Better to specify compression, TTLs, etc. on the HBase side and then map it as an external table
29
Mapping Options
SERDE properties to map columns to fields:
• hbase.columns.mapping
  • Matching count of entries required (on the SQL side only)
  • Spaces are not allowed (as they are valid characters in HBase)
  • The ":key" mapping is a special one for the HBase row key
  • Otherwise: column-family-name:[column-name][#(binary|string)]
• hbase.table.default.storage.type
  • Can be string (the default) or binary
  • Defines the default type
  • Binary means data is treated like the HBase Bytes class does
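The string-vs-binary distinction matters for sort order, since HBase compares raw bytes. A small pure-Python illustration (no HBase needed); `struct.pack(">i", n)` stands in for what the HBase Bytes class produces for ints:

```python
# Sketch: numbers stored as strings sort lexicographically, while
# fixed-width big-endian binary keeps numeric order (for non-negative
# values) under a raw byte comparison.
import struct

nums = [2, 10, 9, 100]

as_strings = sorted(str(n).encode() for n in nums)
as_binary  = sorted(struct.pack(">i", n) for n in nums)

print([int(b) for b in as_strings])                    # [10, 100, 2, 9]
print([struct.unpack(">i", b)[0] for b in as_binary])  # [2, 9, 10, 100]
```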
30
Mapping Limits
• Only one (1) ":key" is allowed
  • But it can be inserted into the SQL schema at will
• Access to HBase KV versions is not supported (yet)
  • Always returns the latest version by default
  • This is very similar to what a database user expects
• HBase columns not mapped are not visible on the SQL side
• Since row keys in HBase are unique, results may vary
  • Inserting duplicate keys updates the row while the count of rows stays the same
  • INSERT OVERWRITE does not delete existing rows but rather updates them (HBase is mutable after all!)
31
Query Considerations
32
HBase Table Scan
$ hbase shell
hbase(main):001:0> list
xyz
1 row(s) in 0.0530 seconds
hbase(main):002:0> describe "xyz"
DESCRIPTION                                                            ENABLED
 {NAME => 'xyz', FAMILIES => [{NAME => 'cf1', COMPRESSION => 'NONE', VE true
 RSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>
 'false', BLOCKCACHE => 'true'}]}
1 row(s) in 0.0220 seconds
hbase(main):003:0> scan "xyz"
ROW                          COLUMN+CELL
0 row(s) in 0.0060 seconds
Table was created
Table empty
33
HBase Table Scan
Insert data from an existing table into the HBase-backed one:
INSERT OVERWRITE TABLE hbase_table_1 \
SELECT * FROM pokes WHERE foo=98;
Verify on HBase side:
hbase(main):009:0> scan "xyz"
ROW                          COLUMN+CELL
 98                          column=cf1:val, timestamp=1267737987733, value=val_98
1 row(s) in 0.0110 seconds
34
Pro Tip: http://gethue.com/
35
HBase Scans under the Hood
Impala uses Scan instances under the hood, just as the native Java API does. This allows for all scan optimizations, e.g. predicate push-down, like:
• Start and stop row
• Server-side filters
• Scanner caching (but not batching yet)
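The effect of start and stop rows can be mimicked without a cluster; `bisect` over a sorted list stands in for HBase's sorted region/HFile lookup (an illustration, not Impala's actual code):

```python
# Sketch: with sorted keys, a scan seeks to the start row and stops at
# the stop row instead of reading the whole table.
import bisect

sorted_keys = [b"row%05d" % i for i in range(100000)]  # already sorted

def scan(start: bytes, stop: bytes):
    """Keys in [start, stop) - the semantics of an HBase Scan range."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]

rows = scan(b"row00010", b"row00013")
print(rows)  # [b'row00010', b'row00011', b'row00012']
```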
36
Configure HBase Scan Details
In impala-shell:
• Same as calling setCacheBlocks(true) or setCacheBlocks(false)
set hbase_cache_blocks=true;
set hbase_cache_blocks=false;
• Same as calling setCaching(rows)
set hbase_caching=1000;
37
HBase Scans under the Hood
Back to physics: a scan can only perform well if as little data as possible is read.
• Need to issue queries that are known not to be full table scans
• This requires careful schema design!
Typical use cases are:
• OLAP cube: read report data from a single row
• Time series: read fine-grained, time-partitioned data
38
OLAP Example
• Facebook Insights uses HBase to keep an OLAP cube live, i.e. fully materialized
  • Each row reflects one tracked page and contains all its data points
  • All dimensions with time-bracket prefixes plus TTLs
  • During report time only one or very few rows are read
• Design favors read over write performance
• Could also think about a hybrid system:
  • CEP + HBase + HDFS (Parquet)
39
Time Series Example
• OpenTSDB writes the metric events bucketed by metric ID and then timestamp
  • Helps using all servers in the cluster equally
• During reporting/dashboarding the data is read for specific metrics within a specific time frame
• Sorted data translates into effective use of Scan with start and stop rows
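The OpenTSDB bucketing above can be sketched roughly as follows; the 3-byte metric ID and hourly bucket mirror OpenTSDB's defaults, but the exact layout here is simplified:

```python
# Sketch: row key = metric ID + hour-aligned base timestamp, so all
# points of one metric in one hour share a row and sort contiguously.
import struct

HOUR = 3600

def row_key(metric_id: int, ts: int) -> bytes:
    base = ts - (ts % HOUR)                      # align to the hour bucket
    return struct.pack(">I", metric_id)[1:] + struct.pack(">I", base)

k1 = row_key(1, 7210)    # 02:00:10
k2 = row_key(1, 10799)   # 02:59:59 - same hour, same row key
k3 = row_key(1, 10800)   # 03:00:00 - next bucket, sorts after
print(k1 == k2, k1 < k3)
```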
40
Final Notes
Since HBase scan performance is mainly influenced by the number of rows scanned, you need to issue queries that are selective, i.e. scan only certain rows and not the entire table.
This requires WHERE clauses with the HBase row key in them:
SELECT f1, f2, f3 FROM mapped_table
WHERE key >= "user1234" AND key < "user1235";
"Scan all rows for user 1234, i.e. that have a row key starting with user1234" - this might be a composite key!
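A prefix query like the one above maps to a [start, stop) scan range; the stop row is the prefix with its last byte incremented (a sketch that ignores the 0xff edge case):

```python
# Sketch: derive the scan range that matches every key starting with a
# given prefix - exactly what 'key >= "user1234" AND key < "user1235"'
# expresses in the query above.
def prefix_range(prefix: bytes):
    assert prefix and prefix[-1] != 0xFF  # simplification for this sketch
    return prefix, prefix[:-1] + bytes([prefix[-1] + 1])

start, stop = prefix_range(b"user1234")
keys = [b"user1233-z", b"user1234-a", b"user1234-b", b"user1235-a"]
matched = [k for k in keys if start <= k < stop]
print(matched)  # [b'user1234-a', b'user1234-b']
```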
41
Example
42
Final Notes
Not using the primary HBase index, aka the row key, results in a full table scan and might take much longer when you have a large table:
SELECT f1, f2, f3 FROM mapped_table
WHERE f1 = "value1" OR f20 < "200";
This will result in a full table scan. Remember: it is all just physics!
43
Final Notes
Impala also uses HBase's SingleColumnValueFilter to reduce transferred data:
• Filters out entire rows by checking a given column value
• Does not skip rows, since no index or Bloom filter is available to help identify the next match
Overall this helps, yet cannot do any magic (physics again!)
44
Final Notes
Some advice on tall-narrow vs. flat-wide table layout: store data in a tall and narrow table, since there is currently no support for scanner batching (i.e. intra-row scanning). Mapping, for example, one million HBase columns into SQL is futile. This is still true for Hive's Map support, since the entire row has to fit into memory!
45
Outlook
Future work:
• Composite keys: map multiple SQL fields onto a single composite HBase row key
• Expose KV versions to the SQL schema
• Better predicate pushdown
  • Advanced filters or indexes?