HBase and Impala Notes - Munich HUG - 20131017

Description

Talk given during the Munich HUG meetup, 10/17/2013, about how HBase and Impala work together and caveats to watch out for.

Transcript of HBase and Impala Notes - Munich HUG - 20131017

Page 1: HBase and Impala Notes - Munich HUG - 20131017

HBase and Impala Use Cases for fast SQL queries

Page 2: HBase and Impala Notes - Munich HUG - 20131017

About Me

• EMEA Chief Architect @ Cloudera (3+ years)
  • Consulting on Hadoop projects (everywhere)
• Apache Committer
  • HBase and Whirr
• O'Reilly Author
  • HBase – The Definitive Guide
  • Now in Japanese!
• Contact
  • [email protected]
  • @larsgeorge

日本語版も出ました! (The Japanese edition is out now!)

Page 3: HBase and Impala Notes - Munich HUG - 20131017

Agenda

• "Introduction" to HBase
• Impala Architecture
• Mapping Schemas
• Query Considerations

Page 4: HBase and Impala Notes - Munich HUG - 20131017

Intro To HBase (Slide 4 to 250)

Page 5: HBase and Impala Notes - Munich HUG - 20131017

What is HBase?

This is HBase!

HBase

Page 6: HBase and Impala Notes - Munich HUG - 20131017

What is HBase?

This is HBase!

HBase

Really though… RTFM! (there are at least two good books about it)

Page 7: HBase and Impala Notes - Munich HUG - 20131017

IOPS vs Throughput Mythbusters

It is all physics in the end: you cannot solve an I/O problem without reducing I/O in general. Parallelize access and read/write sequentially.

Page 8: HBase and Impala Notes - Munich HUG - 20131017

HBase: Strengths & Weaknesses

Strengths:
• Random access to small(ish) key-value pairs
• Rows and columns stored sorted lexicographically
• Adds table and region concepts to group related KVs
• Stores and reads data sequentially
• Parallelizes across all clients
• Non-blocking I/O throughout

Page 9: HBase and Impala Notes - Munich HUG - 20131017

Using HBase Strengths

Page 10: HBase and Impala Notes - Munich HUG - 20131017

HBase "Indexes"

• Use primary keys, aka the row keys, as a sorted index
  • One sort direction only
  • Use a "secondary index" to get reverse sorting (see the sketch below)
    • Lookup table or same table
• Use secondary keys, aka the column qualifiers, as a sorted index within the main record
  • Use prefixes within a column family or separate column families
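As a rough sketch of the lookup-table approach (all table, family, and key names here are invented for illustration), the same event can be written a second time under an inverted date so that the newest entries sort first:

# hypothetical tables; 'd' is an arbitrary column family
create 'events', 'd'
create 'events_by_latest', 'd'

# primary row, keyed by date (ascending sort order)
put 'events', '20131017-evt42', 'd:payload', '...'

# index row, keyed by the inverted date (99999999 - 20131017 = 79868982), newest first
put 'events_by_latest', '79868982-evt42', 'd:ref', '20131017-evt42'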

Page 11: HBase and Impala Notes - Munich HUG - 20131017

HBase: Strengths & Weaknesses

Weaknesses:
• Not optimized (yet) for 100% of the possible throughput of the underlying storage layer
  • And HDFS is not fully optimized either
• Single-writer issue with WALs
• Single-server hot-spotting with non-distributed keys

Page 12: HBase and Impala Notes - Munich HUG - 20131017

HBase Dilemma

Although HBase can host many applications, they may require completely opposite features:

Events vs. Entities
Time Series vs. Message Store

Page 13: HBase and Impala Notes - Munich HUG - 20131017

Opposite Use-Cases

• Entity Store
  • Regular (random) updates and inserts in existing entities
  • Causes entity details to be spread over many files
  • Needs to read a lot of data to reconstitute the "logical" view
  • Writing is often nicely distributed (keys can be hashed)
• Event Store
  • One-off inserts of events such as log entries
  • Access is often a scan over partitions by time
  • Reads are efficient due to the sequential write pattern
  • Writes need care to avoid hotspotting

Page 14: HBase and Impala Notes - Munich HUG - 20131017

Impala Architecture

Page 15: HBase and Impala Notes - Munich HUG - 20131017

Beyond Batch

For some things MapReduce is just too slow.

Apache Hive:
• MapReduce execution engine
• High latency, low throughput
• High runtime overhead

Google realized this early on:
• Analysts wanted fast, interactive results

Page 16: HBase and Impala Notes - Munich HUG - 20131017

Dremel

Google paper (2010): "scalable, interactive ad-hoc query system for analysis of read-only nested data"

Columnar storage format

Distributed, scalable aggregation: "capable of running aggregation queries over trillion-row tables in seconds"

http://research.google.com/pubs/pub36632.html

Page 17: HBase and Impala Notes - Munich HUG - 20131017

Impala: Goals

• General-purpose SQL query engine for Hadoop
  • For analytical and transactional workloads
  • Supports queries that take from milliseconds to hours
• Runs directly with Hadoop
  • Collocated daemons
  • Same file formats
  • Same storage managers (NN, metastore)

Page 18: HBase and Impala Notes - Munich HUG - 20131017

Impala: Goals

• High performance
  • C++
  • runtime code generation (LLVM)
  • direct access to data (no MapReduce)
• Retain user experience
  • easy for Hive users to migrate
• 100% open source

Page 19: HBase and Impala Notes - Munich HUG - 20131017

Impala: Architecture

• impalad
  • runs on every node
  • handles client requests (ODBC, Thrift)
  • handles query planning & execution
• statestored
  • provides name service
  • metadata distribution
  • used for finding data

Page 20: HBase and Impala Notes - Munich HUG - 20131017

Impala: Architecture

Page 21: HBase and Impala Notes - Munich HUG - 20131017

Impala: Architecture

Page 22: HBase and Impala Notes - Munich HUG - 20131017

Impala: Architecture

Page 23: HBase and Impala Notes - Munich HUG - 20131017

Impala: Architecture

Page 24: HBase and Impala Notes - Munich HUG - 20131017

Mapping Schemas: HBase to Typed Schema

Page 25: HBase and Impala Notes - Munich HUG - 20131017

Binary to Types

• HBase only has binary keys and values
• Hive and Impala share the same metastore, which adds types to each column
  • Can use the Hive or Impala shell to change the metadata
• The row key of an HBase table is mapped to a column in the metastore, i.e. on the SQL side
  • Impala prefers the "string" type to better support comparisons and sorting

Page 26: HBase and Impala Notes - Munich HUG - 20131017

Defining the Schema

CREATE TABLE hbase_table_1 (
  key string,
  value string
)
STORED BY "org.apache.hadoop.hive.hbase.HBaseStorageHandler"
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf1:val"
)
TBLPROPERTIES (
  "hbase.table.name" = "xyz"
);

Page 27: HBase and Impala Notes - Munich HUG - 20131017

Defining the Schema

CREATE TABLE hbase_table_1 (
  key string,
  value string
)
STORED BY "org.apache.hadoop.hive.hbase.HBaseStorageHandler"
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf1:val"
)
TBLPROPERTIES (
  "hbase.table.name" = "xyz"
);

(The "hbase.columns.mapping" entry maps HBase columns to SQL fields.)

Page 28: HBase and Impala Notes - Munich HUG - 20131017

Mapping Options

• Can create a new table or map to an existing one:
  • CREATE TABLE vs. CREATE EXTERNAL TABLE
• Creating a table through Hive or Impala does not set any table or column family properties
  • Typically not a good idea to rely on the defaults
  • Better to specify compression, TTLs, etc. on the HBase side and then map it as an external table (see the sketch below)
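A minimal sketch of that workflow, assuming a hypothetical table sensor_data with family cf1 (the SNAPPY and 30-day TTL settings are just examples): create the table with its properties in the HBase shell, then map it as an external table on the SQL side.

In the HBase shell:

create 'sensor_data', {NAME => 'cf1', COMPRESSION => 'SNAPPY', TTL => 2592000}

Then map it from SQL:

CREATE EXTERNAL TABLE sensor_data_mapped (
  key string,
  reading string
)
STORED BY "org.apache.hadoop.hive.hbase.HBaseStorageHandler"
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:reading")
TBLPROPERTIES ("hbase.table.name" = "sensor_data");

Since the mapping is external, dropping the SQL table does not touch the underlying HBase table.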

   

Page 29: HBase and Impala Notes - Munich HUG - 20131017

Mapping Options

SERDE properties to map columns to fields:
• hbase.columns.mapping
  • A matching count of entries is required (on the SQL side only)
  • Spaces are not allowed (as they are valid characters in HBase)
  • The ":key" mapping is a special one for the HBase row key
  • Otherwise: column-family-name:[column-name][#(binary|string)]
• hbase.table.default.storage.type
  • Can be string (the default) or binary
  • Defines the default type
  • binary means the data is treated the way HBase's Bytes class encodes it
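As a minimal sketch (table and column names are invented), a mapping that stores one column as binary while keeping string as the default could look like this:

CREATE EXTERNAL TABLE counters_mapped (
  key string,
  hits bigint,       -- cf1:hits#binary: stored the way HBase's Bytes class encodes it
  label string       -- cf1:label: plain string storage (the default)
)
STORED BY "org.apache.hadoop.hive.hbase.HBaseStorageHandler"
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf1:hits#binary,cf1:label",
  "hbase.table.default.storage.type" = "string"
)
TBLPROPERTIES ("hbase.table.name" = "counters");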

Page 30: HBase and Impala Notes - Munich HUG - 20131017

Mapping Limits

• Only one (1) ":key" mapping is allowed
  • But it can be placed anywhere in the SQL schema
• Access to HBase KV versions is not supported (yet)
  • Always returns the latest version by default
  • This is very similar to what a database user expects
• HBase columns that are not mapped are not visible on the SQL side
• Since row keys in HBase are unique, results may vary:
  • Inserting duplicate keys updates the row while the count of rows stays the same (see the sketch below)
  • INSERT OVERWRITE does not delete existing rows but rather updates them (HBase is mutable after all!)
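A quick illustration of the duplicate-key behaviour, reusing the hbase_table_1 mapping from earlier (the values are made up; the statements assume the Impala shell, which supports INSERT ... VALUES):

-- both statements target HBase row "row1"; the second write replaces the first
INSERT INTO hbase_table_1 VALUES ("row1", "first");
INSERT INTO hbase_table_1 VALUES ("row1", "second");

SELECT COUNT(*) FROM hbase_table_1 WHERE key = "row1";   -- 1, not 2
SELECT value FROM hbase_table_1 WHERE key = "row1";      -- "second"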

Page 31: HBase and Impala Notes - Munich HUG - 20131017

Query Considerations

Page 32: HBase and Impala Notes - Munich HUG - 20131017

HBase Table Scan

$ hbase shell
hbase(main):001:0> list
xyz
1 row(s) in 0.0530 seconds

hbase(main):002:0> describe "xyz"
DESCRIPTION                                                          ENABLED
 {NAME => 'xyz', FAMILIES => [{NAME => 'cf1', COMPRESSION => 'NONE', true
 VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536',
 IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
1 row(s) in 0.0220 seconds

hbase(main):003:0> scan "xyz"
ROW                          COLUMN+CELL
0 row(s) in 0.0060 seconds

The table was created, and it is still empty.

Page 33: HBase and Impala Notes - Munich HUG - 20131017

HBase Table Scan

Insert data from an existing table into the HBase-backed one:

INSERT OVERWRITE TABLE hbase_table_1
SELECT * FROM pokes WHERE foo=98;

Verify on the HBase side:

hbase(main):009:0> scan "xyz"
ROW                          COLUMN+CELL
 98                          column=cf1:val, timestamp=1267737987733, value=val_98
1 row(s) in 0.0110 seconds

Page 34: HBase and Impala Notes - Munich HUG - 20131017

Pro Tip: http://gethue.com/

Page 35: HBase and Impala Notes - Munich HUG - 20131017

HBase Scans under the Hood

Impala uses Scan instances under the hood, just as the native Java API does. This allows for all scan optimizations, e.g. predicate push-down, such as:

• Start and stop row
• Server-side filters
• Scanner caching (but not batching yet)

Page 36: HBase and Impala Notes - Munich HUG - 20131017

Configure HBase Scan Details

In impala-shell:

• Same as calling setCacheBlocks(true) or setCacheBlocks(false):

set hbase_cache_blocks=true;
set hbase_cache_blocks=false;

• Same as calling setCaching(rows):

set hbase_caching=1000;
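Putting it together, a session could look like the following sketch (the table and values are illustrative):

-- one-off analytical scan: keep it out of the block cache, fetch 1000 rows per RPC
set hbase_cache_blocks=false;
set hbase_caching=1000;

SELECT f1, f2 FROM mapped_table
WHERE key >= "user1234" AND key < "user1235";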

Page 37: HBase and Impala Notes - Munich HUG - 20131017

HBase Scans under the Hood

Back to physics: a scan can only perform well if as little data as possible is read.
• You need to issue queries that are known not to be full table scans
• This requires careful schema design!

Typical use cases are:
• OLAP cube: read report data from a single row
• Time series: read fine-grained, time-partitioned data

Page 38: HBase and Impala Notes - Munich HUG - 20131017

OLAP Example

• Facebook Insights uses HBase to keep an OLAP cube live, i.e. fully materialized
• Each row reflects one tracked page and contains all of its data points
  • All dimensions carry a time-bracket prefix, plus TTLs
• At report time only one or very few rows are read
• The design favors read over write performance
  • Could also think about a hybrid system: CEP + HBase + HDFS (Parquet)
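As a rough sketch of the read pattern (the table name and key layout are invented), a report then touches only the single row that holds that page's cube:

-- all pre-aggregated data points for one tracked page live in one HBase row
SELECT * FROM insights_cube WHERE key = "page:12345";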

Page 39: HBase and Impala Notes - Munich HUG - 20131017

Time Series Example

• OpenTSDB writes the metric events bucketed by metric ID and then timestamp
  • This helps use all servers in the cluster equally
• During reporting/dashboarding the data is read for specific metrics within a specific time frame
• Sorted data translates into effective use of Scan with start and stop rows (see the sketch below)
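For illustration only (the table and key layout are made up and are not OpenTSDB's actual schema), a row key of the form <metric>-<date> turns a time-bounded query into a start/stop-row scan:

-- reads only the rows for metric "cpu.load" on 2013-10-17
SELECT key, value FROM tsdb_mapped
WHERE key >= "cpu.load-20131017" AND key < "cpu.load-20131018";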

Page 40: HBase and Impala Notes - Munich HUG - 20131017

Final Notes

Since HBase scan performance is mainly influenced by the number of rows scanned, you need to issue queries that are selective, i.e. scan only certain rows and not the entire table.

This requires WHERE clauses with the HBase row key in them:

SELECT f1, f2, f3 FROM mapped_table
WHERE key >= "user1234" AND key < "user1235";

"Scan all rows for user 1234, i.e. those with a row key starting with user1234" (note: this might be a composite key!)

Page 41: HBase and Impala Notes - Munich HUG - 20131017

Example

Page 42: HBase and Impala Notes - Munich HUG - 20131017

Final Notes

Not using the primary HBase index, aka the row key, results in a full table scan and might take much longer when you have a large table:

SELECT f1, f2, f3 FROM mapped_table
WHERE f1 = "value1" OR f20 < "200";

This will result in a full table scan. Remember: it is all just physics!

Page 43: HBase and Impala Notes - Munich HUG - 20131017

Final Notes

Impala also uses HBase's SingleColumnValueFilter to reduce the amount of transferred data:
• Filters out entire rows by checking a given column value
• Does not skip rows, since no index or Bloom filter is available to help identify the next match

Overall this helps, yet it cannot do any magic (physics again!).
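For example, a non-key predicate of this shape can be handed to HBase as a SingleColumnValueFilter, so only matching rows are sent back to Impala, even though HBase still has to walk every row (names as used earlier in the deck):

SELECT key, f1, f2 FROM mapped_table
WHERE f1 = "value1";   -- evaluated server-side as a filter, not via an index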

Page 44: HBase and Impala Notes - Munich HUG - 20131017

Final Notes

Some advice on tall-narrow vs. flat-wide table layout: store data in a tall and narrow table, since there is currently no support for scanner batching (i.e. intra-row scanning). Mapping, for example, one million HBase columns into SQL is futile. This is still true for Hive's Map support, since the entire row has to fit into memory!

Page 45: HBase and Impala Notes - Munich HUG - 20131017

Outlook

Future work:
• Composite keys: map multiple SQL fields into a single composite HBase row key
• Expose KV versions to the SQL schema
• Better predicate pushdown
  • Advanced filters or indexes?

Page 46: HBase and Impala Notes - Munich HUG - 20131017

Questions?

@larsgeorge
[email protected]