Post on 16-Jan-2016
MAHADEV KONAR
Apache ZooKeeper
What is ZooKeeper?
A highly available, scalable, distributed coordination kernel
Use Cases
» Leader Election
» Group Membership
» Work Queues
» Event Notifications/workflow management
» Configuration Management
» Cluster Management
» Sharding
What is ZooKeeper again?
File API without partial reads/writes
No renames
Ordered updates and strong persistence guarantees
Conditional updates (version)
Watches for data changes
Ephemeral znodes
Generated file names
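Ephemeral znodes and generated file names are the two features the leader election use case usually builds on. The following is a minimal in-memory sketch in Python, not the ZooKeeper client: the `ElectionDir` class, its method names, and the `n_` name prefix are all hypothetical. It assumes the common recipe in which every candidate creates an ephemeral node with a generated, ordered name and the holder of the lowest name leads.

```python
class ElectionDir:
    """Toy stand-in for one znode directory used for an election (hypothetical)."""

    def __init__(self):
        self._next_seq = 0   # counter behind the generated names
        self._owner = {}     # node name -> id of the session that created it

    def join(self, session_id):
        """Create an ephemeral node with a generated, ordered name."""
        name = "n_%010d" % self._next_seq
        self._next_seq += 1
        self._owner[name] = session_id
        return name

    def session_closed(self, session_id):
        """Ephemeral nodes vanish when the session that created them ends."""
        for name in [n for n, s in self._owner.items() if s == session_id]:
            del self._owner[name]

    def leader(self):
        """The candidate holding the lowest generated name leads."""
        if not self._owner:
            return None
        return self._owner[min(self._owner)]


election = ElectionDir()
election.join("session-a")
election.join("session-b")
election.join("session-c")
first_leader = election.leader()       # session-a joined first
election.session_closed("session-a")   # the leader's session goes away
second_leader = election.leader()      # the next lowest name takes over
```

The names are zero-padded so that plain string ordering matches creation order, which is what lets `min` pick the earliest candidate.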
Data Model
Hierarchical namespace
Each znode has data and children
data is read and written in its entirety
[Figure: example znode tree rooted at / with nodes apps, users, locks, servers, app1, read-1, master, regionserver]
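The data model can be pictured as a tree in which every node carries both a payload and children. A minimal sketch in Python under that assumption; the `Znode` class and `resolve` helper are hypothetical names, and the `/apps/app1` path is chosen only for illustration.

```python
class Znode:
    """A node holds a data payload and named children (hypothetical model)."""

    def __init__(self, data=b""):
        self.data = data        # always read and replaced as a whole
        self.children = {}      # child name -> Znode


def resolve(root, path):
    """Walk a slash-separated absolute path down from the root."""
    node = root
    for part in path.strip("/").split("/"):
        if part:
            node = node.children[part]
    return node


root = Znode()
root.children["apps"] = Znode(b"app registry")
root.children["apps"].children["app1"] = Znode(b"v1 config")

node = resolve(root, "/apps/app1")
node.data = b"v2 config"    # a write replaces the entire payload
```

Unlike a file, there is no offset or append here: a reader gets the whole payload and a writer replaces it.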
ZooKeeper API
String create(path, data, acl, flags)
void delete(path, expectedVersion)
Stat setData(path, data, expectedVersion)
(data, Stat) getData(path, watch)
Stat exists(path, watch)
String[] getChildren(path, watch)
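The `expectedVersion` argument is what makes updates conditional: a write only succeeds if the znode is still at the version the caller last saw. Below is a small in-memory sketch in Python of that check, not the real client. `TinyStore` and `BadVersion` are hypothetical names, ACLs, flags, and watches are left out, and the convention that `-1` matches any version is the one assumption carried over from ZooKeeper.

```python
class BadVersion(Exception):
    """Raised when the caller's expected version is stale (hypothetical name)."""


class TinyStore:
    def __init__(self):
        self._nodes = {}    # path -> (data, version)

    def create(self, path, data):
        if path in self._nodes:
            raise KeyError("node exists: " + path)
        self._nodes[path] = (data, 0)
        return path

    def get_data(self, path):
        return self._nodes[path]    # (data, version)

    def _check(self, path, expected_version):
        _, version = self._nodes[path]
        if expected_version != -1 and expected_version != version:
            raise BadVersion(path)
        return version

    def set_data(self, path, data, expected_version):
        version = self._check(path, expected_version)
        self._nodes[path] = (data, version + 1)
        return version + 1

    def delete(self, path, expected_version):
        self._check(path, expected_version)
        del self._nodes[path]


store = TinyStore()
store.create("/config", b"a")
data, version = store.get_data("/config")     # version is 0
store.set_data("/config", b"b", version)      # accepted, version becomes 1
try:
    store.set_data("/config", b"c", version)  # still passing 0: stale
    stale_write_rejected = False
except BadVersion:
    stale_write_rejected = True
```

Two clients that both read version 0 cannot both win: the second writer is rejected and has to re-read before retrying.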
ZooKeeper Service
All servers store a copy of the data (in memory)
A leader is elected at startup
Followers service clients, all updates go through leader
Update responses are sent when a majority of servers have persisted the change
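The majority rule has simple arithmetic behind it. A short sketch in Python; the function names are hypothetical and it assumes a plain majority over the number of servers.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a majority."""
    return ensemble_size // 2 + 1


def is_committed(acks, ensemble_size):
    """An update can be answered once a majority has persisted it."""
    return acks >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size):
    """Servers that can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


five_node_quorum = quorum_size(5)          # 3
five_node_slack = tolerated_failures(5)    # 2
four_node_slack = tolerated_failures(4)    # 1
```

A four-server ensemble tolerates no more failures than a three-server one, which is why odd ensemble sizes are the usual choice.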
ZooKeeper Service
[Diagram: an ensemble of servers, one of them marked Leader, with clients connected to the servers]
ZooKeeper and HBase
Master Failover
Region Servers and Master discovery via ZooKeeper
HBase clients connect to ZooKeeper to find configuration data
Region Servers and Master failure detection
HBase and ZooKeeper as of now!
/
root-region-server
rs
master
• Master
  • If more than one master, they fight
• Root Region Server
  • This znode holds the location of the server hosting the root of all tables in HBase
• rs
  • A directory in which there is a znode per HBase region server
  • Region Servers register themselves with ZooKeeper when they come online
  • On Region Server failure (detected via ephemeral znodes and notification via ZooKeeper), the master splits the edits out per region
shutdown
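The rs directory works as a group membership list: each region server holds an ephemeral znode, and the master watches the children. Below is an in-memory sketch in Python of that interaction. `MembershipDir`, the `rs-1`/`rs-2` names, and the session ids are hypothetical; the sketch assumes one-shot watches that have to be set again after each notification.

```python
class MembershipDir:
    """Toy model of a directory of ephemeral znodes (hypothetical)."""

    def __init__(self):
        self._members = {}    # znode name -> owning session id
        self._watchers = []   # one-shot callbacks for child changes

    def register(self, name, session_id):
        self._members[name] = session_id
        self._fire()

    def get_children(self, watcher=None):
        if watcher is not None:
            self._watchers.append(watcher)
        return sorted(self._members)

    def expire_session(self, session_id):
        gone = [n for n, s in self._members.items() if s == session_id]
        for name in gone:
            del self._members[name]
        if gone:
            self._fire()

    def _fire(self):
        # Watches are one-shot: clear the list before notifying.
        watchers, self._watchers = self._watchers, []
        for callback in watchers:
            callback()


rs_dir = MembershipDir()
rs_dir.register("rs-1", "session-1")
rs_dir.register("rs-2", "session-2")

state = {"known": set()}
failed = []   # servers whose edits the master would have to split


def on_change():
    # Re-read the children and set the watch again in the same call.
    current = set(rs_dir.get_children(watcher=on_change))
    failed.extend(sorted(state["known"] - current))
    state["known"] = current


on_change()                          # initial read, sets the first watch
rs_dir.expire_session("session-2")   # rs-2 stops talking to ZooKeeper
```

After the expiry, `failed` contains `rs-2`, which is the point at which the master in the slide would start splitting that server's edits.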
Common Problems/Error Cases
Garbage Collection at the Region Servers
  Causes ZooKeeper clients to stall
  Session expiry
Low throughput and connection loss
  Mostly due to under-provisioned ZooKeeper instances
  Disk and Memory usage
  Bad Usage example: NameNode, RegionServer, JobTracker, ZooKeeper running on the same node
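The link between a garbage collection stall and session expiry can be sketched as a check on the gaps between client heartbeats. This is a simplified model in Python, not how the server tracks sessions internally; the function name and all timing values are made up for illustration.

```python
def session_survives(heartbeat_times_ms, session_timeout_ms):
    """True if no gap between consecutive heartbeats exceeds the timeout."""
    gaps = [later - earlier
            for earlier, later in zip(heartbeat_times_ms, heartbeat_times_ms[1:])]
    return all(gap <= session_timeout_ms for gap in gaps)


healthy = [0, 2000, 4000, 6000]
with_gc_pause = [0, 2000, 47000, 49000]   # hypothetical 45 s stop-the-world pause

ok_without_pause = session_survives(healthy, 40000)
ok_with_pause = session_survives(with_gc_pause, 40000)
ok_with_longer_timeout = session_survives(with_gc_pause, 60000)
```

In this model the paused client loses its session at a 40 s timeout and keeps it at 60 s, at the cost of slower failure detection.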
Release 3.3.0, what's in it for HBase?
Allow configuration of session timeout min/max bounds
HBase needs large session timeouts
Improved logging information to detect issues
Improved debugging tools
Improved documentation
Improved performance and robustness
Queue implementation available
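The session timeout bounds matter because the timeout a client asks for is not necessarily the one it gets. A minimal sketch in Python of the clamping, assuming the server defaults of 2 and 20 times `tickTime` when no explicit bounds are configured; `negotiate_timeout` and its parameter names are hypothetical, and the 2000 ms tick is only an example value.

```python
def negotiate_timeout(requested_ms, tick_time_ms=2000, min_ms=None, max_ms=None):
    """Clamp the client's requested session timeout to the server's bounds."""
    if min_ms is None:
        min_ms = 2 * tick_time_ms     # assumed default lower bound
    if max_ms is None:
        max_ms = 20 * tick_time_ms    # assumed default upper bound
    return max(min_ms, min(requested_ms, max_ms))


default_bounds = negotiate_timeout(60000)                # capped at 40000
raised_max = negotiate_timeout(60000, max_ms=120000)     # 60000 is honoured
too_small = negotiate_timeout(1000)                      # raised to 4000
```

With the defaults, an HBase client asking for 60 s would be cut to 40 s; raising the upper bound is what lets the larger timeout through.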
Upcoming 3.4 release
No Connectionloss
Use Netty - allow encryption
Testing
  Mockito
  More backwards compatibility testing
More ZooKeeper in HBase?
Table Schema and state in ZooKeeper read only, online
Region Server state transitions via ZooKeeper
Store region assignment in ZooKeeper for each Region Server
http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases
Questions?