20121022 tm hbasecanarytool

2012/10/22 Scott Miao HBase Canary Tool



Another way to monitor HBase processes

• org.apache.hadoop.hbase.tool.Canary
  – Used for "canary monitoring" of a running HBase cluster.
  – For each region, it tries to get one row per column family and outputs information about failures or latency.

https://issues.apache.org/jira/browse/HBASE-4393

Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table 1 [table 2...]]
 where [opts] are:
   -help          Show this help and exit.
   -daemon        Continuous check at defined intervals.   # 6 sec
   -interval <N>  Interval between checks (sec)            # specify how many seconds you want
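To make the -daemon and -interval options concrete, here is a minimal sketch of a "check, sleep, check again" loop. It is not the Canary source: the class name IntervalLoopSketch, the method run, and the maxRounds bound (added only so the example terminates) are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a daemon-style check loop; not the actual Canary code.
final class IntervalLoopSketch {

    /**
     * Runs the check once, or repeatedly when daemon is true.
     * maxRounds is an illustrative bound; a real daemon would loop until stopped.
     * Returns the number of checks that were executed.
     */
    static int run(Runnable check, boolean daemon, long intervalMillis, int maxRounds) {
        int rounds = 0;
        while (true) {
            check.run();
            rounds++;
            if (!daemon || rounds >= maxRounds) {
                return rounds;
            }
            try {
                Thread.sleep(intervalMillis); // wait before the next round
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop
                return rounds;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger checks = new AtomicInteger();
        int rounds = run(checks::incrementAndGet, true, 10, 3);
        System.out.println("rounds=" + rounds + " checks=" + checks.get());
    }
}
```

Without the daemon flag the same method performs exactly one check, which mirrors running the tool once from cron or by hand.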


Canary Tool study

// Probe every table known to the cluster.
private void sniff() throws Exception {
  for (HTableDescriptor table : admin.listTables()) {
    sniff(table);
  }
}

// Probe every region of one table.
private void sniff(HTableDescriptor tableDesc) throws Exception {
  HTable table = null;

  try {
    table = new HTable(admin.getConfiguration(), tableDesc.getName());
  } catch (TableNotFoundException e) {
    // The table no longer exists; skip it.
    return;
  }

  for (HRegionInfo region : admin.getTableRegions(tableDesc.getName())) {
    try {
      sniffRegion(region, table);
    } catch (Exception e) {
      // A failure in one region is reported and does not stop the others.
      sink.publishReadFailure(region);
    }
  }
}


Canary Tool study

// Read one row (the region's start key) from each column family and time it.
private void sniffRegion(HRegionInfo region, HTable table) throws Exception {
  HTableDescriptor tableDesc = table.getTableDescriptor();
  for (HColumnDescriptor column : tableDesc.getColumnFamilies()) {
    Get get = new Get(region.getStartKey());
    get.addFamily(column.getName());

    try {
      long startTime = System.currentTimeMillis();
      table.get(get);
      long time = System.currentTimeMillis() - startTime;

      sink.publishReadTiming(region, column, time);
    } catch (Exception e) {
      sink.publishReadFailure(region, column);
    }
  }
}
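The timing-and-isolation pattern in sniffRegion can be tried without an HBase cluster. The sketch below is a simplified model under stated assumptions: plain strings stand in for HRegionInfo and HColumnDescriptor, the nested Sink interface is a cut-down stand-in for Canary.Sink, and the probe method and TimedProbeSketch class are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical, HBase-free model of the "time one read, report the result" step.
final class TimedProbeSketch {

    /** Simplified stand-in for Canary.Sink; targets are plain strings here. */
    interface Sink {
        void publishReadFailure(String target);
        void publishReadTiming(String target, long msTime);
    }

    /** Times one read; an exception is reported for this target only and never propagates. */
    static void probe(String target, Callable<?> read, Sink sink) {
        try {
            long startTime = System.currentTimeMillis();
            read.call();
            long time = System.currentTimeMillis() - startTime;
            sink.publishReadTiming(target, time);
        } catch (Exception e) {
            sink.publishReadFailure(target);
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        Sink sink = new Sink() {
            public void publishReadFailure(String target) { events.add("FAIL " + target); }
            public void publishReadTiming(String target, long msTime) { events.add("OK " + target); }
        };
        probe("region-a/cf1", () -> "row", sink);
        probe("region-b/cf1", () -> { throw new IllegalStateException("simulated outage"); }, sink);
        probe("region-c/cf1", () -> "row", sink);
        // prints [OK region-a/cf1, FAIL region-b/cf1, OK region-c/cf1]
        System.out.println(events);
    }
}
```

The point of the design is that the failing target in the middle does not prevent the third probe from running, which is what lets one unhealthy region show up in the output without hiding the rest.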


Canary Tool study

public interface Sink {
  public void publishReadFailure(HRegionInfo region);
  public void publishReadFailure(HRegionInfo region, HColumnDescriptor column);
  public void publishReadTiming(HRegionInfo region, HColumnDescriptor column, long msTime);
}

public static class StdOutSink implements Sink {
  public void publishReadFailure(HRegionInfo region) {
    LOG.error(String.format("read from region %s failed",
        region.getRegionNameAsString()));
  }

  public void publishReadFailure(HRegionInfo region, HColumnDescriptor column) {
    LOG.error(String.format("read from region %s column family %s failed",
        region.getRegionNameAsString(), column.getNameAsString()));
  }

  public void publishReadTiming(HRegionInfo region, HColumnDescriptor column, long msTime) {
    LOG.info(String.format("read from region %s column family %s in %dms",
        region.getRegionNameAsString(), column.getNameAsString(), msTime));
  }
}


Canary Tool study

// constructors
public Canary() {
  this(new StdOutSink());
}

public Canary(Sink sink) {
  this.sink = sink;
}
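The two constructors mean the output destination can be swapped without touching the probing logic. The following sketch illustrates that idea with stand-in types only: Monitor plays the role of Canary, the one-method Sink is a reduced version of the real interface, and SlowReadSink with its threshold is a hypothetical alternative sink, not something from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of sink injection; none of these types are HBase classes.
final class SinkInjectionSketch {

    /** Reduced stand-in for Canary.Sink. */
    interface Sink {
        void publishReadTiming(String region, long msTime);
    }

    /** Default sink, analogous in spirit to StdOutSink. */
    static final class StdOutSink implements Sink {
        public void publishReadTiming(String region, long msTime) {
            System.out.println(String.format("read from region %s in %dms", region, msTime));
        }
    }

    /** Alternative sink that remembers only reads slower than a threshold. */
    static final class SlowReadSink implements Sink {
        final long thresholdMs;
        final List<String> slowRegions = new ArrayList<>();

        SlowReadSink(long thresholdMs) { this.thresholdMs = thresholdMs; }

        public void publishReadTiming(String region, long msTime) {
            if (msTime > thresholdMs) {
                slowRegions.add(region);
            }
        }
    }

    /** Stand-in for Canary, with the same two-constructor shape. */
    static final class Monitor {
        private final Sink sink;

        Monitor() { this(new StdOutSink()); }
        Monitor(Sink sink) { this.sink = sink; }

        void report(String region, long msTime) { sink.publishReadTiming(region, msTime); }
    }

    public static void main(String[] args) {
        new Monitor().report("region-a", 12); // default sink prints to stdout

        SlowReadSink slow = new SlowReadSink(100);
        Monitor monitor = new Monitor(slow);
        monitor.report("region-a", 12);
        monitor.report("region-b", 450);
        System.out.println("slow regions: " + slow.slowRegions); // prints slow regions: [region-b]
    }
}
```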


Canary Tool in Circus

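[Diagram: monitoring workflow]

1. Start here: the Canary-tool runs on the Tm-puppet operation server.
2. It writes hbase-canary.log to /var/log/hbase/.
3. The Nagios Server reads from hbase-canary.log.
4. Nagios sends mail if anything is abnormal.
5. Fix the problem.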


Canary Tool in Circus

private static class CustomSink implements Canary.Sink {
  public void publishReadFailure(HRegionInfo regionInfo) {
    //...
    LOG.error(String.format("Read from table:%s, region:%s failed",
        tableName, regionName));
  }

  public void publishReadFailure(HRegionInfo regionInfo, HColumnDescriptor colDescriptor) {
    //...
    LOG.error(String.format("Read from table:%s, region:%s, columnFamily:%s failed",
        tableName, regionName, colFamilyName));
  }

  public void publishReadTiming(HRegionInfo regionInfo, HColumnDescriptor colDescriptor, long msTime) {
    //...
    LOG.info(String.format("Read from table:%s, region:%s, columnFamily:%s in %dms",
        tableName, regionName, colFamilyName, msTime));
  }
}

com.trendmicro.spn.ops.hbase.RunCanaryTool
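Because CustomSink writes a fixed message format, a log checker can classify each line mechanically. The sketch below shows one possible way to do that; it is not the check used in the deck. The class CanaryLogCheckSketch, the status method, the warnMs latency threshold, and the sample region names are assumptions. The return values 0, 1, and 2 follow the usual Nagios plugin convention for OK, WARNING, and CRITICAL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log check for lines written in the CustomSink message format.
final class CanaryLogCheckSketch {

    // Conventional Nagios plugin return codes.
    static final int OK = 0;
    static final int WARNING = 1;
    static final int CRITICAL = 2;

    private static final Pattern TIMING =
        Pattern.compile("Read from table:.*, region:.*, columnFamily:.* in (\\d+)ms");
    private static final Pattern FAILURE =
        Pattern.compile("Read from table:.* failed");

    /** Any failed read is CRITICAL; a read slower than warnMs (assumed threshold) is WARNING. */
    static int status(List<String> lines, long warnMs) {
        int status = OK;
        for (String line : lines) {
            if (FAILURE.matcher(line).find()) {
                return CRITICAL;
            }
            Matcher m = TIMING.matcher(line);
            if (m.find() && Long.parseLong(m.group(1)) > warnMs) {
                status = WARNING;
            }
        }
        return status;
    }

    public static void main(String[] args) {
        List<String> healthy = Arrays.asList(
            "INFO Read from table:t1, region:t1,,1.abc., columnFamily:cf in 12ms");
        List<String> slow = Arrays.asList(
            "INFO Read from table:t1, region:t1,,1.abc., columnFamily:cf in 900ms");
        List<String> broken = Arrays.asList(
            "INFO Read from table:t1, region:t1,,1.abc., columnFamily:cf in 12ms",
            "ERROR Read from table:t1, region:t1,k,2.def., columnFamily:cf failed");
        System.out.println(status(healthy, 500)); // prints 0
        System.out.println(status(slow, 500));    // prints 1
        System.out.println(status(broken, 500));  // prints 2
    }
}
```

Using find() rather than a full-line match lets the check ignore whatever timestamp or level prefix the logging layout puts in front of the message.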


Canary Tool in Circus

com.trendmicro.spn.ops.hbase.RunCanaryTool

public static void main(String[] args) throws Exception {
  Canary canary = new Canary(new CustomSink());
  int exitCode = ToolRunner.run(canary, args);
  System.exit(exitCode);
}

hbase-canary-monitor.sh

su - hbase <<EOF
kinit -kt /etc/hbase/conf/hbase.keytab hbase/$(hostname -f)
java -cp $CLASSPATH com.trendmicro.spn.ops.hbase.RunCanaryTool $@
EOF