HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower
My life with HBase
Drawn to Scale, Opower, Cloudera, Factset
About Opower
Opower is a customer engagement platform for the utility industry
About Opower
Home energy reports
Customized utility bills
Energy efficiency programs for utilities
About Opower
Opower runs on analytics
Analytics run on Hadoop + HBase
Opower analysis relies on data from a variety of sources
» Electric Utility Usage Data
» Gas Utility Usage Data
» Thermostat data
» Weather data
[Diagram: Data Storage & Processing, Disaggregation Algorithms, Shared Energy Signature Repository, OPOWER Platform]
Opower’s first architecture could not support their analytic vision
MySQL
Scalability?
Performance? Data integration?
Opower’s first architecture could not support their analytic vision
Analytic workflow instead of analytic apps:
SQL -> CSV -> R -> too little, too slow
Problem #1 Data Lake Cost
Usage AMI, Regional AMI, Sensor Data, Data Lake
Problem #2 Slower and slower queries
Smart-grid-scale data
Lots of supporting data: weather, demographics, etc.
Problem #3 It was taking lots of “magic”
Intense analytics
Strange schemas
Segmented queries
Hadoop + HBase at Opower
Opower determined that they needed an entirely new data architecture
NexGen Architecture @ Opower
Hadoop + HBase at Opower
Early success: HBase AMI
What rocked
Endless, cheap scalability
What rocked
The analytics team loved it!
What sucked
Hard on the ops team – still trying to grok it
What sucked
NoSchema p1.
Creating Schema
Managing MetaData
Schema <=> Performance
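To make "Schema <=> Performance" concrete: in HBase, rows are stored sorted by row key, so the key layout largely decides which reads are cheap. The sketch below is only an illustration under assumptions, not Opower's actual schema: the names `row_key` and `prefix_scan`, the 16-byte meter id width, and the reversed-timestamp layout are all hypothetical, and a plain Python dict stands in for a table.

```python
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit value, used to reverse timestamps


def _meter_prefix(meter_id: str) -> bytes:
    """Pad (or truncate) the meter id to a fixed 16 bytes so prefixes never overlap."""
    return meter_id.encode("utf-8").ljust(16, b"\x00")[:16]


def row_key(meter_id: str, epoch_seconds: int) -> bytes:
    """Hypothetical composite key: fixed-width meter id + reversed timestamp.

    Big-endian packing of a non-negative integer sorts bytewise in numeric
    order, so subtracting from MAX_TS makes the newest reading sort first.
    """
    return _meter_prefix(meter_id) + struct.pack(">q", MAX_TS - epoch_seconds)


def prefix_scan(table: dict, meter_id: str) -> list:
    """Stand-in for a sorted range scan: all rows for one meter, in key order."""
    prefix = _meter_prefix(meter_id)
    return [(k, v) for k, v in sorted(table.items()) if k.startswith(prefix)]
```

With this layout, "latest readings for one meter" is a short contiguous scan, while "all meters at a given hour" touches every key range. That trade-off is the schema/performance coupling the slide points at.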
What sucked
HA
Failover
Snapshots
What sucked
No secondary index
Aggregation is slow (Rollup/OLAP)
Poor Client Performance
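One common workaround for the missing secondary index and slow aggregation is for the application to maintain its own index and rollup entries at write time. The sketch below shows the idea with in-memory dicts standing in for tables; the class `MeterStore`, its fields, and the region/day rollup are assumptions for illustration, not something described in the talk.

```python
from collections import defaultdict


class MeterStore:
    """Toy model of application-maintained index and rollup 'tables'."""

    def __init__(self):
        self.reads = {}                          # primary: (meter_id, day) -> kWh
        self.by_region = defaultdict(set)        # hand-built index: region -> meter ids
        self.daily_rollup = defaultdict(float)   # pre-aggregate: (region, day) -> total kWh

    def put(self, meter_id: str, region: str, day: str, kwh: float) -> None:
        key = (meter_id, day)
        previous = self.reads.get(key, 0.0)
        self.reads[key] = kwh
        # The writer, not the store, keeps the index and rollup in sync.
        self.by_region[region].add(meter_id)
        # Apply only the delta so an overwritten reading is not counted twice.
        self.daily_rollup[(region, day)] += kwh - previous
```

In a real cluster these three writes would land in different rows or tables and would not be atomic together, so the bookkeeping and failure handling fall on the developer, which is part of the pain listed above.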
It would be better if only …
Developers were not forced to know how the data is stored, indexed, etc.
It would be better if only …
There were nicer APIs and better query languages (SQL?)
It would be better if only …
Version migrations were easy
Hierarchical Tables
It would be better if only …
Real-time tuning
It would be better if only …
Did I mention HA?
In summary
HBase has helped Opower achieve their analytic vision
But they’ve still got a long way to go
HBase still has a long way to go