Post on 13-Jul-2015
Presented at HBaseCon 2012
The AOL Mail System
Over 15 years old
Constantly evolving
10,000+ hosts
70+ Million mailboxes
50+ Billion emails
A technology stack that runs the gamut
What that means…
Lots of data
Lots of moving parts
Tight SLAs
Mature system + Young software = Tough marriage
We don’t buy “commodity” hardware
Engrained Dev/QA/Prod product lifecycle
Somewhat “version locked” to tried-and-true platforms
Expect service outages to be quickly mitigated by our NOC w/out waiting for an on-call
So where does HBase fit?
It’s a component, not the foundation
Currently used in two places
Being evaluated for more
It will remain a tool in our diverse Big Data arsenal
An “Activity Profiler”
Watches for particular behaviors
Designed and built in 6/2010
Originally “vanilla” Hadoop 0.20.2 + HBase 0.90.2
Currently CDH3
1.4+ Million Events/min
60x 24TB (raw) DataNodes w/ local RegionServers
15x application hosts
Is an internal-only tool
Used by automated anti-abuse systems
Leveraged by data analysts for ad hoc queries/MapRed
Why the “Event Catcher” layer?
Has to “speak the language” of our existing systems
Easy to plug an HBase translator into existing data feeds
Hard to modify the infrastructure to speak HBase
Flume was too young at the time
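To make the translator idea concrete, here is a minimal sketch in Python. It is not the code described in the talk: the pipe-delimited feed record (`account|epoch_ms|event_type|detail`), the row-key layout, and the `e:` column family are all hypothetical, and the result is a plain object standing in for an HBase put rather than a call into a real client library.

```python
from typing import NamedTuple

# Hypothetical constant: used to reverse millisecond timestamps so that
# the newest event for an account sorts first within its key range.
MAX_TS = 10**13


class Mutation(NamedTuple):
    row_key: bytes
    columns: dict  # "family:qualifier" -> bytes value


def translate(line: str) -> Mutation:
    """Turn one 'account|epoch_ms|event_type|detail' line into a put-like mutation."""
    # Split on the first three pipes only, so the detail field may contain pipes.
    account, ts, event_type, detail = line.rstrip("\n").split("|", 3)
    reversed_ts = MAX_TS - int(ts)
    row_key = f"{account}:{reversed_ts:013d}".encode("utf-8")
    return Mutation(row_key, {
        "e:type": event_type.encode("utf-8"),
        "e:detail": detail.encode("utf-8"),
    })
```

The point of the design is that the existing feeds stay untouched: only this thin layer needs to know anything about HBase keys and columns.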
Why batch load via MapRed?
Real time is not currently a requirement
Allows filtering at different points
Allows us to “trigger” events
Designed before coprocessors
Early data integrity issues necessitated “replaying”
Missing append support early on
Holes in the Meta table
Long splits and GC pauses caused client timeouts
Can sample data into a “sandbox” for job development
Makes Pig, Hive, and other MapRed easy and stable
We keep the raw data around as well
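One way to sample raw data into a “sandbox” is to keep a stable fraction of accounts rather than a random fraction of events, so that every event for a sampled account comes along. The sketch below assumes a hypothetical feed line whose first pipe-delimited field is the account; the function names and the 1-in-100 default are illustrative only and do not come from the talk.

```python
import hashlib


def in_sample(account: str, one_in: int = 100) -> bool:
    """Return True for roughly 1 in `one_in` accounts, stable across runs."""
    digest = hashlib.md5(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % one_in == 0


def sample_events(lines, one_in: int = 100):
    """Yield only the raw feed lines whose account falls in the sample."""
    for line in lines:
        account = line.split("|", 1)[0]
        if in_sample(account, one_in):
            yield line
```

Because the decision depends only on the account, replaying the same raw data produces the same sandbox, which keeps job development repeatable.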
HBase and MapRed can live in harmony
Bigger than “average” hardware
36+GB RAM
8+ cores
Proper system tuning is essential
Good information on tuning Hadoop is plentiful, but…
XFS > EXT
JBOD > RAID
As far as HBase is concerned…
Just go buy Lars’ book
Careful job development and optimization are key!
Contact History API
Services a member-facing API
Designed and built in 10/2010
Modeled after the previous application
Built by a different Engineering team
Used to solve a very different problem
250K+ Inserts/min
3+ Million Inserts/min during MapRed
20x 24TB (raw) DataNodes w/ local RegionServers
14x application hosts
Leverages Memcached to reduce query load on HBase
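A common way to put a cache in front of a store is a read-through lookup: check the cache, fall back to the store on a miss, then populate the cache. The sketch below shows that pattern under stated assumptions: `TTLCache` is a tiny in-process stand-in for Memcached, `fetch_from_hbase` is a hypothetical callable supplied by the caller, and the key prefix and 300-second expiry are invented for illustration. The talk does not describe how the real service uses Memcached.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for Memcached (get/set with expiry)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


def get_contact_history(account, cache, fetch_from_hbase, ttl=300):
    """Serve from the cache when possible; query the store only on a miss."""
    key = f"contacts:{account}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = fetch_from_hbase(account)
    cache.set(key, result, ttl)
    return result
```

Repeated requests for the same account within the expiry window never reach the store, which is where the reduction in query load comes from.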
Amusing mistakes to learn from
Exploding regions
Batch inserts via MapRed result in fast, symmetrical key space growth
Attempting to split every region at the same time is a bad idea
Turning off region splitting and using a custom “rolling region splitter” is a good idea
Take time and load into consideration when selecting regions to split
Backups, backups, backups!
You can never have too many
Large, non-splittable regions tell you things
Our key space maps to accounts
Excessively large keys equal excessively “active” accounts
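The “rolling region splitter” mentioned under exploding regions can be sketched as a selection step that runs once per pass. This is an illustration only, not the custom tool from the talk: the `Region` fields, the size and load thresholds, the quiet-hours window, and the batch size are all assumptions, and the actual split request (for example through the HBase shell's `split` command) is left to the caller.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    size_bytes: int
    requests_per_sec: float


def pick_regions_to_split(regions, hour, *, max_size=4 * 1024**3,
                          max_load=500.0, quiet_hours=range(1, 5), batch=2):
    """Pick a small batch of oversized, lightly loaded regions, largest first.

    Returns an empty list outside the quiet window, so no splits are
    started during peak traffic.
    """
    if hour not in quiet_hours:
        return []
    candidates = [r for r in regions
                  if r.size_bytes > max_size and r.requests_per_sec <= max_load]
    candidates.sort(key=lambda r: r.size_bytes, reverse=True)
    return candidates[:batch]
```

A driver would split the returned regions, wait for them to finish, and then run another pass, so that only a few regions are ever splitting at once even when batch loads make the whole key space grow in step.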