Secondary Indexing in Phoenix - Hadoop Summit 2013 - HBase BoF

Overview of the secondary indexing implementation coming soon in Phoenix (https://github.com/forcedotcom/phoenix)

Secondary Indexing in Phoenix

Jesse Yates
HBase Committer
Software Engineer

HBase BoF – June 25, 2013

Outline

• Motivation
• History
• HBase Consistent Indexing
  – Index Management
  – Recovery Mechanism
• Conclusion

A quick note…

Why do we need them?

• Sorted by key
  – Great for accessing on that key

What if we want to access by another dimension!?

Table Scan!

A short example

• Easy to search by name of food

• Hard to search on another dimension

Name     Type        Date Received  Manufacturer     Current Count
Apple    Macintosh   6/23/13        Good Farm Inc.   200
Turkey   Breast      6/23/13        Tasty Meat Co.   42
Chicken  Drumstick   6/18/13        Pretty Ok Food   3
Jam      Strawberry  6/18/10        Mash It Up Inc.  700

A short example

Primary table, sorted by Name:

Name     Type        Date Received  Manufacturer     Current Count
Apple    Macintosh   6/23/13        Good Farm Inc.   200
Turkey   Breast      6/23/13        Tasty Meat Co.   42
Chicken  Drumstick   6/18/13        Pretty Ok Food   3
Jam      Strawberry  6/18/10        Mash It Up Inc.  700

Index table, sorted by Date Received:

Date Received  Name     Type        Manufacturer     Current Count
6/18/13        Jam      Strawberry  Mash It Up Inc.  700
6/18/13        Chicken  Drumstick   Pretty Ok Food   3
6/23/13        Apple    Macintosh   Good Farm Inc.   200
6/23/13        Turkey   Breast      Tasty Meat Co.   42
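The difference between the two layouts can be modelled in a few lines of Java. This is a minimal sketch, not code from hbase-index or Phoenix: the class `FoodIndexSketch`, its field names, the `|` key separator, and the use of ISO date strings are all assumptions made for illustration. The primary table is a sorted map keyed by name, and the index is a second sorted map keyed by date plus name, so a lookup by date becomes a short range scan instead of a scan over every row. The sketch ignores updates to existing rows, which would require cleaning up stale index entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model; not HBase or Phoenix API.
class FoodIndexSketch {
    // Primary "table": sorted by row key (food name), like an HBase table.
    final TreeMap<String, String[]> primary = new TreeMap<>();
    // Index "table": sorted by "<date>|<name>"; the value is the primary row key.
    final TreeMap<String, String> byDate = new TreeMap<>();

    // Dates are ISO strings (yyyy-MM-dd) so that string order matches date order.
    void put(String name, String type, String isoDate, String maker, int count) {
        primary.put(name, new String[] {type, isoDate, maker, Integer.toString(count)});
        byDate.put(isoDate + "|" + name, name);
    }

    // Without an index: inspect every row of the primary table.
    List<String> scanByDate(String isoDate) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, String[]> row : primary.entrySet()) {
            if (row.getValue()[1].equals(isoDate)) {
                names.add(row.getKey());
            }
        }
        return names;
    }

    // With the index: a range scan over the keys that start with "<date>|".
    List<String> lookupByDate(String isoDate) {
        // '}' is the character right after '|', so it bounds the prefix range.
        return new ArrayList<>(byDate.subMap(isoDate + "|", isoDate + "}").values());
    }
}
```

Both methods return the same names; the indexed lookup only touches the entries for the requested date.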

HBase is “Special”…

• Partitioned Keys (“HRegion”)

• Scales because regions are independent

• Built-in data recovery mechanisms

Hasn’t someone tried this?

• Omid

• Percolator

• Culvert

• Lily

• TrendMicro

• Client-coordinated

We’ve gotten better…

• NGData
  – HBase-SEP
  – HBase-Indexer

• Intel
  – Lucene Full Text Indexing

Still missing some things

• In-HBase index storage
  – Just another table in HBase

• Simple consistency guarantees
  – If X fails, then Y

• Minimal overhead for covered indexes
  – Network roundtrips

Two Major Components

• Index Management
  – Builds index updates
  – Ensures index is ‘cleaned up’

• Recovery Mechanism
  – Ensures index updates are “ACID”

Index Management

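• Lives within a RegionCoprocessorObserver
• Access to the local HRegion
• Specifies the mutations to apply to the index tables

public interface IndexBuilder {
  // Receives the coprocessor environment of the region being indexed
  public void setup(RegionCoprocessorEnvironment env);
  // Index updates to make for a Put on the primary table
  public Map<Mutation, String> getIndexUpdate(Put put);
  // Index updates to make for a Delete on the primary table
  public Map<Mutation, String> getIndexUpdate(Delete delete);
}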
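To make the shape of a builder concrete, here is a minimal sketch that indexes a single column. It deliberately avoids the real HBase classes so that it compiles on its own: `SimplePut`, `ColumnValueIndexBuilder`, the `primaryRow` column, and the `|`-separated index row key are hypothetical stand-ins. It also assumes that the `String` in `Map<Mutation, String>` names the index table that should receive the mutation, which the slide does not state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for an HBase Put: one row key plus column -> value pairs.
class SimplePut {
    final String row;
    final Map<String, String> columns = new LinkedHashMap<>();

    SimplePut(String row) { this.row = row; }

    SimplePut add(String column, String value) {
        columns.put(column, value);
        return this;
    }
}

// Hypothetical builder: indexes one column of the primary table into one index table.
class ColumnValueIndexBuilder {
    private final String indexedColumn;
    private final String indexTable;

    ColumnValueIndexBuilder(String indexedColumn, String indexTable) {
        this.indexedColumn = indexedColumn;
        this.indexTable = indexTable;
    }

    // Mirrors getIndexUpdate(Put): each index mutation is mapped to its
    // (assumed) target table name.
    Map<SimplePut, String> getIndexUpdate(SimplePut put) {
        Map<SimplePut, String> updates = new LinkedHashMap<>();
        String value = put.columns.get(indexedColumn);
        if (value != null) {
            // Index row key = indexed value + primary row key, so index rows sort by value.
            SimplePut indexPut = new SimplePut(value + "|" + put.row).add("primaryRow", put.row);
            updates.put(indexPut, indexTable);
        }
        return updates;
    }
}
```

A put that does not touch the indexed column yields an empty map, so no index table is written for it.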

Key Observation #1

“We shouldn’t need to provide stronger guarantees than HBase - that is just asking for a bad time.”

- Jon Hsieh*

* Paraphrased

HBase ACID

• Does NOT give you:
  – Cross-row consistency
  – Cross-table consistency

• Does give you:
  – Durable data on success
  – Visibility on success without partial rows

Key Observation #2

“Secondary indexing is inherently an easier problem than full transactions… secondary index updates are idempotent.”

- Lars Hofhansl

Idempotent Index Updates

• Doesn’t need full transactions

• Replay as many times as needed

• Can tolerate a little lag
  – As long as we get the order right
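One way to see why replay is safe is to model a table whose cells are versioned by timestamp, as HBase cells are. The sketch below assumes that every index update is a put carrying an explicit timestamp; `VersionedTableSketch` and its methods are hypothetical names, not HBase API. Writing the same (row, timestamp, value) again changes nothing, and because reads pick the highest timestamp, the order in which replayed updates arrive does not change what a reader sees.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a timestamp-versioned table; not HBase API.
class VersionedTableSketch {
    // row key -> (timestamp -> value); the newest timestamp wins on read.
    final Map<String, TreeMap<Long, String>> rows = new TreeMap<>();

    // Applying the same (row, timestamp, value) twice leaves the state unchanged.
    void put(String row, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(timestamp, value);
    }

    // Returns the value with the highest timestamp, or null if the row is absent.
    String get(String row) {
        TreeMap<Long, String> versions = rows.get(row);
        return versions == null ? null : versions.lastEntry().getValue();
    }
}
```

Replaying a batch of updates any number of times, in any order, leaves `rows` equal to the state after applying each update once.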

Taking a little ACID…

Durable Indexing: Standard Write Path

[Diagram. Components shown: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]

Durable Indexing

[Diagram. Components shown: RegionCoprocessorHost, WAL, RegionCoprocessorHost, Indexer, IndexBuilder, WAL Updater (labeled “Durable!”), Indexer, Index Tables]

Failure Situations

• Before writing the WAL
  – Nothing is durable, nothing is visible

Durable Indexing

[Diagram. Components shown: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore, Indexer, Index Tables]

Failure Situations

• Before writing the WAL
  – Nothing is durable, nothing is visible

• After writing WAL, before index update
  – WAL Replay updates the index table and the primary table

• Mid-index update
  – WAL Replay finishes index update, primary table update

• After index updates, before primary
  – WAL Replay restores primary state, idempotently applies index updates
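The four failure cases above can be exercised with a small single-process simulation. This is only a sketch under stated assumptions: `DurableIndexSketch`, its `Crash` points, and the two index maps are hypothetical and do not use HBase classes; each row is written once, so stale index entries never need cleaning up; and a crash is modelled as losing the in-memory primary state while the WAL and the index tables survive. The write order (WAL, then index tables, then primary) follows the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process model of the indexed write path; not HBase API.
class DurableIndexSketch {
    // Where the simulated server dies during a write.
    enum Crash { NONE, BEFORE_WAL, AFTER_WAL, MID_INDEX, AFTER_INDEX }

    // One WAL entry holds everything needed to rebuild the primary and index edits.
    static final class WalEntry {
        final String row, date, maker;
        WalEntry(String row, String date, String maker) {
            this.row = row; this.date = date; this.maker = maker;
        }
    }

    final List<WalEntry> wal = new ArrayList<>();          // durable log
    final Map<String, String[]> primary = new TreeMap<>(); // in-memory, lost on crash
    final Map<String, String> byDate = new TreeMap<>();    // index table 1
    final Map<String, String> byMaker = new TreeMap<>();   // index table 2

    // Returns false when the simulated crash interrupts the write.
    boolean write(String row, String date, String maker, Crash crash) {
        if (crash == Crash.BEFORE_WAL) return false;   // nothing durable, nothing visible
        wal.add(new WalEntry(row, date, maker));
        if (crash == Crash.AFTER_WAL) return false;    // durable, but no index written yet
        byDate.put(date + "|" + row, row);
        if (crash == Crash.MID_INDEX) return false;    // only one index table updated
        byMaker.put(maker + "|" + row, row);
        if (crash == Crash.AFTER_INDEX) return false;  // indexes done, primary missing
        primary.put(row, new String[] {date, maker});
        return true;
    }

    // Recovery: the in-memory primary state is gone, so replay the whole WAL.
    // Every put is idempotent, so re-applying finished index updates is harmless.
    void recover() {
        primary.clear();
        for (WalEntry e : wal) {
            byDate.put(e.date + "|" + e.row, e.row);
            byMaker.put(e.maker + "|" + e.row, e.row);
            primary.put(e.row, new String[] {e.date, e.maker});
        }
    }

    // True when every primary row has its entry in each index and no index has extras.
    boolean consistent() {
        if (byDate.size() != primary.size() || byMaker.size() != primary.size()) return false;
        for (Map.Entry<String, String[]> e : primary.entrySet()) {
            String row = e.getKey();
            if (!row.equals(byDate.get(e.getValue()[0] + "|" + row))) return false;
            if (!row.equals(byMaker.get(e.getValue()[1] + "|" + row))) return false;
        }
        return true;
    }
}
```

Whichever crash point is chosen, calling `recover()` leaves the primary and both indexes in agreement: a write that never reached the WAL is absent everywhere, and a write that reached the WAL is present everywhere.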

Special Note: Failed Index Updates

• Index is corrupted
  – Index table does not exist
  – Index table does not have the right schema
  – Etc.

• Fail-fast behavior
  – Kill the whole server
  – Forces WAL Replay to enforce correctness
  – Modular enough to support alternative schemes
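The "modular" point can be pictured as a pluggable policy that is called when an index write fails. The sketch below is an assumption about what such a hook might look like, not the hbase-index API: `FailurePolicySketch`, `AbortServerPolicy`, and `RecordFailurePolicy` are hypothetical names, the server abort is represented by a plain `Runnable`, and the recording policy is an invented example of an alternative scheme.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical policy hook; names are illustrative, not the hbase-index API.
interface FailurePolicySketch {
    void onIndexWriteFailure(String indexTable, Exception cause);
}

// Fail-fast: stop the server so that WAL replay re-applies the index updates later.
class AbortServerPolicy implements FailurePolicySketch {
    private final Runnable abortServer;

    AbortServerPolicy(Runnable abortServer) { this.abortServer = abortServer; }

    @Override
    public void onIndexWriteFailure(String indexTable, Exception cause) {
        abortServer.run();
    }
}

// An invented alternative: keep serving and only record which index tables failed.
class RecordFailurePolicy implements FailurePolicySketch {
    final List<String> failedTables = new ArrayList<>();

    @Override
    public void onIndexWriteFailure(String indexTable, Exception cause) {
        failedTables.add(indexTable);
    }
}
```

Swapping the policy changes what happens on failure without touching the code that builds or writes the index updates.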

Key Points

• Custom KeyValues to enable index durability in primary table WAL

• Custom WALEdit Codec for index update with WAL Replay

• Will see index updates before primary
  – Only a little bit of lag and never ‘wrong’
  – Matches HBase consistency

• Fail-fast behavior to enforce correctness

Upcoming Work

• Performance testing

• Standard covered index managers

• Index cleanup on compaction

Conclusion

• Fully transparent to client

• Easy to build custom index maintenance

• Meets current HBase consistency guarantees

• Supports HBase 0.94.9+
  – Coming to 0.96/0.98 soon!

hbase-index

https://github.com/forcedotcom/phoenix/tree/master/contrib/hbase-index

Detailed Blog Post

http://jyates.github.io/2013/06/11/hbase-consistent-secondary-indexing.html

Bonus!

• Usable as a standalone module

• Coming to Phoenix*
  – Built-in support

• Future: added to HBase core (?)

* https://github.com/forcedotcom/phoenix

Thanks! Questions?

@jesse_yates

jesse.k.yates@gmail.com