IBM Streams V4.1 and Incremental Checkpointing


Transcript of IBM Streams V4.1 and Incremental Checkpointing

Page 1: IBM Streams V4.1 and Incremental Checkpointing

© 2015 IBM Corporation

Incremental Checkpointing

IBM Streams Version 4.1

Fang Zheng

Streams Development

Page 2: IBM Streams V4.1 and Incremental Checkpointing

Agenda

Introduction to Incremental Checkpointing

How It Works

How to Use It

Case Study: VWAP Application

Page 3: IBM Streams V4.1 and Incremental Checkpointing

Important Disclaimer

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.

WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.

IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.

IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:

• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR

• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.

IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

Page 4: IBM Streams V4.1 and Incremental Checkpointing

Introduction to Incremental Checkpointing

Since Streams V4, an operator can implement the StateHandler interface to checkpoint and reset its state

• StateHandler::checkpoint(Checkpoint & ckpt) is called periodically by the Streams Runtime to checkpoint operator state

• StateHandler::reset(Checkpoint & ckpt) is called upon operator restart to restore operator state

By convention, each checkpoint contains the full operator state

When operator state is large, the time and space costs of checkpointing can be high

This can in turn degrade tuple processing performance

Large checkpoints can also cause resource contention on the checkpointing data store

Incremental checkpointing is a known technique to reduce checkpointing cost

• Save the full operator state in a “Base” checkpoint

• Track changes made to operator state during normal processing

• For each subsequent checkpoint, save only the portion of state that changed since the previous checkpoint to form a “Delta” checkpoint

Page 5: IBM Streams V4.1 and Incremental Checkpointing

How Incremental Checkpointing Works

Why incremental checkpointing?

• If a “Delta” checkpoint is smaller than a full-state checkpoint, then both the time and the size of the checkpoint are reduced

Incremental checkpointing is applicable to many streaming applications

• Some operator state data is intensively read but rarely updated

• A large sliding window which moves slowly

• Only a small number of “hot” keys are updated in a large key-value map

• Such opportunity exists in many analytics (Aggregate, Join, Sort, Dedup, TopK, …)

What’s in Streams V4.1

• Low-overhead change tracking and incremental checkpointing for commonly-used data structures (in particular, all configurations of SPL windows)

• Automatic and dynamic adjustment of incremental checkpointing behavior to i) improve overall application performance and ii) bound worst-case restoration cost

Page 6: IBM Streams V4.1 and Incremental Checkpointing

How Incremental Checkpointing Works (Cont.)

An example: Incrementally checkpointing a queue

[Figure: a queue that is checkpointed three times while items are inserted and deleted]

• Queue: A B C D E F G H I J

• Take Checkpoint #1 (Base checkpoint): A,B,C,D,E,F,G,H,I,J

• Insert K, Delete A. Queue: B C D E F G H I J K

• Take Checkpoint #2 (Delta checkpoint): new item K, number of deletions: 1

• Insert L, Delete B, Insert M, Delete C. Queue: D E F G H I J K L M

• Take Checkpoint #3 (Delta checkpoint): new items L, M, number of deletions: 2
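The queue example above can be sketched in a few lines of C++. This is only an illustration of the idea, not the implementation of SPL::IncrDeque: the names TrackedQueue, QueueDelta, checkpointBase and checkpointDelta are hypothetical, and the sketch assumes a FIFO queue that only appends at the back and removes from the front.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical delta record: items appended, and number of items removed
// from the front, since the previous checkpoint.
struct QueueDelta {
    std::vector<std::string> appended;
    std::size_t deletions = 0;
};

// Hypothetical FIFO queue that tracks changes between checkpoints.
class TrackedQueue {
public:
    void push_back(const std::string& item) { items_.push_back(item); }

    void pop_front() {
        if (items_.empty()) return;
        items_.pop_front();
        // Only count deletions of items that existed at the last checkpoint.
        // An item appended and removed between two checkpoints leaves no trace.
        if (surviving_ > 0) {
            --surviving_;
            ++deletions_;
        }
    }

    // Base checkpoint: a full copy of the queue.
    std::deque<std::string> checkpointBase() {
        resetTracking();
        return items_;
    }

    // Delta checkpoint: only what changed since the previous checkpoint.
    QueueDelta checkpointDelta() {
        QueueDelta d;
        d.deletions = deletions_;
        d.appended.assign(items_.begin() + static_cast<std::ptrdiff_t>(surviving_),
                          items_.end());
        resetTracking();
        return d;
    }

private:
    void resetTracking() {
        surviving_ = items_.size();
        deletions_ = 0;
    }

    std::deque<std::string> items_;
    std::size_t surviving_ = 0;  // items from the last checkpoint still queued
    std::size_t deletions_ = 0;  // checkpointed items removed since then
};
```

Replaying the slide with this sketch, Checkpoint #2 holds one appended item (K) and a deletion count of 1, regardless of how many items the queue contains.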

Page 7: IBM Streams V4.1 and Incremental Checkpointing

How Incremental Checkpointing Works (Cont.)

An example: Restore the queue from a Delta checkpoint

[Figure: the queue is rebuilt from the Base checkpoint and the two Delta checkpoints]

• Step 1: Retrieve Checkpoint #1 (A,B,C,D,E,F,G,H,I,J) and de-serialize the queue from it. Queue: A B C D E F G H I J

• Step 2: Retrieve Checkpoint #2 (new item K, number of deletions: 1) and re-apply its changes: Insert K, Delete A. Queue: B C D E F G H I J K

• Step 3: Retrieve Checkpoint #3 (new items L, M, number of deletions: 2) and re-apply its changes: Insert L, Insert M, Delete B, Delete C. Queue: D E F G H I J K L M
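The three restoration steps can be sketched as a small replay function. The QueueDelta type and the restoreQueue function are hypothetical names chosen for illustration; the sketch assumes each Delta checkpoint stores the appended items plus a count of front deletions, as in the slide, and it is not the Streams Runtime's code.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical delta record, mirroring the slide: new items plus a deletion count.
struct QueueDelta {
    std::vector<std::string> appended;
    std::size_t deletions = 0;
};

// Rebuild the queue from one Base checkpoint and the Delta checkpoints that
// follow it, applied in the order in which they were taken.
std::deque<std::string> restoreQueue(const std::deque<std::string>& base,
                                     const std::vector<QueueDelta>& deltas) {
    std::deque<std::string> q = base;  // Step 1: de-serialize the Base checkpoint
    for (const QueueDelta& d : deltas) {
        // Recorded deletions refer to items that existed at the previous
        // checkpoint, so remove them before appending the new items.
        for (std::size_t i = 0; i < d.deletions && !q.empty(); ++i) {
            q.pop_front();
        }
        q.insert(q.end(), d.appended.begin(), d.appended.end());
    }
    return q;
}
```

The loop makes the restoration cost visible: every Delta checkpoint taken since the last Base checkpoint has to be read and replayed, which is why the number of consecutive Delta checkpoints matters.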

Page 8: IBM Streams V4.1 and Incremental Checkpointing

How Incremental Checkpointing Works (Cont.)

An example: an unordered map

[Figure: an unordered map that is checkpointed twice]

• Map: (K0,V0), (K1,V1), (K2,V2), …, (K9,V9)

• Take Checkpoint #1 (Base checkpoint): (K0,V0), (K1,V1), …, (K9,V9)

• Insert (K10, V10), Update (K1) <- V1’, Delete (K2, V2), Update (K1) <- V1’’. Map: (K0,V0), (K1,V1’’), …, (K9,V9), (K10,V10)

• Take Checkpoint #2 (Delta checkpoint): updated/inserted key-values (K1,V1’’), (K10,V10), plus deleted keys K2
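A minimal sketch of the same bookkeeping for a key-value map follows. TrackedMap, MapDelta and applyDelta are hypothetical names, string keys and values are an assumption made for brevity, and the code illustrates the technique rather than the internals of SPL::IncrUnorderedMap.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical delta record: latest values of updated or inserted keys,
// plus the keys deleted since the previous checkpoint.
struct MapDelta {
    std::unordered_map<std::string, std::string> upserts;
    std::unordered_set<std::string> deletedKeys;
};

// Hypothetical map that tracks which keys changed between checkpoints.
class TrackedMap {
public:
    void put(const std::string& key, const std::string& value) {
        data_[key] = value;
        dirty_.insert(key);
        deleted_.erase(key);
    }

    void erase(const std::string& key) {
        if (data_.erase(key) > 0) {
            dirty_.erase(key);
            // A key created and deleted between two checkpoints is also
            // recorded here; deleting it again on restore is a harmless no-op.
            deleted_.insert(key);
        }
    }

    // Base checkpoint: a full copy of the map.
    std::unordered_map<std::string, std::string> checkpointBase() {
        dirty_.clear();
        deleted_.clear();
        return data_;
    }

    // Delta checkpoint: only the final value of each changed key is saved,
    // so two updates to K1 produce a single entry.
    MapDelta checkpointDelta() {
        MapDelta d;
        for (const std::string& key : dirty_) {
            d.upserts[key] = data_.at(key);
        }
        d.deletedKeys = deleted_;
        dirty_.clear();
        deleted_.clear();
        return d;
    }

private:
    std::unordered_map<std::string, std::string> data_;
    std::unordered_set<std::string> dirty_;
    std::unordered_set<std::string> deleted_;
};

// Re-apply one Delta checkpoint on top of a restored map.
void applyDelta(std::unordered_map<std::string, std::string>& restored,
                const MapDelta& delta) {
    for (const std::string& key : delta.deletedKeys) restored.erase(key);
    for (const auto& kv : delta.upserts) restored[kv.first] = kv.second;
}
```

With the operations from the slide, the Delta checkpoint holds two key-value pairs and one deleted key, while the Base checkpoint holds all ten entries.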

Page 9: IBM Streams V4.1 and Incremental Checkpointing

How Incremental Checkpointing Works (Cont.)

Issues with incremental checkpointing

• The overhead of change tracking must be paid off by savings in checkpointing cost

• What if operator state is changed completely between two checkpoints?

• Restoration cost increases with incremental checkpointing, because the Delta checkpoints must be re-applied on top of the Base checkpoint

Our solution: dynamically adapt incremental checkpointing behavior

• The Streams Runtime continuously assesses whether incremental checkpointing is beneficial, and adjusts the number of consecutive Delta checkpoints to take

• The worst-case restoration cost is bounded by capping the number of consecutive Delta checkpoints

• Incremental checkpointing is automatically turned off when it is not beneficial

• Such adaptiveness saves the end user the burden of manual configuration and tuning

[Timeline figure: CHECKPOINT 1 is a BASE checkpoint; CHECKPOINT 2, CHECKPOINT 3, CHECKPOINT 4, …, CHECKPOINT D+1 are DELTA checkpoints; CHECKPOINT D+2 is the next BASE checkpoint, followed again by DELTA checkpoints]

D: Number of consecutive Delta checkpoints
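The slides do not describe the algorithm the Streams Runtime uses to make this decision. The following is only a hypothetical sketch of one way such a policy could look, under two stated assumptions: the number of consecutive Delta checkpoints is capped at a fixed D, and a Delta checkpoint counts as "beneficial" only if it is smaller than the last Base checkpoint. DeltaPolicy, next and record are invented names.

```cpp
#include <cassert>
#include <cstddef>

enum class CheckpointKind { Base, Delta };

// Hypothetical policy, not the Streams Runtime's actual algorithm.
class DeltaPolicy {
public:
    explicit DeltaPolicy(std::size_t maxDeltas) : maxDeltas_(maxDeltas) {}

    // Decide which kind of checkpoint to take next.
    CheckpointKind next() const {
        if (!haveBase_) return CheckpointKind::Base;                    // nothing to diff against
        if (deltasSinceBase_ >= maxDeltas_) return CheckpointKind::Base; // bound restoration cost
        if (!beneficial_) return CheckpointKind::Base;                  // deltas are not paying off
        return CheckpointKind::Delta;
    }

    // Report the kind and size of the checkpoint that was just taken.
    void record(CheckpointKind kind, std::size_t bytes) {
        if (kind == CheckpointKind::Base) {
            haveBase_ = true;
            baseBytes_ = bytes;
            deltasSinceBase_ = 0;
            beneficial_ = true;
        } else {
            ++deltasSinceBase_;
            // Assumed heuristic: a delta is worthwhile only if it is smaller
            // than the full state saved in the last Base checkpoint.
            beneficial_ = bytes < baseBytes_;
        }
    }

private:
    std::size_t maxDeltas_;
    std::size_t deltasSinceBase_ = 0;
    std::size_t baseBytes_ = 0;
    bool haveBase_ = false;
    bool beneficial_ = true;
};
```

A real policy would probably also weigh the run-time overhead of change tracking and adjust D over time; this sketch keeps D fixed to stay short.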

Page 10: IBM Streams V4.1 and Incremental Checkpointing

How to Use Incremental Checkpointing?

For C++ operator developer

• The SPL Runtime provides two container classes with built-in change tracking and incremental checkpointing capabilities:

• SPL::IncrDeque<T> for double-ended queue

• SPL::IncrUnorderedMap<K, V> for unordered map

• Operator code can enable incremental checkpointing for an SPL Window by instantiating the window with the incremental deque and map:

// an SPL window without incremental checkpointing capability
SPL::Window<T, G, D=std::deque<T>, S=std::tr1::unordered_map<G,D> > myWindow;
// an SPL window with incremental checkpointing capability
SPL::Window<T, G, D=SPL::IncrDeque<T>, S=SPL::IncrUnorderedMap<G,D> > myWindow;

• Two CodeGen APIs are extended to help generate C++ code for an SPL Window:

SPL::CodeGen::getWindowCppType()
SPL::CodeGen::getWindowEventCppType()

Page 11: IBM Streams V4.1 and Incremental Checkpointing

How to Use Incremental Checkpointing? (Cont.)

For C++ operator developer

• Example: Instantiate SPL Window with incremental data types

# Code taken from Aggregate_h.cgt
# Check if the operator is configured to do checkpointing
# (in a Consistent Region or with config checkpoint clause configured)
my $isInConsistentRegion = $model->getContext()->getOptionalContext("ConsistentRegion");
my $ckptKind = $model->getContext()->getCheckpointingKind();
if ($isInConsistentRegion || $ckptKind ne "none") {
    # Instantiate window with deque and unordered map which are of incremental checkpointing capability
    $windowCppType = ($partitionByParam)
        ? SPL::CodeGen::getWindowCppType($window, $windowTupleType, 'PartitionByType',
                                         'SPL::IncrDeque', 'SPL::IncrUnorderedMap')
        : SPL::CodeGen::getWindowCppType($window, $windowTupleType, '', 'SPL::IncrDeque');
    $windowEventCppType = ($partitionByParam)
        ? SPL::CodeGen::getWindowEventCppType($window, $windowTupleType,
                                              'PartitionByType', 'SPL::IncrDeque', 'SPL::IncrUnorderedMap')
        : SPL::CodeGen::getWindowEventCppType($window, $windowTupleType, '', 'SPL::IncrDeque');
}

Page 12: IBM Streams V4.1 and Incremental Checkpointing

How to Use Incremental Checkpointing? (Cont.)

For SPL programmer

• The Join, Aggregate, and Sort operators instantiate their SPL windows with incremental checkpointing if the operators are configured to do checkpointing (in a Consistent Region or with a “config checkpoint” clause)

• No extra configuration is needed in SPL code

@consistent(trigger=periodic, period=4.0)
stream<SourceFormat> Message = CustomSource() {
    ….
}

// Aggregate operator is in Consistent Region
stream< Message > VWAPAggregator0_0_1 as O = Aggregate(TradeFilter as I) {
    window I : sliding, count(2000), count(1), partitioned;
    param partitionBy : ric;
    output O : svwap = Sum(myvwap), svolume = Sum(volume);
}

// Aggregate operator is in an Autonomous region and with config checkpoint clause
stream< Message > VWAPAggregator0_0_1 as O = Aggregate(Message as I) {
    window I : sliding, count(2000), count(1), partitioned;
    param partitionBy : ric;
    output O : svwap = Sum(myvwap), svolume = Sum(volume);
    config checkpoint : periodic(10.0);
}

Page 13: IBM Streams V4.1 and Incremental Checkpointing

Case Study: VWAP Application

VWAP Application

• Ingests financial transaction data streams

• Contains an Aggregate operator

• Run VWAP within a consistent region, varying the checkpointing period

• Compare VWAP on Streams V4, which always checkpoints the full operator state, with VWAP on Streams V4.1, which does incremental checkpointing

Page 14: IBM Streams V4.1 and Incremental Checkpointing

Questions?