A paper by Yi-Bing Lin IEEE Transactions on Mobile Computing Vol. 4, No. 2, March/April ’05


Page 1

A paper by Yi-Bing Lin
IEEE Transactions on Mobile Computing, Vol. 4, No. 2, March/April ’05

Presented by Derek Pennington

Per-User Checkpointing For Mobility Database Failure Restoration

Page 2

In GPRS & UMTS networks, the Home Location Register (HLR) maintains the central database of user information. For a given user, an HLR record might contain information such as…

• Mobile Station (MS) Information
  – telephone number
  – International Mobile Subscriber Identity

• Service Information
  – subscription info
  – service restrictions
  – supplementary services

• Location Information
  – address of Serving GPRS Support Node (SGSN)

Page 3

But what happens in the event of an HLR failure?

• Luckily, we periodically back up all of this user data (each backup is called a “checkpoint”).

• However, the paper’s author argues that the established backup practices have room for improvement.

Page 4

Various approaches to checkpointing:

• All-record checkpoint
  – back up all users at once (e.g., at midnight)
  – costly (bottleneck effect)

• Per-user checkpoint
  – each user has its own timing mechanism for backups

• The paper’s author discusses the existing per-user checkpointing algorithm (henceforth referred to as “Algorithm 1”), and then proposes a new, improved one (“Algorithm 2”)

Page 5

Order of Presentation Coverage

Introduction

2) Algorithm 1 vs. Algorithm 2

3) Modeling the Algorithms (the math part)

4) Performance Evaluation of Algorithms 1 & 2

5) Conclusions / Comments

Page 6

Algorithm 1

• Checkpoints happen at random intervals (tc)

• A checkpoint may occur whether or not any user registrations have taken place

• In the event of an HLR failure, if the user updated the HLR database, but that update didn’t get backed up, the record becomes obsolete

• When the user’s record is obsolete, the user will lose calls until he performs a registration with the HLR.

Page 7

Algorithm 2

• Checkpoint timers are scheduled for random intervals (tp)

• However, checkpoints will only take place when BOTH of the following are true:
  • the tp timer has expired
  • a registration has taken place

• As in Algorithm 1, the user will lose calls if his record is obsolete

• In the state diagram, a checkpoint occurs whenever we return to State 0
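The two checkpoint rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's pseudocode: the function names are hypothetical, time starts at 0 with a fresh backup, the timer restarts at every checkpoint, and the timer intervals and registration times are passed in as plain lists.

```python
def algorithm1_checkpoints(timer_intervals):
    """Algorithm 1 (as described above): checkpoint at every timer expiry,
    whether or not a registration has taken place."""
    now, checkpoints = 0.0, []
    for tp in timer_intervals:
        now += tp
        checkpoints.append(now)
    return checkpoints


def algorithm2_checkpoints(timer_intervals, registrations):
    """Algorithm 2 (as described above): checkpoint only once the timer has
    expired AND a registration has occurred since the previous checkpoint."""
    regs = sorted(registrations)
    checkpoints, last_cp, i = [], 0.0, 0
    for tp in timer_intervals:
        expiry = last_cp + tp
        while i < len(regs) and regs[i] <= last_cp:
            i += 1                      # already covered by the last checkpoint
        if i == len(regs):
            break                       # no later registration, nothing new to save
        last_cp = max(expiry, regs[i])  # wait for whichever event comes later
        checkpoints.append(last_cp)
    return checkpoints


def is_obsolete(checkpoints, registrations, failure_time):
    """True if a registration made after the last checkpoint is lost in the failure."""
    last_cp = max((c for c in checkpoints if c <= failure_time), default=0.0)
    return any(last_cp < r <= failure_time for r in registrations)
```

With timer intervals of 2, 2, 2 and registrations at times 1 and 4.5, Algorithm 1 checkpoints at 2, 4 and 6, while this sketch of Algorithm 2 checkpoints at 2 and 4.5. A failure at time 5 then leaves the Algorithm 1 backup obsolete (the registration at 4.5 was never saved) but not the Algorithm 2 backup.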

Page 8

RECAP

For each scenario, will the user’s record(s) be valid after the HLR recovers from its failure?

Scenario #   Algorithm 1   Algorithm 2
1            YES           YES
2            YES           YES
3            NO            YES
4            NO            NO

[Figure: timelines for scenarios 1 to 4. LEGEND: CP timer fires, registration, failure]

Page 9

Order of Presentation Coverage

Introduction

Algorithm 1 vs. Algorithm 2

3) Modeling the Algorithms (the math part)

4) Performance Evaluation of Algorithms 1 & 2

5) Conclusions / Comments

Page 10

Two metrics are used to measure checkpoint algorithm performance:

• E[tc]: the expected checkpoint interval
  – the larger the interval, the less frequently checkpoints will occur
  – essentially, checkpoint cost is proportional to checkpoint frequency

• α: the probability that the user’s HLR record is obsolete after an HLR failure/recovery (the slide’s own symbol did not survive in this transcript; α is used here)
  – the smaller α is, the better the checkpoint algorithm’s performance

Page 11

Setting the checkpoint timer tp:

• Typical approaches have a fixed tp

• However, this can lead to congestion with large numbers of users

• Thus, in Algorithms 1 & 2, tp is a random variable with exponential distribution

• Density function, where μ is the number of checkpoints per unit time:

  $f_p(t_p) = \mu e^{-\mu t_p}$

• …and, in Algorithm 1, since tc = tp from checkpoint to checkpoint, the expected checkpoint interval (the time between checkpoints) is:

  $E_I[t_c] = E[t_p] = \frac{1}{\mu}$
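A quick numerical check of the expected interval for an exponential timer, assuming a checkpoint rate written as `mu` (the name and the sample size are choices made for this sketch only). It draws timer values with the standard library and compares the sample mean with 1/mu.

```python
import random


def mean_timer_interval(mu, samples=100_000, seed=0):
    """Average of `samples` exponential timer values with rate `mu`.

    For Algorithm 1 this average estimates the expected checkpoint interval,
    which should be close to 1 / mu.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return sum(rng.expovariate(mu) for _ in range(samples)) / samples
```

For example, `mean_timer_interval(0.5)` should come out close to 2 time units, since the rate is one checkpoint every two time units.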

Page 12

Finding the obsolete-record probability α for Algorithm 1:

• $t_m - \tau_m$ = “residual time” of tm
• $\tau_m$ = “reverse residual time” of tm
• $t_p - \tau_p$ = residual time of tp
• $\tau_p$ = reverse residual time of tp

Page 13

Finding the obsolete-record probability α for Algorithm 1 (cont’d):

Consider random variable t:

• probability density function: f(t)

• probability distribution function: $F(t) = \int_{y=0}^{t} f(y)\,dy$

• expected value: E[t]

• Laplace transform: $f^*(s) = \int_{t=0}^{\infty} f(t)\,e^{-st}\,dt$

Let τ be the residual time of t:

• probability density function: $r(\tau) = \frac{1 - F(\tau)}{E[t]}$

• probability distribution function: $R(\tau) = \int_{y=0}^{\tau} r(y)\,dy$

• Laplace transform: $r^*(s) = \frac{1 - f^*(s)}{s\,E[t]}$
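The residual-time Laplace transform r*(s) = (1 − f*(s)) / (s E[t]) can be checked numerically. The sketch below assumes, purely as a test case, that t is Erlang with two stages and rate `LAM` (a value picked for illustration). It integrates the residual density r(τ) = (1 − F(τ)) / E[t] against e^{−sτ} with a trapezoidal rule and compares the result with the closed form.

```python
import math

LAM = 1.5          # assumed stage rate for the 2-stage Erlang test case
MEAN = 2.0 / LAM   # E[t] for a 2-stage Erlang variable


def cdf(t):
    """Distribution function F(t) of the 2-stage Erlang variable."""
    return 1.0 - (1.0 + LAM * t) * math.exp(-LAM * t)


def residual_density(tau):
    """r(tau) = (1 - F(tau)) / E[t]."""
    return (1.0 - cdf(tau)) / MEAN


def residual_laplace_closed_form(s):
    """r*(s) = (1 - f*(s)) / (s E[t]) with f*(s) = (LAM / (s + LAM))**2."""
    f_star = (LAM / (s + LAM)) ** 2
    return (1.0 - f_star) / (s * MEAN)


def residual_laplace_numeric(s, upper=60.0, steps=60_000):
    """Trapezoidal estimate of the integral of r(tau) * exp(-s * tau) on [0, upper]."""
    h = upper / steps
    total = 0.5 * (residual_density(0.0)
                   + residual_density(upper) * math.exp(-s * upper))
    for k in range(1, steps):
        tau = k * h
        total += residual_density(tau) * math.exp(-s * tau)
    return total * h
```

The two functions should agree to several decimal places for any positive s; the truncation at `upper` is harmless here because the integrand decays exponentially.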

Page 14

Finding the obsolete-record probability α for Algorithm 1 (cont’d):

• Also, for the exponential tp, the residual-time density function is the same as the original density:

  $r_p(t) = f_p(t) = \mu e^{-\mu t}$

• In Algorithm 1, we say the backup record is obsolete if, at the moment of HLR failure, the time since the last checkpoint is greater than the time since the last registration

• In other words:

  $\alpha_I = \Pr[\tau_c > \tau_m]$

  $\quad = \int_{\tau_m=0}^{\infty} r_m(\tau_m) \int_{\tau_c=\tau_m}^{\infty} \mu e^{-\mu \tau_c}\,d\tau_c\,d\tau_m$   (integrals of the two density functions)

  $\quad = r_m^*(\mu) = \frac{1 - f_m^*(\mu)}{\mu\,E[t_m]}$   (from $r^*(s) = \frac{1 - f^*(s)}{s\,E[t]}$ defined earlier)
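As a worked special case (an assumption made here for illustration, not taken from the slides), suppose the registration intervals tm are exponential with rate λ, and write α_I for the Algorithm 1 obsolete-record probability and μ for the checkpoint rate. The result α_I = r_m^*(μ) = (1 − f_m^*(μ)) / (μ E[t_m]) then reduces to a simple ratio:

```latex
f_m(t) = \lambda e^{-\lambda t}, \qquad
f_m^*(\mu) = \frac{\lambda}{\lambda + \mu}, \qquad
E[t_m] = \frac{1}{\lambda}
\\[4pt]
\alpha_I = r_m^*(\mu)
         = \frac{1 - \dfrac{\lambda}{\lambda + \mu}}{\mu \cdot \dfrac{1}{\lambda}}
         = \frac{\mu}{\lambda + \mu} \cdot \frac{\lambda}{\mu}
         = \frac{\lambda}{\lambda + \mu}
```

In this special case, when registrations and checkpoints are equally frequent (λ = μ), half of all failures would find an obsolete backup under Algorithm 1, and checkpointing more often (larger μ) lowers that probability.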

Page 15

Finding E[tc] and the obsolete-record probability α for Algorithm 2:

• One difference between Algorithm 2 and Algorithm 1 is that the checkpoint timer will be reset based on how the previous checkpoint took place

  – If the previous checkpoint happened due to a timeout event, then the next checkpoint interval is:

    $t_c = \max(\tau_m^*, t_p)$

  – If the previous checkpoint happened due to a registration event, then the next checkpoint interval is:

    $t_c = \max(t_m, t_p)$

  – Thus, in our state machine example, we actually have two “State 0”s…

Page 16

[Figure: Algorithm 2 state diagram with two copies of State 0, one for a checkpoint occurring due to a timeout event and one for a checkpoint occurring due to a registration. Transition labels:]

• Probability that a timeout will occur after a timeout-caused checkpoint
• Probability that a timeout will occur after a registration-caused checkpoint
• Probability that a registration will occur after a timeout-caused checkpoint
• Probability that a registration will occur after a registration-caused checkpoint

Page 17

• The random variable tc is now essentially a combination weighted by the probability that the last checkpoint happened due to a timeout and the probability that the last checkpoint happened due to a registration:

  $t_c = p_1 \max(\tau_m^*, t_p) + p_2 \max(t_m, t_p)$

• …where:

  $p_1 = \frac{\pi_{01}}{\pi_{01} + \pi_{02}}$ and $p_2 = \frac{\pi_{02}}{\pi_{01} + \pi_{02}} = 1 - p_1$

  ($\pi_x$ is the probability of being in State “x”)

Page 18

• Remember that $r_m^*(\mu)$ comes from $\Pr[\tau_c > \tau_m]$

• Therefore, we can say:

  $p_a = f_m^*(\mu)$
  $p_b = 1 - f_m^*(\mu)$   (registration-caused checkpoint)

  $p_c = r_m^*(\mu)$
  $p_d = 1 - r_m^*(\mu)$   (timeout-caused checkpoint)

Page 19

• Based on the figure, we can deduce some limiting probabilities:

  $\pi_{01} + \pi_{02} + \pi_1 + \pi_2 = 1$
  $\pi_{01} = \pi_1$
  $\pi_{02} = \pi_2$
  $\pi_1 = p_a \pi_{02} + p_c \pi_{01}$
  $\pi_2 = p_b \pi_{02} + p_d \pi_{01}$

• …which means we know more about p1 and p2:

  $p_1 = \frac{f_m^*(\mu)}{1 + f_m^*(\mu) - r_m^*(\mu)}$ and $p_2 = \frac{1 - r_m^*(\mu)}{1 + f_m^*(\mu) - r_m^*(\mu)}$
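The closed forms for p1 and p2 can be cross-checked against a two-state chain over "how the previous checkpoint happened". The sketch below is an illustration under assumptions: the inputs `f_star` and `r_star` stand for the Laplace transforms of the registration-interval density and of its residual-time density evaluated at the checkpoint rate, and the function names are hypothetical.

```python
def limiting_probabilities(f_star, r_star):
    """Closed forms: p1 = f* / (1 + f* - r*), p2 = (1 - r*) / (1 + f* - r*)."""
    denom = 1.0 + f_star - r_star
    return f_star / denom, (1.0 - r_star) / denom


def iterate_chain(f_star, r_star, steps=200):
    """Iterate the chain whose states are 'previous checkpoint was timeout-caused'
    (x1) and 'previous checkpoint was registration-caused' (x2).

    A timeout-caused checkpoint follows a timeout-caused one with probability r*
    and follows a registration-caused one with probability f*.
    """
    x1, x2 = 0.5, 0.5
    for _ in range(steps):
        x1, x2 = (r_star * x1 + f_star * x2,
                  (1.0 - r_star) * x1 + (1.0 - f_star) * x2)
    return x1, x2
```

For any inputs strictly between 0 and 1 the iteration settles on the same pair as the closed form, which is a convenient sanity check on the reconstructed denominators.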

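Page 20

• From…

  $t_c = p_1 \max(\tau_m^*, t_p) + p_2 \max(t_m, t_p)$

• …the density function for tc is:

  $f_c(t_c) = p_1(\text{TERM1} + \text{TERM2}) + p_2(\text{TERM3} + \text{TERM4})$

• …where:

  $\text{TERM1} = \mu e^{-\mu t_c} R_m(t_c)$
  $\text{TERM2} = r_m(t_c) - r_m(t_c)\,e^{-\mu t_c}$
  $\text{TERM3} = \mu e^{-\mu t_c} F_m(t_c)$
  $\text{TERM4} = f_m(t_c) - f_m(t_c)\,e^{-\mu t_c}$

• …thus:

  $f_c(t_c) = p_1\left[\mu e^{-\mu t_c} R_m(t_c) + r_m(t_c) - r_m(t_c)\,e^{-\mu t_c}\right] + p_2\left[\mu e^{-\mu t_c} F_m(t_c) + f_m(t_c) - f_m(t_c)\,e^{-\mu t_c}\right]$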

Page 21

• The relationships between tp, tm, and $\tau_m^*$ allow us to reinterpret fc(tc) into two different pieces:

  – fc1(tc): the situation where tp > tm

  – fc2(tc): the situation where tp < tm

• We can reexpress fc(tc) as:

  $f_c(t_c) = f_{c1}(t_c) + f_{c2}(t_c)$

• …where:

  $f_{c1}(t_c) = (p_1 \cdot \text{TERM1}) + (p_2 \cdot \text{TERM3}) = \frac{f_m^*(\mu)\,\mu e^{-\mu t_c} R_m(t_c)}{1 + f_m^*(\mu) - r_m^*(\mu)} + \frac{[1 - r_m^*(\mu)]\,\mu e^{-\mu t_c} F_m(t_c)}{1 + f_m^*(\mu) - r_m^*(\mu)}$

  $f_{c2}(t_c) = (p_1 \cdot \text{TERM2}) + (p_2 \cdot \text{TERM4}) = \frac{f_m^*(\mu)\left[r_m(t_c) - r_m(t_c)\,e^{-\mu t_c}\right]}{1 + f_m^*(\mu) - r_m^*(\mu)} + \frac{[1 - r_m^*(\mu)]\left[f_m(t_c) - f_m(t_c)\,e^{-\mu t_c}\right]}{1 + f_m^*(\mu) - r_m^*(\mu)}$

  $\text{TERM1} = \mu e^{-\mu t_c} R_m(t_c)$
  $\text{TERM2} = r_m(t_c) - r_m(t_c)\,e^{-\mu t_c}$
  $\text{TERM3} = \mu e^{-\mu t_c} F_m(t_c)$
  $\text{TERM4} = f_m(t_c) - f_m(t_c)\,e^{-\mu t_c}$

Page 22

Expected checkpoint interval for Algorithm 2:

  $E_{II}[t_c] = \int_{t_c=0}^{\infty} t_c\,f_c(t_c)\,dt_c$   (integral of the density function)

  $\quad = p_1 A_1 + p_2 A_2$

  $\quad = \frac{f_m^*(\mu)\,A_1}{1 + f_m^*(\mu) - r_m^*(\mu)} + \frac{[1 - r_m^*(\mu)]\,A_2}{1 + f_m^*(\mu) - r_m^*(\mu)}$   (plug in p1 and p2)

What are A1 and A2?......

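Page 23

  $A_1 = \int_{t_c=0}^{\infty} t_c\,\mu e^{-\mu t_c} R_m(t_c)\,dt_c + \int_{t_c=0}^{\infty} t_c\,r_m(t_c)\,dt_c - \int_{t_c=0}^{\infty} t_c\,r_m(t_c)\,e^{-\mu t_c}\,dt_c$

  $\quad = -\mu \left.\frac{d}{ds}\left[\frac{r_m^*(s)}{s}\right]\right|_{s=\mu} + E[\tau_m] + \left.\frac{d\,r_m^*(s)}{ds}\right|_{s=\mu}$

  $\quad = \frac{r_m^*(\mu)}{\mu} + E[\tau_m]$

  $A_2 = \int_{t_c=0}^{\infty} t_c\,\mu e^{-\mu t_c} F_m(t_c)\,dt_c + \int_{t_c=0}^{\infty} t_c\,f_m(t_c)\,dt_c - \int_{t_c=0}^{\infty} t_c\,f_m(t_c)\,e^{-\mu t_c}\,dt_c$

  $\quad = \frac{f_m^*(\mu)}{\mu} + E[t_m]$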
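A shortcut for checking the result for A2, assuming tm and tp are independent and tp is exponential with a rate written here as μ: A2 is the mean of max(tm, tp), so it can be obtained from the mean of the minimum instead of differentiating Laplace transforms.

```latex
E[\max(t_m, t_p)] = E[t_m] + E[t_p] - E[\min(t_m, t_p)]
\\[4pt]
E[\min(t_m, t_p)] = \int_0^\infty \Pr[t_m > t]\, e^{-\mu t}\, dt
                  = \frac{1}{\mu} - \frac{f_m^*(\mu)}{\mu}
                  = \frac{1 - f_m^*(\mu)}{\mu}
\\[4pt]
A_2 = E[t_m] + \frac{1}{\mu} - \frac{1 - f_m^*(\mu)}{\mu}
    = E[t_m] + \frac{f_m^*(\mu)}{\mu}
```

The same argument with the residual time of tm in place of tm (and r_m^* in place of f_m^*) gives the corresponding expression for A1.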

Page 24

So we can also express the expected checkpoint interval for Algorithm 2 as:

  $E_{II}[t_c] = p_1\left(\frac{r_m^*(\mu)}{\mu} + E[\tau_m]\right) + p_2\left(\frac{f_m^*(\mu)}{\mu} + E[t_m]\right)$   (plug in A1 and A2)
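To see the two expected intervals side by side, the expression can be evaluated directly. The sketch below is a minimal illustration, assuming a checkpoint rate `mu` and caller-supplied values for the two Laplace transforms and the two means; the function names are hypothetical. The exponential-registration case is used only as a check, because there the expected maximum of two independent exponentials has a well-known closed form.

```python
def expected_interval_alg1(mu):
    """Algorithm 1: the checkpoint interval is the timer itself, mean 1 / mu."""
    return 1.0 / mu


def expected_interval_alg2(mu, f_star, r_star, mean_tm, mean_tau_m):
    """Algorithm 2: p1 * (r*/mu + E[tau_m]) + p2 * (f*/mu + E[t_m]),
    with p1 and p2 the limiting probabilities of the two kinds of checkpoint."""
    denom = 1.0 + f_star - r_star
    p1 = f_star / denom
    p2 = (1.0 - r_star) / denom
    return p1 * (r_star / mu + mean_tau_m) + p2 * (f_star / mu + mean_tm)
```

For exponential registration intervals with rate `lam`, both transforms equal `lam / (lam + mu)` and both means equal `1 / lam`, so the Algorithm 2 value collapses to `1/lam + 1/mu - 1/(lam + mu)`. That is always larger than `1 / mu`, which is the sense in which Algorithm 2 checkpoints less often.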

Page 25

Now we need to find $\alpha_{II}$, the obsolete-record probability for Algorithm 2…

• To find the probability of getting an obsolete record, there is no closed-form expression when an arbitrary fm(tm) is used

• The paper uses a mix-Erlang density function

  – proven to be a good approximation to other functions as well as measured data

  $f_m(t_m) = \sum_{i=1}^{j} q_i \left[\frac{(\lambda_i t_m)^{n_i - 1}}{(n_i - 1)!}\right] \lambda_i e^{-\lambda_i t_m}$

  – …and, as a comparison, the regular Erlang density function:

  $f(n, \lambda, t_m) = \left[\frac{(\lambda t_m)^{n-1}}{(n-1)!}\right] \lambda e^{-\lambda t_m}$

Page 26

Now we need to find $\alpha_{II}$ (cont’d)

• Continuing, we have the Erlang distribution function:

  $F(n, \lambda, t_m) = 1 - \sum_{j=0}^{n-1} \left[\frac{(\lambda t_m)^j}{j!}\right] e^{-\lambda t_m} = 1 - \frac{1}{\lambda} \sum_{j=1}^{n} f(j, \lambda, t_m)$

• …and the Laplace transform expressed as:

  $f^*(n, \lambda, s) = \left(\frac{\lambda}{s + \lambda}\right)^n$

Page 27

Now we need to find $\alpha_{II}$ (cont’d)

• The reverse residual time $\tau_m$ of tm has:

  – density function: $r(n, \lambda, \tau_m)$

  – distribution function: $R(n, \lambda, \tau_m)$

  – Laplace transform: $r^*(n, \lambda, s)$

• And, since $E[t_m] = n/\lambda$, we can say the following:

  $r(n, \lambda, \tau_m) = \frac{1}{n} \sum_{j=1}^{n} f(j, \lambda, \tau_m)$

  $R(n, \lambda, \tau_m) = \frac{1}{n} \sum_{j=1}^{n} F(j, \lambda, \tau_m)$

  $r^*(n, \lambda, s) = \frac{\lambda}{n s}\left[1 - \left(\frac{\lambda}{s + \lambda}\right)^n\right]$
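The Erlang building blocks are easy to code up and cross-check against each other. The sketch below is only an illustration: the single-letter function names mirror the slide notation, `lam` stands for the Erlang rate, and nothing here comes from the paper's own code.

```python
import math


def f(n, lam, t):
    """Erlang density with n stages and rate lam."""
    return lam * (lam * t) ** (n - 1) * math.exp(-lam * t) / math.factorial(n - 1)


def F(n, lam, t):
    """Erlang distribution function: 1 - sum_{j=0}^{n-1} (lam t)^j e^{-lam t} / j!."""
    return 1.0 - math.exp(-lam * t) * sum(
        (lam * t) ** j / math.factorial(j) for j in range(n))


def r(n, lam, tau):
    """Residual-time density: the average of the Erlang densities with 1..n stages."""
    return sum(f(j, lam, tau) for j in range(1, n + 1)) / n


def r_star(n, lam, s):
    """Laplace transform of r: (lam / (n s)) * (1 - (lam / (s + lam))**n)."""
    return lam * (1.0 - (lam / (s + lam)) ** n) / (n * s)
```

Two identities worth checking numerically are `r(n, lam, t) == (1 - F(n, lam, t)) / (n / lam)`, which is the general residual-density definition with E[tm] = n/lam, and `r_star(1, lam, s) == lam / (s + lam)`, the exponential special case.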

Page 28

Now we need to find $\alpha_{II}$ (cont’d)

• For Algorithm 2, consider the two scenarios:

  – A registration happens before the timeout
    • Checkpoint happens at the time of the timeout

  – A registration does NOT happen before the timeout
    • In this case, we wait until the next registration to checkpoint

• To derive $\alpha_{II}$, we only need to consider the first case…

  $f_{c1}(t_c) = \frac{p_1}{n} \sum_{m=1}^{n} \mu e^{-\mu t_c} F(m, \lambda, t_c) + p_2\,\mu e^{-\mu t_c} F(n, \lambda, t_c)$

  $\quad = \frac{p_1}{n} \sum_{m=1}^{n} g(m, \lambda, t_c) + p_2\,g(n, \lambda, t_c)$

• …where:

  $g(i, \lambda, t) = \mu e^{-\mu t} F(i, \lambda, t) = \mu e^{-\mu t} - \frac{\mu}{\lambda} \sum_{k=1}^{i} \left(\frac{\lambda}{\lambda + \mu}\right)^k f(k, \lambda + \mu, t)$

Page 29

Now we need to find $\alpha_{II}$ (cont’d)

• Then, the density function for the reverse residual time corresponding to g(i, λ, t) is h(i, λ, τ):

  [Equation: h(i, λ, τ), an $e^{-\mu\tau}$ term and a double sum over k = 1, …, i and j = 1, …, k of terms in f(j, ·, τ); the coefficients are not recoverable from this transcript]

• If we say that $\tau_c$ is the reverse residual time of tc, then the density function for $\tau_c$ is:

  $r_{c1}(\tau_c) = \frac{p_1}{n} \sum_{m=1}^{n} h(m, \lambda, \tau_c) + p_2\,h(n, \lambda, \tau_c)$

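Page 30

Now we need to find $\alpha_{II}$ (cont’d)

• Finally, we can derive $\alpha_{II}$:

  $\alpha_{II} = \Pr[\tau_c > \tau_m] = \int_{\tau_m=0}^{\infty} r(n, \lambda, \tau_m) \int_{\tau_c=\tau_m}^{\infty} r_{c1}(\tau_c)\,d\tau_c\,d\tau_m$

  [Equation: substituting $r_{c1}$ gives a $p_1/n$-weighted sum over m = 1, …, n and a $p_2$-weighted term, each a double integral of r(n, λ, τm) h(·, λ, τc), which the slide abbreviates with A3(m), A4(m), A3(n), and A4(n); the signs and exact grouping are not recoverable from this transcript]

• …where……… (next slide)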

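Page 31

Now we need to find $\alpha_{II}$ (cont’d)

[Equations: A3(m) and A4(m) are double integrals over τm from 0 to ∞ and τc from τm to ∞ against the reverse residual density r. A3(m) integrates an $e^{-\mu\tau_c}$ term; A4(m) integrates a double sum over k and j of f(j, ·, τc) terms and is then rewritten as a double sum of B(m, j) terms. The coefficients are not recoverable from this transcript.]

• …where:

[Equation: B(m, j), a double integral of r(m, ·, τm) f(j, ·, τc) over τc and τm, expanded into a sum of integrals of products of Erlang densities; not recoverable from this transcript]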

Page 32

Order of Presentation Coverage

Introduction

Algorithm 1 vs. Algorithm 2

Modeling the Algorithms (the math part)

4) Performance Evaluation of Algorithms 1 & 2

5) Conclusions / Comments

Page 33

Algorithm 2: Checkpoint Freq. vs. Registration Freq.
• as registration frequency increases, so does checkpoint frequency
• this is what we’d expect

Algorithm 2’s Checkpoint Cost Improvement over Algorithm 1
• According to the graph, as registrations increase, Algorithm 2 further improves over Algorithm 1.
• ??? This is not what I would expect
  – p. 189: “If registration activities are very frequent, then Alg. 2 behaves exactly the same as Alg. 1”

Page 34

Algorithm 2: Probability of Obsolete Records After HLR Failure
• X axis is λ1/λ2, where λ1 represents intervals with few registrations and λ2 represents intervals with many registrations
• Thus, as we move right, registrations become less frequent
• Fewer registrations means less chance of obsolete records; thus, this cost decreases as we move right

Algorithm 2’s Obsolete Record Cost Improvement over Algorithm 1
• Shows that Alg. 2 has a 20-55% improvement over Alg. 1
• This makes sense, because Alg. 2 will checkpoint when a registration occurs if the checkpoint timer has expired… whereas Alg. 1 will have obsolete records in those situations.

Page 35

Order of Presentation Coverage

Introduction

Algorithm 1 vs. Algorithm 2

Modeling the Algorithms (the math part)

Performance Evaluation of Algorithms 1 & 2

5) Conclusions / Comments

Page 36

Conclusion

• Per the analytical results, Algorithm 2 improves upon Algorithm 1 in the following ways:
  – 50+% savings in checkpoint cost (E[tc])
  – 20-55% improvement in terms of reducing occurrences of obsolete records (α)

• Note that this paper does NOT discuss SGSN / VLR failure and/or recovery
  – all SGSN-based mobile user records are temporary and not backed up
  – other papers discuss SGSN failure restoration (see paper’s references)

Page 37

Comments

• This paper was heavy on the math, light on the explanations from step to step
  – Granted, maybe IEEE gave the author a requirement to fit within 6 pages of their magazine? 2x or 3x as long would make it much easier to follow.

• Derek’s recommended prerequisites:
  – know the difference between probability density functions and probability distribution functions
  – know what a Laplace transform is
  – refresh your memory on integrals and derivatives

• If, in fact, simulations were performed, include the details! He apparently omitted them on purpose. Maybe they’re included in his dissertation, thesis, etc…?

Page 38

Thanks!

Any questions?