Disk Backup Through Algebraic Signatures for a Scalable Distributed Data Structure in SDDS-2002 System

Witold Litwin, Riad Mokadem, Thomas Schwartz

Transcript of a PowerPoint presentation

Page 1: Title

Disk Backup Through Algebraic Signatures for a Scalable Distributed Data Structure in SDDS-2002 System

Witold Litwin (Witold.Litwin@dauphine.fr)

Riad Mokadem (Riad.Mokadem@dauphine.fr)

Thomas Schwartz

Page 2: Plan

• Introduction
• The SDDS-2002 backup scheme
• Experimental performance analysis
• Conclusion

Page 3: Introduction

Why the RAM storage of an SDDS needs to be saved to disk:

• File backup
  • Failure of a server
• File eviction
  • Sharing of RAM among different SDDS files
  • Sharing of RAM with other applications

Page 4: Introduction

• Write to disk only the parts (pages) changed since the last backup
• The "dirty bit" approach is inapplicable
• Page signature calculus is a possibility, provided that it is:
  • Fast
  • Precise
  • Scalable: shorter signatures may become longer without total recalculation
• This is not the case for SHA-1, nor for any other previously proposed scheme

Page 5: The SDDS-2002 Backup Scheme: File Backup

[Figure: distributed storing. The client sends a Store command (multicast) to the servers; the server RAM buckets are written to the server disks.]

Page 6: The SDDS-2002 Backup Scheme: File Load

[Figure: distributed loading. The client sends a Load command (multicast) to the servers; the server RAM buckets are loaded from the server disks.]

Page 7: Internal Organization of a Bucket in SDDS

[Figure: bucket layout, consisting of a header, an SDDS index (B+-tree), and a data file made of data pages.]

• Index: a few KB up to the MB range
• Data file: dozens of MB up to the GB range

Page 8: Page Granularity

• The page size needs a careful choice
• Smaller pages
  • More individual writes if there are many random updates
  • Less data transferred if there are only a few updates
• Larger pages
  • Vice versa
• Optimal size? Good question
• Our choice
  • 16 KB for data, although 64 KB pages proved best for data page signature calculus speed
  • 256 B for index
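The trade-off above can be made concrete with a small sketch. It counts how many pages a set of updates touches and how many bytes would then be written, for a given page size. The function name `write_cost` and the sample offsets are hypothetical and only serve the illustration; they are not taken from SDDS-2002.

```python
def write_cost(update_offsets, page_size):
    """Return (pages_written, bytes_written) for updates at the given byte offsets.

    A page is rewritten in full as soon as one byte in it has changed.
    """
    dirty_pages = {offset // page_size for offset in update_offsets}
    return len(dirty_pages), len(dirty_pages) * page_size


# Three scattered single-byte updates (hypothetical offsets).
updates = [10, 20_000, 40_000]

print(write_cost(updates, 16 * 1024))  # (3, 49152): more writes, less data
print(write_cost(updates, 64 * 1024))  # (1, 65536): one write, more data
```

With these offsets, 16 KB pages need three individual writes but transfer less data than the single 64 KB page write.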

Page 9: Page Signature: Algebraic Signatures

• Galois field $GF(2^{16})$
• Log / antilog multiplication
• Page $P$ has 2-byte symbols $p_1, p_2, \ldots, p_n$
• The signature formula is:

$$\mathrm{Sign}_\alpha(P) = \sum_{i=1}^{n} p'_i \, \alpha^i$$

• for each $i = 1, 2, \ldots, n$: $p'_i = \mathrm{antilog}\; p_i$
• for each base $\alpha, \alpha^2, \alpha^3, \ldots$

$$\mathrm{Sign}(P) = \left(\mathrm{Sign}_\alpha(P), \mathrm{Sign}_{\alpha^2}(P), \ldots, \mathrm{Sign}_{\alpha^m}(P)\right)$$

• We use $m = 2$ in SDDS-2002
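A minimal sketch of such a signature, under stated assumptions, may help. It builds log / antilog tables for GF(2^16) and computes an m-component signature whose k-th component is the XOR-sum of p_i · (α^k)^i over the 2-byte symbols of the page. Several choices here are assumptions and not taken from the slides: the generator polynomial 0x1100B, α being the element x (value 2), big-endian symbol order, and the use of the symbols p_i directly rather than the mapped values p'_i = antilog p_i. All function names are hypothetical.

```python
PRIM_POLY = 0x1100B      # x^16 + x^12 + x^3 + x + 1 (assumed generator polynomial)
ORDER = (1 << 16) - 1    # number of non-zero elements of GF(2^16)


def build_tables():
    """Build log and antilog tables for GF(2^16), with alpha = x (value 2)."""
    antilog = [0] * ORDER
    log = [0] * (ORDER + 1)  # log[0] is unused: zero has no logarithm
    x = 1
    for i in range(ORDER):
        antilog[i] = x
        log[x] = i
        x <<= 1              # multiply by alpha
        if x & 0x10000:      # reduce modulo the generator polynomial
            x ^= PRIM_POLY
    return log, antilog


LOG, ANTILOG = build_tables()


def gf_mul(a, b):
    """Log / antilog multiplication in GF(2^16)."""
    if a == 0 or b == 0:
        return 0
    return ANTILOG[(LOG[a] + LOG[b]) % ORDER]


def component(symbols, k):
    """XOR-sum of p_i * (alpha^k)^i for i = 1..n (addition in GF(2^16) is XOR)."""
    sig = 0
    for i, p in enumerate(symbols, start=1):
        if p:
            sig ^= ANTILOG[(LOG[p] + k * i) % ORDER]
    return sig


def signature(page, m=2):
    """Return the m-component signature of a page given as bytes."""
    symbols = [int.from_bytes(page[j:j + 2], "big") for j in range(0, len(page), 2)]
    return tuple(component(symbols, k) for k in range(1, m + 1))
```

For example, `signature(b"\x00\x01")` returns `(2, 4)`: the page holds the single symbol 1, so the components are α and α².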

Page 10: Experimental Performance Analysis: Hardware Configuration

• 1.8 GHz P4 servers
• 800 MHz P3 client
• 500 MHz P3 name server
• 1 Gbit/s Ethernet
• Windows 2000 Server OS

Page 11: Experimental Performance SDDS-2002: Initial File Store Time (No Signature Calculus)

[Figure: store time (s), axis from 20 to 120, versus number of file servers (1 to 4). File size: 393 MB; 25 000 records.]

Page 12: Initial File Store Time (Time Series)

[Figure: storage time (ms), axis from 0 to 140 000, versus number of records (100, 150, 1000, 10 000, 25 000), with one series each for one, two, and three servers.]

Page 13: File Load Time

[Figure: load time (s), axis from 20 to 120, versus number of servers (1 to 4). File size: 393 MB.]

Practically the same as the first backup time.

Page 14: File Storage Performance Analysis

| Bucket size (MB) | Number of records | Signature calculus (ms) | Signature calculus per MB (ms) | Total store time (ms) | Store time for 0 % change (ms) | Gain (%) | Store time for 5 % change (ms) | Gain (%) |
|---|---|---|---|---|---|---|---|---|
| 1.88 | 100 | 46 | 24.46 | 562 | 50 | 91.1 | 65 | 88.43 |
| 2.7 | 150 | 78 | 28.8 | 781 | 82 | 89.51 | 95 | 87.83 |
| 17.6 | 1000 | 438 | 24.88 | 5078 | 438 | 91.38 | 453 | 91.07 |
| 158 | 10000 | 4068 | 25.74 | 46406 | 4071 | 91.23 | 4085 | 91.19 |
| 393 | 25000 | 11003 | 27.9 | 117859 | 11003 | 91.33 | 11018 | 90.65 |
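The slide does not define the Gain columns. The figures in the first row are consistent with reading the gain as the relative saving of the signature-based store over the total store time; this reading is an inference from the numbers, not something the slide states. A short sketch of that arithmetic, with the hypothetical helper `gain_percent`:

```python
def gain_percent(total_store_ms, incremental_store_ms):
    """Relative saving of an incremental store over a full store, in percent."""
    return 100.0 * (1 - incremental_store_ms / total_store_ms)


# First row of the table: 1.88 MB bucket, total store time 562 ms.
print(round(gain_percent(562, 50), 2))  # 0 % change -> 91.1
print(round(gain_percent(562, 65), 2))  # 5 % change -> 88.43
```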

Page 15: SHA-1 / Algebraic Signatures

| Bucket size (MB) | Number of records | Algebraic signature calculus (ms) | SHA-1 calculus (ms) | Initial store time with SHA-1 (ms) | Initial store time with alg. sign. (ms) | SHA-1 store time for 5 % change (ms) | Alg. sign. store time for 5 % change (ms) | Gain (%) |
|---|---|---|---|---|---|---|---|---|
| 1.88 | 100 | 46 | 70 | 602 | 562 | 85 | 65 | 30 |
| 2.7 | 150 | 78 | 103 | 799 | 781 | 119 | 95 | 25 |
| 17.6 | 1000 | 438 | 680 | 5278 | 5078 | 697 | 453 | 53 |
| 158 | 10000 | 4068 | 6088 | 47906 | 46406 | 6102 | 4085 | 49 |
| 393 | 25000 | 11003 | 15403 | 119342 | 117859 | 15418 | 11018 | 40 |

Page 16: Algebraic / SHA-1 Signature Calculus Time

[Figure: signature calculus time versus bucket size (MB), with one series for the algebraic signature and one for the cryptographic signature.]

Page 17: Implementation in SDDS 2002: Interactive Client Interface

[Figure: user interface.]

Page 18: Implementation in SDDS 2002: Execution Listing at the Server

• 1st request for storage: new file
  • Signature calculus (375 ms)
  • Disk write of all pages (4922 ms)
• 2nd request for storage: no changes found (375 ms)
• 3rd request for storage: 1 page changed (375 + 16 ms)
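The three requests in this listing follow a pattern that can be sketched in a few lines: compute a signature per page, compare it with the signature kept from the previous store, and write only the pages whose signature differs. The sketch below is not the SDDS-2002 code. The names `split_pages` and `store` are hypothetical, the disk and the signature map are plain dictionaries, and `zlib.crc32` is only a stand-in for the algebraic signature.

```python
import zlib

PAGE_SIZE = 16 * 1024  # 16 KB data pages, the size chosen for data in SDDS-2002


def split_pages(bucket, page_size=PAGE_SIZE):
    """Cut the bucket content into fixed-size pages (the last one may be shorter)."""
    return [bucket[i:i + page_size] for i in range(0, len(bucket), page_size)]


def store(bucket, disk, signature_map, sign=zlib.crc32):
    """Write to `disk` only the pages whose signature changed.

    `disk` maps page index -> page bytes and `signature_map` maps page index ->
    last stored signature; both are updated in place. Returns the list of
    page indexes that were written. `sign` is a stand-in signature function.
    """
    written = []
    for index, page in enumerate(split_pages(bucket)):
        sig = sign(page)
        if signature_map.get(index) != sig:
            disk[index] = page
            signature_map[index] = sig
            written.append(index)
    return written


disk, signatures = {}, {}
bucket = bytearray(4 * PAGE_SIZE)

print(store(bytes(bucket), disk, signatures))  # new file: [0, 1, 2, 3]
print(store(bytes(bucket), disk, signatures))  # no changes found: []
bucket[2 * PAGE_SIZE + 7] = 0xFF               # update one byte in page 2
print(store(bytes(bucket), disk, signatures))  # one page changed: [2]
```

Every request pays for the signature calculus over the whole bucket, which matches the constant 375 ms in the listing; only the disk write time depends on the number of changed pages. The sketch ignores buckets that shrink between two stores.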

Page 19: Conclusion

• The algebraic-signature-based file backup works
• Present in the SDDS-2002 prototype
• Offers advantages over the traditional approach
  • No change to existing code
  • No run-time overhead
• Future work
  • Signatures: calculus, algebraic properties, applications…
  • Automatic SDDS file eviction

Page 20

Thank you for your attention
