Review Session for Fourth Quiz

Jehan-François Pâris, Summer 2011


Transcript of Review Session for Fourth Quiz

Page 1: Review Session for Fourth Quiz

Review Session for Fourth Quiz

Jehan-François Pâris

Summer 2011

Page 2: Review Session for Fourth Quiz

Blue File System

Page 3: Review Session for Fourth Quiz

Blue File System

According to the designers of the Blue File System, what are the two limitations of flash drives?

Page 4: Review Session for Fourth Quiz

Blue File System

They can be lost. It is hard to keep them synchronized

Page 5: Review Session for Fourth Quiz

Blue File System

The Blue File System is said to have a dynamic storage hierarchy. What does that mean?

Page 6: Review Session for Fourth Quiz

Blue File System

The ranking of the storage devices in the storage hierarchy depends on their current states.

A disk that is powered down will have a lower priority in the hierarchy than the remote server.

A disk that is powered up will have a higher priority than that same server.
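A minimal sketch of such a state-dependent ranking follows. The `Device` class, the cost model and every number in it are hypothetical and chosen only for illustration; they do not come from the Blue File System itself.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    access_cost_ms: float   # assumed cost of one access when the device is ready
    wakeup_cost_ms: float   # assumed extra cost if the device must be powered up first
    powered_up: bool = True

    def current_cost(self) -> float:
        # The cost depends on the device's present power state.
        return self.access_cost_ms + (0.0 if self.powered_up else self.wakeup_cost_ms)


def rank(devices):
    """Order devices from most to least preferred, given their current states."""
    return sorted(devices, key=lambda d: d.current_cost())


disk = Device("local disk", access_cost_ms=10, wakeup_cost_ms=2000, powered_up=False)
server = Device("remote server", access_cost_ms=50, wakeup_cost_ms=0)

print([d.name for d in rank([disk, server])])  # disk is powered down
disk.powered_up = True
print([d.name for d in rank([disk, server])])  # disk is now spinning
```

The same two devices swap places in the ranking as soon as the disk changes state, which is what "dynamic" refers to here.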

Page 7: Review Session for Fourth Quiz

Blue File System

How does the Blue file system operate its device write queues?

Page 8: Review Session for Fourth Quiz

Blue File System

It empties them when it flushes them to disk.

Much more could be said.

Page 9: Review Session for Fourth Quiz

Blue File System

Explain how the Blue file system saves energy by aggregating writes to local disks.

Page 10: Review Session for Fourth Quiz

Blue File System

Aggregating writes to local disks saves energy by amortizing disk power state transitions across multiple writes.
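The amortization argument can be checked with a little arithmetic. The sketch below assumes a fixed energy cost per spin-up/spin-down cycle and a fixed cost per write; both figures are made-up placeholders, not measurements from the paper.

```python
def energy_mj(num_writes, batch_size, transition_mj=5000, write_mj=100):
    """Energy (in millijoules) to perform num_writes when the disk goes
    through one power-state transition per batch of batch_size writes.
    transition_mj and write_mj are hypothetical values."""
    batches = -(-num_writes // batch_size)  # ceiling division
    return batches * transition_mj + num_writes * write_mj


one_by_one = energy_mj(100, batch_size=1)    # one transition per write
aggregated = energy_mj(100, batch_size=100)  # one transition for all writes
print(one_by_one, aggregated)
```

With these assumed costs, the transition energy dominates when writes are issued one at a time and becomes a small fraction of the total once the writes are batched.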

Page 11: Review Session for Fourth Quiz

Blue File System

True or false: Most of the Blue FS functionality is handled by a user-level server.

Page 12: Review Session for Fourth Quiz

Blue File System

True

Page 13: Review Session for Fourth Quiz

Pergamum

Page 14: Review Session for Fourth Quiz

Pergamum

What equipment failures can be corrected by intratome redundancy?

Page 15: Review Session for Fourth Quiz

Pergamum

Irrecoverable read errors

Page 16: Review Session for Fourth Quiz

Pergamum

What would be the main drawback of a Pergamum system having plenty of intratome redundancy but no intertome redundancy?

Page 17: Review Session for Fourth Quiz

Pergamum

It would not tolerate full disk failures.

Page 18: Review Session for Fourth Quiz

Pergamum

How do intradisk parity blocks contribute to reducing the power consumption of the system?

Page 19: Review Session for Fourth Quiz

Pergamum

They allow the local recovery of bad blocks without having to power up other tomes.
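As an illustration of local recovery, the sketch below uses plain XOR parity over a small group of blocks stored on the same disk. XOR parity is an assumption made to keep the example short; the slides do not say which code Pergamum uses for its intradisk parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks and their parity block, all on the same (hypothetical) tome.
data = [b"tome", b"data", b"blok"]
parity = xor_blocks(data)

# Suppose data[1] triggers an irrecoverable read error: the remaining blocks
# of the group and the parity block, all local, are enough to rebuild it.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

No other tome is involved in the repair, so no other disk has to be spun up.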

Page 20: Review Session for Fourth Quiz

Pergamum

What are the two main functions of Pergamum digital signatures?

Where are they stored? Why?

Page 21: Review Session for Fourth Quiz

Pergamum

Their two main functions are

To verify the integrity of the tome’s contents

By exchanging them with other Pergamum tomes, to verify the integrity of distributed data.

Page 22: Review Session for Fourth Quiz

Pergamum

Where are they stored? Why?

Page 23: Review Session for Fourth Quiz

Pergamum

They are stored in a small flash drive so they can be consulted without powering the tome’s hard drive.

Page 24: Review Session for Fourth Quiz

Pergamum

What is disk scrubbing?

Page 25: Review Session for Fourth Quiz

Pergamum

Disk scrubbing periodically verifies that a given range of disk blocks can be retrieved and reconstitutes the contents of the blocks that it could not access due to an irrecoverable read error.
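A scrub pass over a block range might look like the sketch below. It assumes per-block CRC32 checksums, a single XOR parity block for the group, and at most one bad block per group; an unreadable block is modeled as `None`. All of these are simplifications for illustration, not Pergamum's actual mechanism.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def scrub(blocks, checksums, parity, start, end):
    """Verify blocks[start:end] and rebuild any block that cannot be read
    or fails its checksum. Returns the indices that were repaired.
    Assumes at most one bad block in the parity group."""
    repaired = []
    for i in range(start, end):
        block = blocks[i]
        if block is None or zlib.crc32(block) != checksums[i]:
            others = [b for j, b in enumerate(blocks) if j != i]
            blocks[i] = xor_blocks(others + [parity])
            repaired.append(i)
    return repaired
```

A caller would run `scrub` periodically over successive ranges so that latent errors are found before a second failure makes them unrecoverable.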

Page 26: Review Session for Fourth Quiz

Pergamum

Which feature of Pergamum reduces the need for frequent full-disk scrubs?

Page 27: Review Session for Fourth Quiz

Pergamum

Pergamum intratome parity reduces the need for frequent disk scrubs as it provides an additional way to reconstitute the contents of the blocks that caused irrecoverable read errors.

Page 28: Review Session for Fourth Quiz

Pergamum

How does Pergamum reconstitute data contained on a tome that failed?

Page 29: Review Session for Fourth Quiz

Pergamum

1. Pergamum replaces the failed tome by a new tome

2. One after the other, each tome in the same parity stripe as the failed tome sends its contents to the new tome

Page 30: Review Session for Fourth Quiz

Pergamum

Why?

Page 31: Review Session for Fourth Quiz

Pergamum

To avoid powering up too many tomes at the same time

Page 32: Review Session for Fourth Quiz

Pergamum

How do the system’s workload and intended use(s) affect the tradeoffs to consider when deciding the right amount of intra-disk and inter-disk redundancy in a storage system?

Page 33: Review Session for Fourth Quiz

Pergamum

Intra-disk redundancy saves energy in archival file systems because it allows local reconstruction of blocks affected by irrecoverable read errors.

We might prefer using more inter-disk redundancy in conventional file systems as inter-disk redundancy protects data against both irrecoverable read errors and disk failures.

Page 34: Review Session for Fourth Quiz

FARSITE

Page 35: Review Session for Fourth Quiz

FARSITE

How does FARSITE store users’ secret keys?

Why?

Page 36: Review Session for Fourth Quiz

FARSITE

FARSITE encrypts the secret keys of its users with a symmetric key derived from the user's password and stores them in a globally-readable directory.

It does so because these keys are typically too long to be memorized by the user.
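The idea of protecting a long secret key with something the user can remember is sketched below. PBKDF2 as the key-derivation function and a bare XOR as the "cipher" are assumptions made to keep the example within the standard library; this is a toy, not FARSITE's actual scheme, and it should not be used to protect real keys.

```python
import hashlib
import os


def wrap_key(secret_key: bytes, password: str, salt: bytes) -> bytes:
    """Toy wrapping: XOR the secret key with bytes derived from the password.
    PBKDF2 and the iteration count are illustrative assumptions."""
    stream = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), salt, 10_000, dklen=len(secret_key)
    )
    return bytes(a ^ b for a, b in zip(secret_key, stream))


# XOR is its own inverse, so unwrapping is the same operation.
unwrap_key = wrap_key

salt = os.urandom(16)
secret = os.urandom(32)                 # stands in for the user's long secret key
stored = wrap_key(secret, "correct horse", salt)   # safe to publish
print(unwrap_key(stored, "correct horse", salt) == secret)
```

The wrapped key can sit in a directory anyone can read, because only someone who knows the password can derive the bytes needed to undo the wrapping.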

Page 37: Review Session for Fourth Quiz

FARSITE

What characterizes a Byzantine failure?

Page 38: Review Session for Fourth Quiz

FARSITE

1. The failed node keeps communicating with the other nodes

2. We have no easy way to detect such a failed node

Page 39: Review Session for Fourth Quiz

FARSITE

How does Farsite guarantee the availability and the integrity of its directory data?

Page 40: Review Session for Fourth Quiz

FARSITE

Farsite replicates directories and manages them through a Byzantine fault-tolerant protocol that ensures their integrity (as long as fewer than one third of the machines misbehave in any manner).

Page 41: Review Session for Fourth Quiz

FARSITE

In addition to using a Byzantine agreement protocol in its directory host, which steps does Farsite take to protect user files against malicious behaviors by its file hosts?

Page 42: Review Session for Fourth Quiz

FARSITE

1. File blocks are encrypted so that file hosts cannot access their contents.

2. File blocks are also replicated on different hosts so that a single file host cannot maliciously destroy a file.

3. Farsite ensures that all copies of a given file block will be spread over machines controlled by different owners.

Page 43: Review Session for Fourth Quiz

FARSITE

You are to design a FARSITE file system that can tolerate two Byzantine failures.

What is the minimum number of members in each directory host?

Page 44: Review Session for Fourth Quiz

FARSITE

Each directory host should have at least seven members

Page 45: Review Session for Fourth Quiz

FARSITE

What is the minimum number of copies each data block should have?

Page 46: Review Session for Fourth Quiz

FARSITE

Each data block should have at least three copies

Page 47: Review Session for Fourth Quiz

FARSITE

What is a Sybil attack? How does Farsite protect itself against such attacks?

Page 48: Review Session for Fourth Quiz

FARSITE

A Sybil attack is an attack where one or more rogue nodes assume multiple identities.

To prevent that, Farsite requires each node entering the system to have a verifiable unique ID issued by a trusted authority

Page 49: Review Session for Fourth Quiz

FARSITE

Which actions does FARSITE take when the owner of a file grants or revokes access to a given file?

Page 50: Review Session for Fourth Quiz

FARSITE

When the owner of a file grants access to the file to another user, FARSITE encrypts a copy of the file key with the public key of the new user. When that access is revoked, FARSITE deletes that copy.

Page 51: Review Session for Fourth Quiz

FARSITE

How is the effect of a revoke different from that of the same revoke on a conventional UNIX system?

Page 52: Review Session for Fourth Quiz

FARSITE

The user who has lost the right to access the file could still be able to read it if he/she has kept a copy of the file key on his/her own workstation.

Page 53: Review Session for Fourth Quiz

FARSITE

What could FARSITE do to implement the semantics of a UNIX access right revocation?

Page 54: Review Session for Fourth Quiz

FARSITE

It would require encrypting the file with a new key.

Page 55: Review Session for Fourth Quiz

FARSITE

What does FARSITE do to improve its less than stellar response time?

Hint: Answer has two parts

Page 56: Review Session for Fourth Quiz

FARSITE

Files are cached for up to one week on the client machines

Farsite uses background ("lazy") propagation of directory updates

Page 57: Review Session for Fourth Quiz

Farsite

What is a lease?

Page 58: Review Session for Fourth Quiz

Farsite

A lease is a time-limited contract between the file server and a client guaranteeing that the server will not accept any update for a given file or set of files during the duration of the lease without first notifying the client.

Typical lease durations are fairly short.
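A minimal sketch of the bookkeeping a server needs for this contract follows. The class names, the 30-second default and the use of explicit `now` timestamps are all assumptions for illustration; they are not taken from Farsite.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    client: str
    expires_at: float   # absolute time, in seconds


class FileServer:
    def __init__(self):
        self.leases = []

    def grant(self, client, now, duration_s=30.0):
        """Promise `client` a notification before any update until the lease expires."""
        lease = Lease(client, now + duration_s)
        self.leases.append(lease)
        return lease

    def clients_to_notify(self, now):
        """Clients that must be told before an update is accepted at time `now`."""
        return [l.client for l in self.leases if now < l.expires_at]


server = FileServer()
server.grant("alice", now=0.0, duration_s=30.0)
server.grant("bob", now=0.0, duration_s=5.0)
print(server.clients_to_notify(now=10.0))   # bob's lease has already expired
```

Short lease durations keep the list of clients to notify small and bound how long a crashed client can delay an update.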

Page 59: Review Session for Fourth Quiz

Zyzzyva

Page 60: Review Session for Fourth Quiz

Zyzzyva

Why may a Zyzzyva replica sometimes store two checkpoints?

Page 61: Review Session for Fourth Quiz

Zyzzyva

Zyzzyva replicas have two checkpoints whenever their latest checkpoint contains non-committed history. (That checkpoint is then called a tentative checkpoint.) As a result, the replica must keep its previous checkpoint until the new checkpoint becomes a committed checkpoint.

Page 62: Review Session for Fourth Quiz

Zyzzyva

When does a Zyzzyva tentative checkpoint become a committed checkpoint?

Page 63: Review Session for Fourth Quiz

Zyzzyva

A checkpoint becomes a committed checkpoint as soon as all the history it contains has become committed history

Page 64: Review Session for Fourth Quiz

Zyzzyva

What are the four exchanges of messages that occur during the gracious execution of the Zyzzyva Byzantine fault-tolerant protocol?

Page 65: Review Session for Fourth Quiz

Zyzzyva

Client sends a message to the primary replica.

Primary replica sends a message to all secondary replicas.

Secondary replicas send a message to the client.

Client sends a message to all replicas (not included in the paper's figures).

Page 66: Review Session for Fourth Quiz

FAWN

Page 67: Review Session for Fourth Quiz

FAWN

How is the FAWN datastore organized?

Page 68: Review Session for Fourth Quiz

FAWN

As a log operating in append mode

Page 69: Review Session for Fourth Quiz

FAWN

Why?

Page 70: Review Session for Fourth Quiz

FAWN

Because flash memory performs sequential writes much faster than random writes
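The sketch below shows an append-only log paired with an in-memory index, which is the general shape of such a datastore. The class and its methods are hypothetical; a `bytearray` stands in for the flash device and a Python dict stands in for FAWN's compact hash table.

```python
class LogStore:
    """Append-only key-value log with an in-memory index (illustrative only)."""

    def __init__(self):
        self.log = bytearray()   # stands in for the flash device
        self.index = {}          # key -> (offset, length) of the latest value

    def put(self, key, value: bytes):
        offset = len(self.log)
        self.log += value        # sequential append, nothing is overwritten in place
        self.index[key] = (offset, len(value))

    def get(self, key) -> bytes:
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])


store = LogStore()
store.put("k", b"old value")
store.put("k", b"new value")     # the update is appended, the old bytes remain
print(store.get("k"))
```

Every write lands at the tail of the log, so the device only ever sees sequential writes; stale values stay in the log until some cleaning pass (not shown) reclaims them.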

Page 71: Review Session for Fourth Quiz

FAWN

What is the purpose of allocating several randomly selected virtual nodes to each FAWN node?

Page 72: Review Session for Fourth Quiz

FAWN

To spread the workload of a failed physical node among several physical nodes

Page 73: Review Session for Fourth Quiz

FAWN

Why do Pergamum and FAWN select very different CPUs for their nodes?

Page 74: Review Session for Fourth Quiz

FAWN

The CPU of a Pergamum tome controls a hard drive that is likely to be powered down 90 to 95 percent of the time

Power savings are paramount

The CPU of a FAWN node controls a faster flash drive that is very frequently accessed

Emphasis is on the best performance-to-wattage ratio

Page 75: Review Session for Fourth Quiz

FAWN

Consider a variant of FAWN tailored to a workload with infrequent requests to a very large data set

How would that affect your choice of a storage device?

Page 76: Review Session for Fourth Quiz

FAWN

We should store the FAWN datastore on a disk drive as the capacity of the storage device becomes more important than its access times

Page 77: Review Session for Fourth Quiz

FAWN

How would your choice affect the organization of the in-memory hash table and the size of the main memory?

Page 78: Review Session for Fourth Quiz

FAWN

We would need a bigger main memory:

Many more hash table entries

Each hash table entry should contain a much larger key fragment to minimize false positives

Disk reads are much more expensive than flash memory reads
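A back-of-the-envelope model makes the tradeoff concrete. It assumes each hash table entry holds a key fragment, a 32-bit pointer and one valid bit, and that a k-bit fragment wrongly matches a different key with probability of about 2^-k per compared entry; the entry layout and object counts are assumptions for illustration.

```python
def entry_bytes(fragment_bits, pointer_bits=32, valid_bits=1):
    """Size of one hash table entry, rounded up to whole bytes (assumed layout)."""
    return (fragment_bits + pointer_bits + valid_bits + 7) // 8


def index_bytes(num_objects, fragment_bits):
    """Main memory needed for an index with one entry per stored object."""
    return num_objects * entry_bytes(fragment_bits)


def false_positive_rate(fragment_bits):
    """Chance that a fragment matches a different key, per compared entry."""
    return 2.0 ** -fragment_bits


for bits in (15, 31):
    print(bits, entry_bytes(bits), false_positive_rate(bits))
```

Each false positive costs one wasted read of the storage device, so the slower the device, the more it pays to spend extra memory on longer fragments; a larger data set then multiplies that per-entry cost by many more entries.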