Team CMD Distributed Systems Team Report 2 1/17/07

C:\>members

Corey Andalora

Mike Adams

Darren Stanley

Paper 1

1. “Achieving Strong Consistency in a Distributed File System”
– Peter Triantafillou and Carl Neilson
– IEEE Transactions on Software Engineering, Vol. 23, No. 1 (pp. 35-55)
– January 1997
– http://doi.ieeecomputersociety.org/10.1109/32.581328

File Consistency

Paper 1 General Ideas

• Motivation
– Increase the availability of files
– Increase performance in a DFS

• Handle Failures
– Server disk failure
– Communication lost
– Client failure

Important Terms

• File Session
– Sequence of system calls from open to close

• Write-Sharing
– Concurrent file access in conflicting modes

• Availability
– The ability of a client to access a file

• Serializability
– A series of concurrently executing actions on a file is equivalent to a serial execution

• File Caching
– File data is stored in memory for quick retrieval, and updates are propagated to servers

Consistency Protocols

• Echo – Primary-site protocol

• Harp – Primary-copy server protocol

• Deceit – Decentralized protocol

• Coda – Local copy editing

Deceit

• Uses write-tokens to control file replication.

• A server must acquire the write-token before its clients can write to a file.

• Once the token is acquired, all other servers hold unstable versions.

• The token holder updates its file, tells all unstable versions to update, and waits for a response back.

• If all servers with the file respond, the token holder broadcasts a stability notice to mark the files stable.

• Reads are always allowed on stable copies. If a copy is not stable, the request is forwarded to the token holder.
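The token steps above can be sketched in a few lines of Python. This is a minimal in-process sketch, not Deceit's actual implementation: the class names `Replica` and `TokenFile` are hypothetical, every replica is assumed to acknowledge an update immediately, and token release and failure handling are left out.

```python
class Replica:
    """One server's copy of a file (hypothetical name)."""
    def __init__(self, name):
        self.name = name
        self.data = ""
        self.stable = True


class TokenFile:
    """A replicated file guarded by a single write-token (hypothetical name)."""
    def __init__(self, replicas):
        self.replicas = replicas
        self.holder = None  # replica currently holding the write-token

    def acquire(self, server):
        if self.holder is not None:
            raise RuntimeError("write-token is already held")
        self.holder = server
        # Once the token is acquired, every other copy is unstable.
        for r in self.replicas:
            if r is not server:
                r.stable = False

    def write(self, server, data):
        if self.holder is not server:
            raise PermissionError("a write requires the write-token")
        server.data = data
        acks = 0
        for r in self.replicas:
            if r is not server:
                r.data = data  # update request; assumed to succeed
                acks += 1
        # Stability notice only if every other replica responded.
        if acks == len(self.replicas) - 1:
            for r in self.replicas:
                r.stable = True

    def read(self, server):
        if server.stable:
            return server.data
        return self.holder.data  # forward the read to the token holder
```

A read against an unstable replica between `acquire` and `write` returns the token holder's data, which matches the forwarding rule in the last bullet.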

Protocol In Paper

• Three main entities
– Servers (provide services)
– Clients (use services)
– Agents (process client requests)

• Sessions
– Contacts (file server)
– Agreements (between contact and agent)

• State Information
– Which servers have a current copy of a file
– What agreements exist for a file
– The status of in-progress updates

Open Operation (Fig. 4)

1. Agent, A1, chooses server, S1, that is to be the contact.

2. S1 notifies a majority of the servers of the open request.

3. All notified servers return their file state to S1.

4. S1 determines the current state of the file.

5. S1 notifies any contacts with conflicting agreements that were not notified in step 2.

6. S1 acquires a current copy of the file, if its copy is stale.

7. S1 returns the results of the open() to the agent and, if the open() failed, tells the other servers of the failure.
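Steps 2, 3, 4, and 6 of the open operation can be sketched as follows. This is only an illustration under stated assumptions: the "file state" is reduced to an integer version counter, servers are plain in-process objects, and the names `Server` and `open_file` are hypothetical. Step 5 (conflicting agreements) and the failure notification in step 7 are omitted.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    version: int = 0  # assumed stand-in for the paper's "file state"
    data: str = ""


def open_file(contact, servers):
    """Open a file through `contact`, consulting a majority of `servers`."""
    # Step 2: notify a majority of the servers (the contact counts as one).
    majority = len(servers) // 2 + 1
    others = [s for s in servers if s is not contact]
    notified = [contact] + others[:majority - 1]

    # Steps 3-4: collect the returned states; the highest version is current.
    newest = max(notified, key=lambda s: s.version)

    # Step 6: the contact fetches a current copy if its own copy is stale.
    if contact.version < newest.version:
        contact.version, contact.data = newest.version, newest.data

    # Step 7 (success path only): return the result to the agent.
    return contact.data, contact.version
```

Which servers end up in the notified majority is arbitrary here (the first ones in the list); a real system would choose based on reachability.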

Close Operation (Fig. 6)

1. The client requests a close, and the agent sends any cached updates back to the contact.

2. The contact accepts the updates, and returns the call to the agent.

3. The contact propagates any updates to some, or all, of the other servers.

What we can use

• Use tokens to lock files
– Must obtain token to write to file
– Propagate updates when token is released
– Mark files as unstable while token is in use

Paper 2

2. V. Henson and R. Henderson. Guidelines for Using Compare-by-Hash. Forthcoming, 2005. http://infohost.nmt.edu/~val/review/hash2.html

File Comparison

Paper 2 General Ideas

• Comparing file content by hash should be considered carefully.

• Hash collisions can occur, although they are unlikely.

• Collisions are made more likely by non-random files.

• Trade-off: computationally intensive hash computation vs. network-heavy direct file comparison.
– If hash functions are already being used for security, no performance is lost.

• Cryptographic hash functions are short-lived.

General ideas continued

• If a collision occurs, we are unaware of it

• Alternatives
– Compare whole file
– Maintain state information about files

Guidelines

1. Compare-by-hash should provide some benefit: saving time, bandwidth, etc.

2. The system should remain usable even if hash collisions can be generated at will.

3. It should be possible to regenerate the hashes with a different algorithm at any time.

Our Application

1. Saves network usage. Provides faster file comparison, especially on slow or congested networks.

2. False information may be received during read if collisions can be created at will.

3. The hash algorithm is abstracted and can be easily replaced with a newer, more robust algorithm.

What we can use

• Use compare-by-hash and accept possible errors.
– Represent all files with the same hash as a single file.
– When accessing a cached file, check whether it is up to date by comparing hashes.
– Check hash values for security.
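The cache check described above could look roughly like this. It is a sketch under assumptions: the function names `file_digest` and `cache_is_current` are hypothetical, SHA-256 is just an example default, and the server is assumed to send its digest as a hex string. Passing the algorithm by name keeps it replaceable, in line with guideline 3.

```python
import hashlib


def file_digest(data: bytes, algorithm: str = "sha256") -> str:
    """Hash file contents with a named algorithm so it can be swapped later."""
    return hashlib.new(algorithm, data).hexdigest()


def cache_is_current(cached: bytes, server_digest: str,
                     algorithm: str = "sha256") -> bool:
    """Compare-by-hash: the cached copy counts as current if digests match.

    A match does not prove the contents are identical; a collision
    would go unnoticed, which is the risk accepted above.
    """
    return file_digest(cached, algorithm) == server_digest
```

Only the digest crosses the network, so the check costs one hash computation on each side instead of a full file transfer.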

Paper 3

3. “Strong Security for Distributed File Systems”
– Ethan Miller, Darrell Long, William Freeman, and Benjamin Reed
– Performance, Computing, and Communications (pp. 34-40)
– April 2001

Security

Paper 3 General Ideas

• Use cryptography at multiple levels
– To protect data

• Hash data on both client and server
– To be sure you're reading and writing what you wanted to

• Encrypt files + file hashes together
– Ensures nobody can change the data without you knowing

• Transparency
– The user should have no idea the system is in place

Filesystem objects

– Certificate Objects
• Contains information about each user, as well as the public key and HMAC key

– File Objects
• Contains file meta-data information

– Key Objects
• Pairs a user with an encrypted key and permissions for a file or groups of files.

– Data Objects
• Contains the actual data of the file.
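As a rough picture of how the four object types relate, they could be modeled as plain records. The field names and types below are assumptions made for illustration; the paper's actual on-disk layout is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class CertificateObject:
    user: str          # information about the user (simplified to a name)
    public_key: bytes
    hmac_key: bytes


@dataclass
class FileObject:
    metadata: dict = field(default_factory=dict)  # file meta-data


@dataclass
class KeyObject:
    user: str
    encrypted_key: bytes  # key encrypted for this user
    permissions: str      # hypothetical encoding, e.g. "rw"
    file_ids: list = field(default_factory=list)  # file or group of files


@dataclass
class DataObject:
    data: bytes  # the actual file contents
```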

Operations

Operation       Read            Write
                host   server   host   server

En/Decrypt      yes    no       yes    no

Hash            yes    no       yes    no

Signature       no     no       yes    no

Verification    yes    no       no     yes

What we can use

• We don't need something as fancy as the paper provides, but we can use some of its key points

– Hashing on the client side to verify data

– Encryption of communication between client/server

– Modified file system objects that keep storage overhead low
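Client-side hashing to verify data could be done with a keyed hash (HMAC), as in the sketch below. The function names `make_tag` and `verify_data` are hypothetical, HMAC-SHA256 is an assumed choice, and key distribution and encryption of the channel are out of scope.

```python
import hashlib
import hmac


def make_tag(key: bytes, data: bytes) -> bytes:
    """Compute a keyed hash of the data on the client before writing."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_data(key: bytes, data: bytes, expected_tag: bytes) -> bool:
    """Recompute the tag after reading and compare in constant time."""
    return hmac.compare_digest(make_tag(key, data), expected_tag)
```

The tag would be stored alongside the data on write; a failed check on read means the data or tag changed, or the wrong key was used.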

Tying It All Together

[Diagram: clients 0 through 7 and files F1 through F4, annotated with the labels Security, Hashing, and Consistency. File-to-client listing shown in the diagram:]

F1: {Client0 (token), Client3}

F2: {Client5, Client6, Client7}

F3: {Client1, Client5}

F4: {Client4 (token)}