Distributed OS - TU Dresden fileSS 2011 Distributed OS / File System Scalability - Hermann Härtig 2...

16
04/05/11 Distributed OS File System Scalability (NFS → AFS →) G(oogle-)FS Hermann Härtig

Transcript of Distributed OS - TU Dresden fileSS 2011 Distributed OS / File System Scalability - Hermann Härtig 2...

04/05/11

Distributed OS

File System Scalability

(NFS → AFS →) G(oogle-)FS

Hermann Härtig

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 2

Distributed FS

DFS

ClientApplication

LocalFS

DFS

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 3

Distributed FS

DFS

ClientApplication

DFS

ClientApplication

LocalFS

DFS

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 4

Distributed FS

LocalFS

DFSDFS

ClientApplication

DFS

ClientApplication

LocalFS

DFS

LocalFS

DFS

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 5

Network File System (NFS, SUN v2)

Design Principles

● Interface: as close as possible to Unix

● names/”open”: message to file server for each part of pathname

● Read/Write: use block cached in client

● Cache: blocks

● Replication: none (only very rudimentary)

● Consistency: exchange consistency messages every few s

● Faults: write through on server (v2),

hold up client when server crashes

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 6

Andrew File System

Design Principles

● Interface: as close as possible to Unix

● names/”open”: name resolution on client using caches directories

● Read/Write: use file cached in client

● Cache: whole files, use server replicates to fetch files to cache

● Replication: some (performance)

● Consistency: session, explicit invalidation of cache through “call

backs”

● Faults: ?

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 7

Scalability limitations

● Very general usage model○ Posix consistency model

‒ cache validation (distributed caches)

‒ write semantics (need to load block or file before writing)○ Too many small messages

● One-to-one mapping of files

● Distribution-related metadata and user-data on same server

● No inherent fault tolerance

● No or simple replication

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 8

Google File System

Design Principles

● Interface: specialized for google applications

● names/”open”: name resolution on “Master server”

● Read/Write: direct communication with “chunk servers”

● Cache: no caching of user data

● Replication: “chunks” are replicated

● Consistency: relaxed consistency (replicas need not be identical)

● Faults: inherent in design

Details: see slides and paper (presentation at SOSP 2003)

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 9

Distributed FS

LocalFS

DFS

ClientApplication

DFS

ClientApplication

LocalFS“Chunks”Userdata

LocalFS

DFS

“Master”Metadata

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 10

Architecture

Figure from Ref 2(see last slide)

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 11

Write

Figure from Ref 2(see last slide)

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 12

Write 2

● No concurrency control → inconsistencies

● But Atomic “Append” with at least once semantics

● Replicas not identical (padding, multiple/aborted writes, …)

→ applications must recognize “records”

→ checksum per chunk-replicate

● Applications' data structures must be designed to tolerate this

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 13

Replicas

● Replicas are not always identical

● Replicas may become stale → version numbers

● Master stores current version in operation log

● Chunk servers register with master on start up

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 14

Fault Tolerance

● Little on fault handling in SOSP slides

→ read the SOSP paper and prepare to work on it in the

exercises!!!

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 15

More Cluster File Systems

● Lustre for HPC

● G(lobal-)FS

● PVFS

● Tidy FS (to appear June 2011, Usenix ATC)

● Hadoop FS

SS 2011 Distributed OS / File System Scalability - Hermann Härtig 16

References

1) NFS/AFS: many text books

2) Google FS:

The Google File System

Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung

Symposium on OS Principles 2003