Distributed File System By Manshu Zhang. Outline Basic Concepts Current project Hadoop Distributed...


Page 1: Distributed File System By Manshu Zhang. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.

Distributed File System

By Manshu Zhang


Outline

Basic Concepts

Current project: Hadoop Distributed File System

Future work

Reference


DFS

A distributed implementation of the classical time-sharing model of a file system, in which multiple users share files and storage resources.


Key Characteristics of DFS

Dispersion: clients and files are spread across different machines.

Multiplicity: there are many clients and many files.


Primary issues of DFS

Naming and Transparency

Fault Tolerance


Naming

Naming – mapping between logical and physical objects.

Multilevel mapping. Both the existence of replicas and their locations are transparent to the user.


Naming Schemes: Three Main Approaches

Host name + local name: combining the two guarantees a unique system-wide name.

Mounting remote directories onto local directories: once mounted, files can be referenced in a location-transparent manner.

Total integration of the component file systems: a single global name structure spans all files. If a server is unavailable, some arbitrary set of directories on different machines also becomes unavailable.
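The first two naming approaches can be illustrated with a minimal sketch. Everything here is hypothetical and only for illustration: the mount table contents, the host names `serverA` and `serverB`, and the function names `resolve` and `host_qualified` do not come from any particular DFS.

```python
from typing import Dict, Tuple

# Hypothetical mount table: local mount point -> (host, remote directory).
MOUNT_TABLE: Dict[str, Tuple[str, str]] = {
    "/home": ("serverA", "/export/home"),
    "/home/shared/data": ("serverB", "/vol/data"),
}


def host_qualified(host: str, local_name: str) -> str:
    """Approach 1: host name + local name yields a system-wide unique name."""
    return f"{host}:{local_name}"


def resolve(path: str) -> Tuple[str, str]:
    """Approach 2: map a location-transparent path to (host, path on host)."""
    # Try the longest mount point first so nested mounts take precedence.
    for mount in sorted(MOUNT_TABLE, key=len, reverse=True):
        if path == mount or path.startswith(mount + "/"):
            host, remote_dir = MOUNT_TABLE[mount]
            return host, remote_dir + path[len(mount):]
    # No mount point matched: the file lives on the local machine.
    return "localhost", path


if __name__ == "__main__":
    print(host_qualified("serverA", "/export/home/alice/notes.txt"))
    print(resolve("/home/alice/notes.txt"))
    print(resolve("/home/shared/data/run1.csv"))
```

With the mount approach the client only ever writes `/home/alice/notes.txt`; the host appears in the name only under the first approach.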


Transparency(1)

Login Transparency: A user can log in at any host with a uniform login procedure and perceive a uniform view of the file system.

Access Transparency: A client process on a host has a uniform mechanism to access all files in the system, regardless of whether the files are on the local host or a remote host.

Location Transparency: The names of the files do not reveal their physical location.


Transparency(2)

Concurrency Transparency: An update to a file should not affect the correct execution of other processes that are concurrently sharing the file.

Replication Transparency: Files may be replicated to provide redundancy for availability and to permit concurrent access for efficiency, without clients needing to be aware of the replicas.


Fault Tolerance

Stateful vs. Stateless: whether the server maintains information about its clients

File Replication


Distinctions Between Stateful & Stateless Service

Failure Recovery

A stateful server loses all its volatile state in a crash.

With a stateless server, the effects of server failure and recovery are almost unnoticeable.
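The failure-recovery difference can be sketched as follows. This is a minimal illustration under stated assumptions, not any real file server protocol: the class names, the in-memory `FILES` dictionary standing in for disk storage, and the way a crash is simulated (by creating a fresh server object) are all hypothetical.

```python
from typing import Dict, List

# Stand-in for durable storage that survives a server crash.
FILES: Dict[str, bytes] = {"/docs/a.txt": b"hello distributed world"}


class StatefulServer:
    """Keeps a per-client open-file table in volatile memory."""

    def __init__(self) -> None:
        self._open: Dict[int, List] = {}  # handle -> [path, current offset]
        self._next_handle = 1

    def open(self, path: str) -> int:
        handle = self._next_handle
        self._next_handle += 1
        self._open[handle] = [path, 0]
        return handle

    def read(self, handle: int, count: int) -> bytes:
        entry = self._open[handle]  # raises KeyError if the state was lost
        path, offset = entry
        data = FILES[path][offset:offset + count]
        entry[1] = offset + len(data)  # the server remembers the position
        return data


class StatelessServer:
    """Every request carries everything needed to serve it."""

    def read(self, path: str, offset: int, count: int) -> bytes:
        return FILES[path][offset:offset + count]


if __name__ == "__main__":
    server = StatefulServer()
    handle = server.open("/docs/a.txt")
    print(server.read(handle, 5))

    server = StatefulServer()  # simulated crash and restart
    try:
        server.read(handle, 5)
    except KeyError:
        print("stateful server forgot the open file after restart")

    # The stateless client keeps its own offset, so a restart is invisible.
    print(StatelessServer().read("/docs/a.txt", 5, 12))
```

The cost of the stateless design is visible in the signature: each request is longer because the client must resend the path and offset every time.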


File Replication

Several copies of a file's contents at different locations enable multiple servers to share the load of providing the service.

The naming scheme maps a replicated file name to a particular replica.

Updates
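One way to picture the mapping from a replicated file name to a particular replica is a catalog that rotates through the replica hosts, so that several servers share the read load. This is only a sketch: `ReplicaCatalog`, its method names, and the round-robin policy are assumptions made for illustration, and the slides do not say how updates are propagated, so the sketch only exposes the full replica set that an update would have to reach.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class ReplicaCatalog:
    """Maps a replicated file name to one of its replica hosts."""

    def __init__(self, replicas: Dict[str, List[str]]) -> None:
        self._replicas = replicas
        # One round-robin iterator per file name spreads reads over hosts.
        self._round_robin: Dict[str, Iterator[str]] = {
            name: cycle(hosts) for name, hosts in replicas.items()
        }

    def locate(self, name: str) -> str:
        """Pick the replica that serves the next read of this file."""
        return next(self._round_robin[name])

    def all_replicas(self, name: str) -> List[str]:
        """Every host holding a copy; an update must eventually reach all."""
        return list(self._replicas[name])


if __name__ == "__main__":
    catalog = ReplicaCatalog({"/data/log": ["s1", "s2", "s3"]})
    print([catalog.locate("/data/log") for _ in range(4)])
    print(catalog.all_replicas("/data/log"))
```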


Current Project

HDFS: Hadoop Distributed File System

A distributed, parallel, fault-tolerant file system. It is designed to reliably store very large files across machines in a large cluster.

Efficient, reliable, and open source


Naming: central metadata server

Synchronization: write-once-read-many; locks on objects are given to clients using leases

Consistency and replication: server-side replication, asynchronous replication, checksums

Fault tolerance: failure is treated as the norm

Security: no dedicated security mechanism
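The role of checksums alongside replication can be sketched as follows: a checksum is stored with each block, the reader recomputes it, and a corrupted copy is skipped in favor of another replica. This is not the HDFS API. The function names `store_block` and `read_verified` are hypothetical, and plain CRC32 from Python's `zlib` is used here only as a convenient stand-in for whatever checksum a real deployment uses.

```python
import zlib
from typing import List, Tuple

Replica = Tuple[bytes, int]  # (block contents, checksum recorded at write time)


def store_block(data: bytes) -> Replica:
    """Record a checksum next to the block when it is written."""
    return data, zlib.crc32(data)


def read_verified(replicas: List[Replica]) -> bytes:
    """Return the first replica whose contents still match its checksum."""
    for data, checksum in replicas:
        if zlib.crc32(data) == checksum:
            return data
        # Mismatch: this copy is corrupt, so fall through to the next one.
    raise IOError("all replicas failed checksum verification")


if __name__ == "__main__":
    good = store_block(b"block-0 contents")
    corrupt = (b"block-0 c0ntents", good[1])  # bytes changed after writing
    print(read_verified([corrupt, good]))
```

Treating failure as the norm shows up in the control flow: a bad replica is an expected case handled by the loop, and only the loss of every copy is an error.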


Future Work

Robustness of the data sharing model

The areas covered in the preceding section: architecture, naming, synchronization, availability, heterogeneity, and support for databases

Security


Reference

[1] Thanh, T.D.; Mohan, S.; Choi, E.; SangBum Kim; Pilsung Kim. 2008. "A Taxonomy and Survey on Distributed File Systems". Networked Computing and Advanced Information Management.

[2] Randy Chow. 1997. Distributed Operating Systems & Algorithms.

[3] Eliezer Levy, Abraham Silberschatz. December 1990. "Distributed file systems: concepts and examples". Computing Surveys (CSUR), Volume 22, Issue 4.

[4] http://hadoop.apache.org/common/docs/current/hdfs_design.html#Introduction

[5] http://www.snia.org/events/wintersymp2009/cloud/dhruba_hadoop_snia.pdf


[6] http://en.wikipedia.org/wiki/List_of_file_systems#Distributed_file_systems

[7] http://en.wikipedia.org/wiki/Hadoop#Hadoop_Distributed_File_System

[8] http://www.cs.gsu.edu/~cscyqz/courses/aos/slides08/ch6.1-Fall08.pptx


Q&A?


Thank you!