Simple introduction to HDFS

Jie Wu

Some Useful Features

– File permissions and authentication.

– Rack awareness: to take a node's physical location into account while scheduling tasks and allocating storage.

– Safemode: an administrative mode for maintenance.

– fsck: a utility to diagnose health of the file system, to find missing files or blocks.

– Rebalancer: tool to balance the cluster when the data is unevenly distributed among DataNodes.

– Upgrade and rollback: after a software upgrade, it is possible to rollback to HDFS' state before the upgrade in case of unexpected problems.

– Secondary NameNode: performs periodic checkpoints of the namespace and helps keep the size of file containing log of HDFS modifications within certain limits at the NameNode.
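
To make the rack awareness bullet concrete, here is a toy Python sketch of rack-aware replica placement. It loosely follows HDFS's default policy for three replicas (first replica on the writer's node, second on a node in another rack, third on a different node in that same remote rack). The `topology` dictionary, the `place_replicas` function, and the node names are hypothetical and exist only for this illustration; this is not the actual HDFS code.

```python
import random


def place_replicas(topology, writer, rng=random):
    """Pick up to three DataNodes for one block.

    `topology` maps a rack name to the list of node names in that rack
    (a made-up structure for this sketch, not an HDFS API).
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}

    # Replica 1: the writer's own node, so the first copy needs no network hop.
    first = writer
    local_rack = rack_of[first]

    # Replica 2: a node on a different rack, so losing one rack cannot lose the block.
    remote_racks = [r for r in topology if r != local_rack and topology[r]]
    remote_rack = rng.choice(remote_racks)
    second = rng.choice(topology[remote_rack])

    # Replica 3: another node on the same remote rack (a cheap intra-rack copy).
    others = [n for n in topology[remote_rack] if n != second]
    third = rng.choice(others) if others else None

    return [n for n in (first, second, third) if n is not None]


topology = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
replicas = place_replicas(topology, "dn1", random.Random(0))
print(replicas)
```

With two racks of two nodes each, the block always ends up on `dn1` plus both nodes of `rack2`, so it survives the loss of either rack.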

Goals of HDFS

• Hardware Failure: detection of faults and quick, automatic recovery

• Streaming Data Access: designed more for batch processing rather than interactive use by users

• Large Data Sets

• Simple Coherency Model: write-once-read-many access model, but there is a plan to support appending-writes to files in the future

• Moving Computation is Cheaper than Moving Data

• Portability
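
The write-once-read-many model in the list above can be sketched with a small in-memory class. This is only an illustration under simplifying assumptions: the `WriteOnceFile` class is hypothetical, and it only allows reads after the file is closed, which is stricter than a real file system needs to be.

```python
class WriteOnceFile:
    """Toy file that accepts writes until it is closed, then becomes read-only."""

    def __init__(self):
        self._chunks = []
        self._closed = False

    def write(self, data: bytes) -> None:
        # Once closed, the contents are immutable: no overwrite, no append.
        if self._closed:
            raise PermissionError("file is closed; contents cannot be modified")
        self._chunks.append(data)

    def close(self) -> None:
        self._closed = True

    def read(self) -> bytes:
        # Simplifying assumption of this sketch: readers only see closed files.
        if not self._closed:
            raise RuntimeError("file is still being written")
        return b"".join(self._chunks)


f = WriteOnceFile()
f.write(b"log line 1\n")
f.write(b"log line 2\n")
f.close()
data = f.read()

try:
    f.write(b"late edit\n")
    rejected = False
except PermissionError:
    rejected = True
```

Because a closed file never changes, any number of readers can stream it without coordinating with a writer, which is what makes the coherency model simple.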

Architecture

What It Can Support

• Create, read, write (once), remove, copy and rename a file; no in-place modification

• A very simple permission model like Linux; no user authentication (such as Kerberos) and no encryption of data transfers

• Directory quotas, no user quotas

• No hard links or soft links

• Recycle Bin
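
As a rough illustration of the Linux-like permission model mentioned above, the following Python sketch checks owner/group/other mode bits for a requested access. The function `can_access`, its parameters, and the user and group names are hypothetical. It assumes the caller's identity is simply trusted (matching the "no user authentication" bullet) and does not model the superuser, who bypasses permission checks.

```python
def can_access(mode, owner, group, user, user_groups, want):
    """Return True if `user` may perform every action in `want`.

    mode        -- octal permission bits, e.g. 0o640
    owner/group -- owner and group recorded on the file
    user_groups -- set of groups the requesting user belongs to
    want        -- string made of 'r', 'w', 'x'
    """
    # Exactly one class applies: owner first, then group, then other.
    if user == owner:
        shift = 6
    elif group in user_groups:
        shift = 3
    else:
        shift = 0
    granted = (mode >> shift) & 0o7

    # Translate the requested actions into the same 3-bit layout.
    needed = 0
    for action in want:
        needed |= {"r": 4, "w": 2, "x": 1}[action]

    return (granted & needed) == needed


# A file owned by alice:staff with mode rw-r-----
owner_rw = can_access(0o640, "alice", "staff", "alice", {"staff"}, "rw")
group_r = can_access(0o640, "alice", "staff", "bob", {"staff"}, "r")
group_w = can_access(0o640, "alice", "staff", "bob", {"staff"}, "w")
other_r = can_access(0o640, "alice", "staff", "carol", {"guests"}, "r")
```

Here the owner can read and write, a member of `staff` can only read, and everyone else is denied.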