Simple introduction to HDFS
Jie Wu
Some Useful Features
– File permissions and authentication.
– Rack awareness: takes a node's physical location into account while scheduling tasks and allocating storage.
– Safemode: an administrative mode for maintenance.
– fsck: a utility to diagnose the health of the file system and to find missing files or blocks.
– Rebalancer: a tool to balance the cluster when the data is unevenly distributed among DataNodes.
– Upgrade and rollback: after a software upgrade, it is possible to roll back to HDFS' state before the upgrade in case of unexpected problems.
– Secondary NameNode: performs periodic checkpoints of the namespace and helps keep the size of the file containing the log of HDFS modifications within certain limits at the NameNode.
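The checkpoint idea behind the Secondary NameNode can be sketched in a few lines of Python. This is a toy model, not the HDFS implementation: the names `apply_edit` and `checkpoint`, the tuple format of the log entries, and the dictionary used as the namespace image are all assumptions made for illustration. It only shows why merging the log into the image keeps the log short.

```python
# Toy model of a namespace checkpoint. All names and formats are hypothetical.

def apply_edit(image, edit):
    """Apply one logged namespace change to the image (dict: path -> metadata)."""
    op, path, *args = edit
    if op == "create":
        image[path] = {"size": args[0]}
    elif op == "delete":
        image.pop(path, None)
    elif op == "rename":
        image[args[0]] = image.pop(path)
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint(image, edit_log):
    """Merge every logged edit into a new image and return it with an empty log."""
    new_image = dict(image)
    for edit in edit_log:
        apply_edit(new_image, edit)
    return new_image, []


image = {"/data/a.txt": {"size": 10}}
edit_log = [
    ("create", "/data/b.txt", 20),
    ("rename", "/data/a.txt", "/data/c.txt"),
    ("delete", "/data/b.txt"),
]

image, edit_log = checkpoint(image, edit_log)
print(sorted(image))   # ['/data/c.txt']
print(len(edit_log))   # 0
```

After the checkpoint the image alone describes the namespace, so the log can start again from empty instead of growing without bound.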
Goals of HDFS
• Hardware Failure: detection of faults and quick, automatic recovery
• Streaming Data Access: designed for batch processing rather than interactive use
• Large Data Sets
• Simple Coherency Model: write-once-read-many access model, but there is a plan to support appending-writes to files in the future
• Moving Computation is Cheaper than Moving Data
• Portability
Architecture
What It Can Support
• Create, read, write (once), remove, copy and rename a file, no modification
• A very simple permission model like that of Linux; no user authentication (such as Kerberos) and no encryption of data transfers
• Directory quotas, no user quotas
• No hard links or soft links
• Recycle Bin
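
The operations listed above can be illustrated with a small in-memory model. This is a sketch only: `ToyNamespace`, its method names and the `/.Trash` prefix are hypothetical and are not the HDFS API. It shows the write-once rule (an existing path cannot be written again) and a recycle bin in which removal is a move that can be undone.

```python
class ToyNamespace:
    """In-memory stand-in for a write-once file namespace (not the HDFS API)."""

    TRASH = "/.Trash"  # assumed trash location for this sketch

    def __init__(self):
        self.files = {}

    def create(self, path, data):
        # Write once: a path that already exists cannot be written again.
        if path in self.files:
            raise FileExistsError(path)
        self.files[path] = bytes(data)

    def read(self, path):
        return self.files[path]

    def rename(self, src, dst):
        if dst in self.files:
            raise FileExistsError(dst)
        self.files[dst] = self.files.pop(src)

    def copy(self, src, dst):
        self.create(dst, self.files[src])

    def remove(self, path):
        # Removal moves the file into the trash instead of deleting it outright.
        self.rename(path, self.TRASH + path)

    def restore(self, path):
        self.rename(self.TRASH + path, path)


ns = ToyNamespace()
ns.create("/logs/day1", b"first write")

try:
    ns.create("/logs/day1", b"second write")
    rewritten = True
except FileExistsError:
    rewritten = False

ns.remove("/logs/day1")
in_trash = "/.Trash/logs/day1" in ns.files
ns.restore("/logs/day1")

print(rewritten, in_trash, ns.read("/logs/day1"))
```

Because there is no method that changes the bytes of an existing path, readers never see a file change underneath them, which is the point of the write-once-read-many model.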