NAMENODE AND DATANODE COUPLING
FOR A POWER-PROPORTIONAL
HADOOP DISTRIBUTED FILE SYSTEM
Hieu Hanh Le, Satoshi Hikida and Haruo Yokota
Tokyo Institute of Technology
Appeared in DASFAA 2013
The 18th International Conference on Database Systems for Advanced Applications (Wuhan, China)
Agenda
Background
Research Motivation
Goal and Approach
Proposals
Experimental Evaluation
Conclusion
Background
The Hadoop Distributed File System (HDFS) is widely used as data storage for applications in the Cloud
- Built from commercial off-the-shelf hardware
- Supports the MapReduce framework
- Good scalability
HDFS utilizes a huge number of DataNodes to store the huge amounts of data requested by data-intensive applications
- This inflates the power consumption of the storage system
Power-aware file systems are moving towards power-proportional designs
[Background]
Power-proportional Storage System
A system should consume energy in proportion to the amount of work performed [Barroso and Hölzle, 2007]
- Set the system's operation to multiple gears, each containing a different number of DataNodes
- Made possible by data placement methods
[Figure: In high gear, all four nodes are active and hold blocks D1-D4; shifting to low gear migrates D1 and D4 onto the nodes that remain active, so the other nodes can be powered off.]
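To make the gear idea concrete, here is a minimal sketch of shifting down by data migration. The node and block names, the round-robin destination choice, and the placement map are all illustrative assumptions, not the paper's implementation.

```java
import java.util.*;

// Minimal sketch of gear shifting by data migration (hypothetical names,
// not the paper's implementation). Shifting down moves every block held
// by a soon-to-be-powered-off node onto a node that remains active.
public class GearShift {
    public static void main(String[] args) {
        List<String> nodes = List.of("Node1", "Node2", "Node3", "Node4");
        Map<String, List<String>> placement = new HashMap<>();
        placement.put("Node1", new ArrayList<>(List.of("D1")));
        placement.put("Node2", new ArrayList<>(List.of("D2")));
        placement.put("Node3", new ArrayList<>(List.of("D3")));
        placement.put("Node4", new ArrayList<>(List.of("D4")));

        int lowGearNodes = 2;                              // low gear keeps 2 of 4 nodes on
        List<String> active = nodes.subList(0, lowGearNodes);
        int rr = 0;                                        // round-robin over active nodes
        for (String off : nodes.subList(lowGearNodes, nodes.size())) {
            for (String block : placement.remove(off)) {   // 'off' can then be powered down
                String dest = active.get(rr++ % active.size());
                placement.get(dest).add(block);
                System.out.printf("migrate %s: %s -> %s%n", block, off, dest);
            }
        }
        System.out.println("low-gear placement: " + placement);
    }
}
```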
Research Motivation
Gear-shifting is vital in a power-proportional system
- The system needs to reflect data that was updated in a lower gear to guarantee the higher gear's performance
- The updated data is re-transferred according to the data placement
The gear-shifting process in current HDFS-based methods [Rabbit, Sierra] is inefficient
- Bottleneck in metadata access
- High communication cost among nodes
Rabbit: Robust and Flexible Power-proportional Storage, ACM SoCC 2010
Sierra: Practical Power-proportionality for Data Center Storage, ACM EuroSys 2011
Gear-shifting in current HDFS-based methods (e.g., Rabbit, Sierra)
[Figure sequence: a cluster of DataNode1-DataNode4 under a single NameNode writes dataset D = {D1, D2, D3, D4} in low gear; on gearing up, the updated blocks D1 and D4 are re-transferred to their intended DataNodes.]
1. Access metadata to identify the updated blocks
- Every lookup goes through the single NameNode: congestion
2. Transfer the updated blocks
- 2.1 Command issuance, again funneled through the NameNode: congestion
- 2.2 Block transfer, performed sequentially (1 block/connection): inefficiency
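A toy simulation of this flow makes the two congestion points visible; the types, names, and cost counting here are assumptions for illustration, and the actual Rabbit and Sierra protocols are more involved.

```java
import java.util.List;

// Toy simulation of the centralized gear-up flow (hypothetical names; the
// actual Rabbit/Sierra protocols are more involved). All metadata scanning
// and command issuance funnel through one NameNode, and each updated block
// is moved over its own connection.
public class CentralizedGearUp {
    record Block(String id, String tempNode, String intendedNode, boolean updated) {}

    public static void main(String[] args) {
        List<Block> namespace = List.of(          // held entirely by the single NameNode
                new Block("D1", "Node2", "Node1", true),
                new Block("D2", "Node2", "Node2", false),
                new Block("D3", "Node3", "Node3", false),
                new Block("D4", "Node3", "Node4", true));

        int commands = 0, connections = 0;
        for (Block b : namespace) {               // 1. scan metadata: load on one node
            if (!b.updated()) continue;
            commands++;                           // 2.1 one command per updated block
            connections++;                        // 2.2 one block per connection
            System.out.printf("move %s from %s to %s%n",
                    b.id(), b.tempNode(), b.intendedNode());
        }
        System.out.println(commands + " commands, " + connections + " connections");
    }
}
```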
Goal and Approach
Goal
- Propose a novel architecture for efficient gear-shifting in a power-proportional HDFS
Approach
- Utilize distributed metadata management (MDM): eliminates the bottleneck of centralized MDM
- Couple each NameNode with a DataNode (NDCouplingHDFS): localizes the range of updated blocks maintained by each metadata manager and reduces the communication cost among nodes
- Enable multiple-block transfers to improve efficiency in HDFS
[Proposals]
Distributed MDM
Distribute the MDM over multiple nodes to decentralize the load during gear-shifting
This requires a distributed MDM that is update-conscious
- The MDM itself is transferred when the system shifts gears
- Search/insert/delete operations must be cheap
Distributed-hash-table-based methods are inefficient here
- The hash function must be re-applied for every transferred file
Range-based methods are efficient
- For a range of files, all the metadata can be transferred with a limited number of structure traversals (see the sketch after this slide)
We apply two range-based methods
- Each node statically maintains a separate subnamespace (Static Directory Partitioning, SDP)
- A parallel index technique with good concurrency control (Fat-Btree) [*]
[*] A Concurrency Control Protocol for Parallel B-tree Structures without Latch-coupling for Explosively Growing Digital Content, EDBT 2008
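The sketch below illustrates the range-versus-hash argument with a TreeMap stand-in; this is not the paper's SDP or Fat-Btree, just a demonstration that a contiguous subnamespace can be split off a sorted structure in one pass, whereas a hash-based scheme would re-hash every transferred file individually.

```java
import java.util.*;

// Toy illustration of why range-based MDM suits gear-shifting (a TreeMap
// stand-in, not the paper's SDP or Fat-Btree). A contiguous subnamespace
// is split off in one traversal; a DHT would re-hash each file one by one.
public class RangeHandoff {
    public static void main(String[] args) {
        NavigableMap<Integer, String> metadata = new TreeMap<>();
        for (int id = 1; id <= 40; id++) metadata.put(id, "meta-" + id);

        // Hand subnamespace [21, 30] to a reactivated node in one traversal:
        NavigableMap<Integer, String> view = metadata.subMap(21, true, 30, true);
        Map<Integer, String> shipped = new TreeMap<>(view); // metadata to transfer
        view.clear();                                       // removed from the local node
        System.out.println("shipped " + shipped.size() + " entries; "
                + metadata.size() + " remain locally");
    }
}
```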
[Proposals]
NDCouplingHDFS with Distributed MDM
Each NDCouplingHDFS node couples a Distributed MDM with Data Management and maintains a subnamespace of the whole namespace of the system
The mapping information [Node, Range] is managed by the Distributed MDM, e.g. ND1: [1, 10], ND2: [11, 20], ND3: [21, 30], ND4: [31, ~]
Request flow:
1. A client sends a request for file 25 to any node
2. That node forwards the request to the responsible node (here ND3)
3. ND3 serves the request and returns the results
4. The results are returned to the client
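A minimal sketch of the routing step above (the API is hypothetical; in the system the mapping is kept by the Distributed MDM itself): keeping the slide's [Node, Range] table in a sorted map keyed by range start resolves the responsible node for any file id in O(log n).

```java
import java.util.*;

// Minimal sketch of [Node, Range] request routing (hypothetical API).
// The slide's mapping, kept in a sorted map keyed by range start, lets
// any node resolve the node responsible for a file id in O(log n).
public class RangeRouter {
    private final TreeMap<Integer, String> rangeStart = new TreeMap<>();

    public RangeRouter() {
        rangeStart.put(1, "ND1");    // ND1: [1, 10]
        rangeStart.put(11, "ND2");   // ND2: [11, 20]
        rangeStart.put(21, "ND3");   // ND3: [21, 30]
        rangeStart.put(31, "ND4");   // ND4: [31, ~]
    }

    /** Step 2 of the flow above: find the node to forward the request to. */
    public String responsibleNode(int fileId) {
        Map.Entry<Integer, String> e = rangeStart.floorEntry(fileId);
        if (e == null) throw new NoSuchElementException("no range covers " + fileId);
        return e.getValue();
    }

    public static void main(String[] args) {
        System.out.println(new RangeRouter().responsibleNode(25)); // -> ND3
    }
}
```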
[Proposals]
Efficient Gear-shifting
[Figure sequence: nodes A and D are reactivated on gear-up; blocks A1 and D1, written to the low-gear nodes B and C on their behalf, are reflected back. Each low-gear node keeps a WOL log of <File, Temp Node, Intended Node> entries.]
1. Transfer the updated metadata to the reactivated nodes
2. Command issuance, performed locally between the Distributed MDM and the Data Management on the same node
3. Transfer the updated blocks in batches (multiple blocks per connection)
4. Update the metadata
The process is distributed over multiple nodes, yielding parallelism, reduced network cost, and efficient block transfer; a batching sketch follows.
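The connection savings from batching can be sketched as follows. The 100-block cap matches the "maximum number of transferred blocks" used in Experiment 1, but the grouping logic, node names, and workload shape are assumptions for illustration.

```java
import java.util.*;

// Illustrative sketch of batch block transfer. Blocks are grouped by
// destination and sent up to `maxPerBatch` per connection (100 matches
// the cap used in Experiment 1); sequential transfer would need one
// connection per block. The grouping logic itself is an assumption.
public class BatchTransfer {
    public static void main(String[] args) {
        int maxPerBatch = 100;
        Map<String, List<String>> byDest = new HashMap<>();
        for (int i = 0; i < 16_000; i++)          // updated blocks found locally
            byDest.computeIfAbsent("Node" + (i % 8 + 1), k -> new ArrayList<>())
                  .add("blk-" + i);

        int batched = 0;
        for (List<String> blocks : byDest.values())
            batched += (blocks.size() + maxPerBatch - 1) / maxPerBatch; // ceil
        System.out.println("batched: " + batched + " connections vs sequential: 16000");
    }
}
```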
Experimental Evaluation
Experiment 1
- Verify the effectiveness of the proposals in the gear-shifting process by comparing against normal HDFS
- Reflecting updated blocks is the major cost
- Focuses on the coupling architecture and batch block transfer
Experiment 2
- Evaluate the effectiveness of the distributed index techniques in NDCouplingHDFS
- SDP vs. Fat-Btree while changing the number of nodes
[Experiment 1]
Validity of NDCouplingHDFS in Gear-shifting: Updated Data Reflection
Compare the execution time of updated-data reflection in NDCouplingHDFS against normal HDFS over five configurations
- Combinations of architecture, distributed MDM (SDP, Fat-Btree), command issuance, and block transfer
Environment
- # gears: 2
- # active nodes at low gear: 8
- # active nodes at high gear: 16
- # files: 16,000
- File size: 1 MB
- HDFS version: 0.20.2
- Maximum number of transferred blocks: 100
- Heartbeat interval: 1 s
[Figure: execution time and number of communication connections (command issuance) for NormalHDFS, SSS, SBS, SBB, and FBB.]
[Experiment 1]
Experimental Results

Configuration       Normal HDFS  SSS         SBS         SBB       FBB
Architecture        HDFS         Coupling    Coupling    Coupling  Coupling
MDM                 Central      SDP         SDP         SDP       Fat-Btree
Command issuance    Sequential   Sequential  Batch       Batch     Batch
Block transference  Sequential   Sequential  Sequential  Batch     Batch

[Figure: execution time [s] for the five configurations; the chart highlights reductions of 46% and 41% relative to normal HDFS.]
The coupling architecture and batch block transfer strongly affected the performance
[Experiment 2]
Scalability of Metadata Operations
Evaluate SDP vs. Fat-Btree while changing the number of files and the number of nodes
Environment
- Machines: 1, 2, 4, 8
- CPU: TM8600 1.0 GHz
- Memory: 4 GB DRAM
- NIC: 1000 Mb/s
- OS: Linux 3.0, 64-bit
- Java: JDK 1.7.0
- Fat-Btree fanout: 16
- Concurrency control: LCFB [Yoshihara, 2007]
- Workload: 3,000 files of 1 MB each
Fat-Btree gained better scalability as the number of nodes increased
- Read throughput scaled well owing to lower search cost and better concurrency control
- The gain in write throughput is limited by the synchronization cost of updating the tree structure
[Experiment 2]
Experimental Results
[Figure: read throughput [operations/s] and write throughput [operations/s] of SDP vs. Fat-Btree on 1, 2, 4, and 8 nodes.]
A transaction: open/create metadata and read/write files
Conclusion
Proposed NDCouplingHDFS for efficient gear-shifting in a power-proportional HDFS
- Reduced the execution time of reflecting updated data by up to 46% compared with normal HDFS, thanks to the coupling architecture and batch block transfer
- Improved I/O performance by applying distributed index techniques to NDCouplingHDFS
NDCouplingHDFS
- Maintains support for MapReduce
- Is expected to achieve true power-proportionality, including the power consumption of metadata management
NameNode and DataNode Coupling for a
Power-proportional Hadoop Distributed File System
Thank you for your attention!