Data-Intensive Distributed Computingcs451/slides/big-data-part02b.pdf · HDFS is not like a typical...

Data-Intensive Distributed Computing

Part 2: MapReduce Algorithm Design (2/3)

431/451/631/651 (Winter 2021)

Ali Abedi

These slides are available at https://www.student.cs.uwaterloo.ca/~cs451/


Although we argued for an abstraction layer that hides the complexities of the underlying infrastructure, today we want to have a quick look at the architecture of datacenters. This will help us later to understand the performance trade-offs of different algorithms. It also makes us appreciate these systems more ☺

[Figure: an abstraction layer for storage/computing (HDFS, MapReduce), labelled "blissful ignorance", on top of a cluster of computers, labelled "unpleasant truth".]

A quick review of data center architecture

The anatomy of a server

Left: top view of a server.

Right: the two top figures show the front of the server with two storage configurations: 1) 16 2.5-inch drives, 2) 8 3.5-inch drives.

Right, bottom: the back of the server. We can see the network interfaces (7).

The anatomy of a server rack

We put multiple servers in a server rack. There is a network switch that connects the servers in a rack. This switch also connects the rack to other racks.

The anatomy of a data center

Clusters of racks of servers make up a data center. This is a very simplistic view of a data center.

Storage Hierarchy

Capacity, latency, and bandwidth for reading data change depending on where the data is.

The lowest latency and highest bandwidth are achieved when the data we need is on our local server.

We can increase capacity by using other servers, but at the cost of higher latency and lower bandwidth.

Capacity, latency, and bandwidth change at each level of the hierarchy:

• Local machine: L1/L2/L3 cache, memory, SSD, magnetic disks
• Remote machine, same rack
• Remote machine, different rack
• Remote machine, different datacenter

Latency numbers every programmer should know

Demo: https://colin-scott.github.io/personal_website/research/interactive_latency.html

The anatomy of a data center

Google’s data center video: https://youtu.be/XZmGGAbHqa0

Abstraction

Cluster of computers

Storage/computing

Distributed File System

How can we store a large file on a distributed system?

How do you store this file?

Assume that we have 20 identical networked servers, each with 100 TB of disk space. How would you store a file on these servers? This is the fundamental question in distributed file systems.

[Figure: a 200 TB file, File.txt, and servers S1, S2, S3, ..., S19, S20 with 100 TB each.]

Divide into smaller chunks

We can split the file into smaller chunks.

[Figure: File.txt and servers S1, S2, S3, ..., S19, S20 with 100 TB each.]

Assign chunks to servers

And assign the chunks (e.g., randomly) to the servers.

[Figure: File.txt divided into chunks 1 to 8, and servers S1, S2, S3, ..., S19, S20 with 100 TB each.]
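The two steps so far (split the file into chunks, then assign chunks to servers at random) can be sketched in a few lines of Python. This is only an illustration of our made-up file system: the function names, the toy 1000-byte "file", and the 128-byte chunk size are assumptions chosen so that the example yields 8 chunks, as in the figure.

```python
import random


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut data into consecutive chunks; the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def assign_randomly(num_chunks: int, servers: list[str], rng: random.Random) -> dict[int, str]:
    """Pick one server per chunk uniformly at random (chunk ids start at 1)."""
    return {chunk_id: rng.choice(servers) for chunk_id in range(1, num_chunks + 1)}


servers = [f"S{i}" for i in range(1, 21)]      # S1 .. S20
chunks = split_into_chunks(b"x" * 1000, 128)   # toy file, hypothetical sizes
placement = assign_randomly(len(chunks), servers, random.Random(0))

print(len(chunks))   # 8
```

Concatenating the chunks in order gives back the original bytes, which is why the chunk order has to be remembered somewhere.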

Keep track of the chunks using a master server

We need to track where each chunk is stored so that we can retrieve the file.

[Figure: the master server keeps a table from chunks to servers: 1 → S1, 2 → S3, …, 8 → S19. Below it are File.txt and servers S1, S2, S3, ..., S19, S20 with 100 TB each.]
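A minimal sketch of such a master server, assuming it is nothing more than an in-memory table from chunk ids to server names. The class and method names are hypothetical; the entries mirror the ones on the slide.

```python
class Master:
    """Toy master server: remembers which server holds each chunk of each file."""

    def __init__(self):
        self.chunk_locations = {}  # file name -> {chunk id -> server name}

    def record(self, file_name, chunk_id, server):
        self.chunk_locations.setdefault(file_name, {})[chunk_id] = server

    def locate(self, file_name):
        """Return (chunk id, server) pairs in chunk order, so the file can be reassembled."""
        return sorted(self.chunk_locations[file_name].items())


master = Master()
master.record("File.txt", 2, "S3")
master.record("File.txt", 1, "S1")
master.record("File.txt", 8, "S19")

print(master.locate("File.txt"))   # [(1, 'S1'), (2, 'S3'), (8, 'S19')]
```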

What happens when a server fails?!

If a server that contains one of the chunks fails, the file becomes corrupted. Since the failure rate is high on commodity servers, we need to figure out a solution.

[Figure: the chunk table 1 → S1, 2 → S3, …, 8 → S19, with File.txt and servers S1, S2, S3, ..., S19, S20 with 100 TB each.]

FAULT TOLERANCE

Store each chunk on multiple servers: REPLICATION

If each chunk is stored on multiple servers, there is a backup when a server fails. The number of copies determines how much resilience we get.

[Figure: chunks 1 to 8 of File.txt, each stored on several of the servers S1, S2, S3, ..., S19, S20 with 100 TB each.]
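To see why the number of copies matters, here is a small sketch, assuming each chunk is placed on distinct servers chosen at random. The function names are hypothetical. With one copy per chunk, a single server failure loses data; with three copies, it cannot.

```python
import random


def place_replicas(num_chunks, servers, copies, rng):
    """Store each chunk on `copies` distinct servers chosen at random."""
    return {chunk: rng.sample(servers, copies) for chunk in range(1, num_chunks + 1)}


def lost_chunks(placement, failed_servers):
    """Chunks for which every copy sits on a failed server."""
    return [chunk for chunk, holders in placement.items()
            if all(server in failed_servers for server in holders)]


servers = [f"S{i}" for i in range(1, 21)]
rng = random.Random(0)
single = place_replicas(8, servers, 1, rng)   # no replication
triple = place_replicas(8, servers, 3, rng)   # three copies of each chunk

failed = {single[1][0]}                       # the one server holding chunk 1 fails
print(lost_chunks(single, failed))            # contains chunk 1
print(lost_chunks(triple, failed))            # [] (at least two copies of every chunk survive)
```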

From our made-up distributed file system to a real one

Hadoop Distributed File System (HDFS)

Adapted from Erik Jonsson (UT Dallas)

Goals of HDFS

• Very Large Distributed File System
  • 10K nodes, 100 million files, 10PB
• Assumes Commodity Hardware
  • Files are replicated to handle hardware failure
  • Detect failures and recover from them
• Optimized for Batch Processing
  • Provides very high aggregate bandwidth

Distributed File System

HDFS is not like a typical file system you use on Windows or Linux. It was specifically designed for Hadoop. It cannot perform some of the typical operations that other file systems support, such as random writes. Instead, it is optimized for large sequential reads and append-only writes.

• Data Coherency
  • Write-once-read-many access model
  • Client can only append to existing files
• Files are broken up into blocks
  • Typically 64MB block size
  • Each block replicated on multiple DataNodes
• Intelligent Client
  • Client can find location of blocks
  • Client accesses data directly from DataNode
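A quick worked example of what a 64MB block size means in practice. The helper names and the replication factor of 3 are assumptions for illustration; the slide only says that each block is replicated on multiple DataNodes.

```python
BLOCK_SIZE = 64 * 1024 ** 2   # 64 MB, the typical block size quoted above


def num_blocks(file_size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks needed; the last block may be only partly filled."""
    return -(-file_size_bytes // block_size)   # ceiling division


def block_replicas(file_size_bytes: int, replication: int) -> int:
    """Total number of block copies stored across the cluster."""
    return num_blocks(file_size_bytes) * replication


one_gb = 1024 ** 3
print(num_blocks(one_gb))          # 16
print(num_blocks(one_gb + 1))      # 17 (one extra byte needs one more block)
print(block_replicas(one_gb, 3))   # 48, assuming a replication factor of 3
```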

HDFS Architecture

Adapted from (Ghemawat et al., SOSP 2003)

[Figure: an application uses the HDFS client. The client sends (file name, block id) to the HDFS namenode, which holds the file namespace (e.g., /foo/bar, block 3df2), and receives (block id, block location). The client then sends (block id, byte range) to an HDFS datanode and receives block data. The namenode sends instructions to the datanodes and receives datanode state. Each HDFS datanode stores its blocks in the local Linux file system.]

Note that the namenode is relatively lightweight: it only stores where the data is located on the datanodes, not the actual data.

There may still be a redundant namenode in the background in case the primary one fails.

The HDFS client gets the block information from the namenode and then interacts with the datanodes to get the data.

Note that the namenode has to communicate with the datanodes to ensure consistency and redundancy of the data (e.g., if a new copy of the data needs to be created).
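The read path described in the notes can be sketched as a toy in-process simulation. This is not the Hadoop API: the classes, method names, and the second block id "a001" are hypothetical; only /foo/bar and block 3df2 come from the figure. The point is that the namenode answers with metadata only, and the bytes come straight from a datanode.

```python
class NameNode:
    """Holds only metadata: which blocks make up a file and where each block lives."""

    def __init__(self):
        self.blocks_of = {}      # file name -> ordered list of block ids
        self.locations_of = {}   # block id -> list of datanode names

    def get_block_locations(self, file_name):
        return [(b, self.locations_of[b]) for b in self.blocks_of[file_name]]


class DataNode:
    """Holds the actual block data."""

    def __init__(self):
        self.blocks = {}         # block id -> bytes

    def read(self, block_id, start=0, end=None):
        return self.blocks[block_id][start:end]


def read_file(file_name, namenode, datanodes):
    """Client side: ask the namenode where the blocks are, then fetch from datanodes."""
    data = b""
    for block_id, locations in namenode.get_block_locations(file_name):
        data += datanodes[locations[0]].read(block_id)   # first listed replica
    return data


namenode = NameNode()
namenode.blocks_of["/foo/bar"] = ["3df2", "a001"]
namenode.locations_of = {"3df2": ["dn1", "dn2"], "a001": ["dn2"]}

dn1, dn2 = DataNode(), DataNode()
dn1.blocks["3df2"] = b"hello "
dn2.blocks["3df2"] = b"hello "
dn2.blocks["a001"] = b"world"
datanodes = {"dn1": dn1, "dn2": dn2}

print(read_file("/foo/bar", namenode, datanodes))   # b'hello world'
```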

Functions of a NameNode

• Manages File System Namespace
  • Maps a file name to a set of blocks
  • Maps a block to the DataNodes where it resides
• Cluster Configuration Management
• Replication Engine for Blocks

NameNode Metadata

• Metadata in Memory
  • The entire metadata is in main memory
  • No demand paging of metadata
• Types of metadata
  • List of files
  • List of Blocks for each file
  • List of DataNodes for each block
  • File attributes, e.g. creation time, replication factor
• A Transaction Log
  • Records file creations, file deletions etc
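A rough sketch of how in-memory metadata and a transaction log can work together. It assumes that the log is replayed to rebuild the in-memory state after a restart; the slide only says what the log records, so the replay step and all names here are assumptions, not a description of HDFS internals.

```python
class Namespace:
    """Toy namenode state: metadata lives in memory, every change is also logged."""

    def __init__(self):
        self.files = {}   # file name -> attributes (in main memory)
        self.log = []     # transaction log: (operation, file name, attributes)

    def create(self, name, replication):
        attrs = {"replication": replication}
        self.log.append(("create", name, attrs))
        self.files[name] = dict(attrs)

    def delete(self, name):
        self.log.append(("delete", name, None))
        del self.files[name]


def replay(log):
    """Rebuild the in-memory file table by applying the logged operations in order."""
    files = {}
    for operation, name, attrs in log:
        if operation == "create":
            files[name] = dict(attrs)
        elif operation == "delete":
            files.pop(name, None)
    return files


ns = Namespace()
ns.create("/a", replication=3)
ns.create("/b", replication=2)
ns.delete("/a")

print(replay(ns.log))   # {'/b': {'replication': 2}}
```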

DataNode

• A Block Server
  • Stores data in the local file system (e.g. ext3)
  • Stores metadata of a block (e.g. CRC)
  • Serves data and metadata to Clients
• Block Report
  • Periodically sends a report of all existing blocks to the NameNode
• Facilitates Pipelining of Data
  • Forwards data to other specified DataNodes
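The block-server role can be illustrated with a toy DataNode that keeps a CRC next to each block and produces a block report. This is a sketch under simplifying assumptions: blocks live in a dictionary rather than in the local file system, and the class and method names are hypothetical. `zlib.crc32` is the standard-library CRC-32 function.

```python
import zlib


class DataNode:
    """Toy block server: stores each block together with its CRC-32 checksum."""

    def __init__(self):
        self.blocks = {}   # block id -> (data, checksum)

    def write(self, block_id, data):
        self.blocks[block_id] = (data, zlib.crc32(data))

    def read(self, block_id):
        data, checksum = self.blocks[block_id]
        if zlib.crc32(data) != checksum:
            raise IOError(f"checksum mismatch for block {block_id}")
        return data

    def block_report(self):
        """The list of block ids this node would report to the NameNode."""
        return sorted(self.blocks)


dn = DataNode()
dn.write("3df2", b"some block data")
print(dn.read("3df2"))      # b'some block data'
print(dn.block_report())    # ['3df2']
```

If the stored bytes are changed without updating the checksum, `read` raises an error instead of silently returning corrupted data.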

Block Placement

• Current Strategy
  • One replica on local node
  • Second replica on a remote rack
  • Third replica on same remote rack
  • Additional replicas are randomly placed
• Clients read from nearest replicas
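The placement strategy listed above can be written down directly. The sketch below assumes a hypothetical `rack_of` table from node names to rack names, at least one remote rack, and at least two nodes on the chosen remote rack; it is an illustration of the listed rules, not Hadoop's actual placement code.

```python
import random


def place_block(writer, rack_of, replication, rng):
    """Choose nodes for the replicas of one block, following the listed strategy."""
    chosen = [writer]                                    # 1st replica: local node
    if replication >= 2:                                 # 2nd replica: a remote rack
        remote = [n for n in rack_of if rack_of[n] != rack_of[writer]]
        second = rng.choice(remote)
        chosen.append(second)
    if replication >= 3:                                 # 3rd replica: same remote rack
        same_rack = [n for n in rack_of
                     if rack_of[n] == rack_of[second] and n not in chosen]
        chosen.append(rng.choice(same_rack))
    while len(chosen) < replication:                     # additional replicas: random
        rest = [n for n in rack_of if n not in chosen]
        chosen.append(rng.choice(rest))
    return chosen


rack_of = {"a1": "rackA", "a2": "rackA", "a3": "rackA",
           "b1": "rackB", "b2": "rackB", "b3": "rackB"}
replicas = place_block("a1", rack_of, 3, random.Random(0))
print(replicas)   # e.g. a1 first, then two different nodes on rackB
```

Putting two of the three replicas on one remote rack means the block survives the loss of either rack, while only one copy has to cross between racks twice.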

Heartbeats

• DataNodes send heartbeats to the NameNode
  • Once every 3 seconds
• NameNode uses heartbeats to detect DataNode failure
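Failure detection from heartbeats can be sketched as a timeout check. The 3-second interval is from the slide; the 30-second timeout and the class below are assumptions made for illustration, since the slide does not say how many missed heartbeats count as a failure.

```python
HEARTBEAT_INTERVAL_S = 3   # from the slide


class HeartbeatMonitor:
    """Toy NameNode-side bookkeeping: a node is considered dead after a silent period."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # assumed value, not taken from the slides
        self.last_seen = {}          # node name -> time of last heartbeat (seconds)

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def dead_nodes(self, now):
        return sorted(node for node, t in self.last_seen.items()
                      if now - t > self.timeout_s)


monitor = HeartbeatMonitor(timeout_s=30)
for t in range(0, 40, HEARTBEAT_INTERVAL_S):
    monitor.heartbeat("dn1", t)      # dn1 keeps reporting; last heartbeat at t = 39
monitor.heartbeat("dn2", 3)          # dn2 goes silent after t = 3

print(monitor.dead_nodes(now=40))    # ['dn2']
```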

Replication Engine

• NameNode detects DataNode failures
  • Chooses new DataNodes for new replicas
  • Balances disk usage
  • Balances communication traffic to DataNodes
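One simple way to choose new DataNodes while balancing disk usage is to give each missing replica to the least-used live node that does not already hold the block. The sketch below assumes exactly that greedy rule and measures usage as a block count; the real replication engine is more involved, and the function name and inputs are hypothetical.

```python
def plan_re_replication(block_holders, live_usage, target):
    """Return {block: [new nodes]} so that every block has `target` live replicas.

    block_holders: block id -> set of nodes that held a replica (may include dead nodes)
    live_usage:    live node -> number of blocks it currently stores
    """
    usage = dict(live_usage)
    plan = {}
    for block, holders in sorted(block_holders.items()):
        alive = [n for n in holders if n in usage]
        missing = max(target - len(alive), 0)
        # Least-used live nodes first; ties broken by name for determinism.
        candidates = sorted((n for n in usage if n not in alive),
                            key=lambda n: (usage[n], n))
        new_nodes = candidates[:missing]
        for n in new_nodes:
            usage[n] += 1
        if new_nodes:
            plan[block] = new_nodes
    return plan


# Node D has failed, so it is absent from the live usage table.
block_holders = {"b1": {"A", "D"}, "b2": {"B", "D"}, "b3": {"A", "C"}}
live_usage = {"A": 5, "B": 1, "C": 2}

print(plan_re_replication(block_holders, live_usage, target=2))
# {'b1': ['B'], 'b2': ['C']}
```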

HDFS Demo

Google File System (GFS)

Terminology differences:
GFS master = Hadoop namenode
GFS chunkservers = Hadoop datanodes

Implementation differences:
Different consistency model for file appends
Implementation language
Performance

Abstraction

Cluster of computers

Storage/computing

HDFS MapReduce

Hadoop Cluster Architecture

How do we get data to the workers?

Let’s consider a typical supercomputer…

[Figure: compute nodes connected to a SAN (Storage Area Network).]

Compute-Intensive vs. Data-Intensive

Why does this make sense for compute-intensive tasks?
What’s the issue for data-intensive tasks?

[Figure: compute nodes connected to a SAN.]

Keeping the data on a separate SAN makes sense for compute-intensive tasks: the computation for a chunk of data is likely to take a long while even on such sophisticated hardware, so the communication costs are greatly outweighed by the computation costs. For data-intensive tasks, the computation for a chunk of data is not likely to take nearly as long, so the computation costs are greatly outweighed by the communication costs. We are likely to experience latency and a bottleneck even with high-speed transfer.

What’s the solution?

Don’t move data to workers… move workers to the data!

Key idea: co-locate storage and compute
Start up workers on nodes that hold the data

If a server is responsible for both data storage and processing, Hadoop can do a lot of optimization. For example, when assigning MapReduce tasks to servers, Hadoop considers which servers hold which parts of the file locally, to minimize copying over the network. If all of the data can be processed locally where it is stored, there is no need to move the data.
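The idea of moving workers to the data can be sketched as a greedy, locality-aware task assignment: for each block, prefer a node that already holds it and has a free task slot, and fall back to any free node only if none does. This is an assumed simplification for illustration, not Hadoop's scheduler; the function name, node names, and slot counts are hypothetical.

```python
def assign_tasks(block_holders, free_slots):
    """Return {block: (node, is_local)}, preferring nodes that store the block."""
    slots = dict(free_slots)
    assignment = {}
    for block, holders in block_holders.items():
        local = [n for n in holders if slots.get(n, 0) > 0]
        # Fall back to any node with a free slot if no holder is available.
        candidates = local or [n for n in slots if slots[n] > 0]
        if not candidates:
            raise RuntimeError("no free task slots left")
        node = max(candidates, key=lambda n: slots[n])   # most free slots first
        slots[node] -= 1
        assignment[block] = (node, bool(local))
    return assignment


block_holders = {"b1": ["n1", "n2"], "b2": ["n1"], "b3": ["n1"]}
free_slots = {"n1": 2, "n2": 1, "n3": 1}
assignment = assign_tasks(block_holders, free_slots)

for block, (node, is_local) in assignment.items():
    print(block, node, "local" if is_local else "remote (data must be copied)")
```

In this run the first two tasks read their block locally on n1; the third has to run on another node and copy its block over the network, because n1 has no free slots left.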

Putting everything together…

This figure shows how computation and storage are co-located on a Hadoop cluster.

The node manager manages the tasks running on a node (e.g., if we have spare resources, do the next job assigned to us).

The resource manager is responsible for managing the available resources in the cluster.

[Figure: three worker nodes, each running a Node Manager and a DataNode on top of the Linux file system, together with the NameNode and the Resource Manager.]