ecs251 Spring 2011: Operating System
#6: Map and Reduce

Dr. S. Felix Wu
Computer Science Department
University of California, Davis
http://www.facebook.com/group.php?gid=29670204725
http://cyrus.cs.ucdavis.edu/~wu/ecs251

04/27/2011


Programming Model

• Input: key/value pairs; output: key/value pairs
• The computation is expressed with 2 functions from the MapReduce library's point of view: Map and Reduce
• MAP: input key/value pair → intermediate key/value pairs
• The MapReduce library groups all intermediate values with the same intermediate key I
• REDUCE: intermediate key I and the values for I → a smaller set of values
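A minimal single-process sketch of this model, assuming nothing about Google's or Hadoop's actual implementation. The names `map_reduce`, `max_map` and `max_reduce` and the sample readings are hypothetical and chosen only for illustration. It shows the three stages: map, group by intermediate key, reduce.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Apply map_fn to every (key, value) record, group the intermediate
    values by intermediate key, then apply reduce_fn to each group."""
    groups = defaultdict(list)
    for key, value in records:
        for ikey, ivalue in map_fn(key, value):
            groups[ikey].append(ivalue)
    return {ikey: reduce_fn(ikey, values) for ikey, values in sorted(groups.items())}


# Hypothetical job: highest temperature reading per city.
def max_map(station, reading):
    city, temperature = reading
    yield city, temperature


def max_reduce(city, temperatures):
    return max(temperatures)


readings = [("s1", ("Davis", 31)), ("s2", ("Davis", 35)), ("s3", ("Tahoe", 18))]
result = map_reduce(readings, max_map, max_reduce)
print(result)
```

The grouping step in the middle is the part the library provides; only the two small functions change from job to job.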


MapReduce: Example

Counting the number of occurrences of each word in a large collection of documents.

• MAP: doc name & doc contents → word & its occurrences
• REDUCE: word & list of counts → sum of all counts for the word

Input and output types:

map(k1, v1) → list(k2, v2)
reduce(k2, list(v2)) → list(v2)


MapReduce: Execution


HDFS Architecture

[Figure: Client, NameNode, SecondaryNameNode and DataNodes, with two “Cluster Membership” links]

Read path shown in the figure:
1. Client → NameNode: filename
2. NameNode → Client: block IDs, DataNodes
3. Client → DataNodes: read data

• NameNode: maps a file to a file-id and a list of DataNodes
• DataNode: maps a block-id to a physical location on disk
• SecondaryNameNode: periodic merge of the transaction log
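The three numbered steps of the read path can be sketched with in-memory stand-ins. This is a toy illustration, not HDFS code: the classes `NameNode` and `DataNode`, the function `read_file`, the block ids and the file name are all hypothetical, and a dict stands in for both the metadata and the disk.

```python
class NameNode:
    """Maps a filename to an ordered list of (block_id, datanode names)."""

    def __init__(self):
        self.block_map = {}

    def add_file(self, filename, blocks):
        self.block_map[filename] = blocks

    def lookup(self, filename):
        return self.block_map[filename]


class DataNode:
    """Maps a block id to the bytes stored for it."""

    def __init__(self):
        self.blocks = {}

    def write(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]


def read_file(filename, namenode, datanodes):
    data = b""
    # Steps 1 and 2: send the filename, get back block ids and their holders.
    for block_id, holders in namenode.lookup(filename):
        # Step 3: read each block from the first DataNode that holds it.
        data += datanodes[holders[0]].read(block_id)
    return data


datanodes = {"dn1": DataNode(), "dn2": DataNode()}
datanodes["dn1"].write("blk_1", b"hello ")
datanodes["dn2"].write("blk_1", b"hello ")
datanodes["dn2"].write("blk_2", b"world")

namenode = NameNode()
namenode.add_file("/demo.txt", [("blk_1", ["dn1", "dn2"]), ("blk_2", ["dn2"])])

content = read_file("/demo.txt", namenode, datanodes)
print(content)
```

Note that the file bytes never pass through the NameNode in this sketch; it only answers the metadata lookup.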


Map and Reduce

• The idea of Map and Reduce is 40+ years old
– Present in all functional programming languages
– See, e.g., APL, Lisp and ML
• Alternate name for Map: Apply-All
• Higher-order functions
– take function definitions as arguments, or
– return a function as output
• Map and Reduce are higher-order functions
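A small Python sketch of both kinds of higher-order function. The names `add_n` and `apply_all` are hypothetical; `functools.reduce` is the standard-library reduce.

```python
from functools import reduce


def add_n(n):
    """Returns a function as output: the result adds n to its argument."""
    return lambda x: x + n


def apply_all(f, xs):
    """Takes a function as an argument and applies it to every element."""
    return [f(x) for x in xs]


incremented = apply_all(add_n(1), [1, 2, 3, 4, 5])
total = reduce(lambda acc, x: acc + x, incremented, 0)
print(incremented, total)
```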


GFS: Google File System

• “Failures” are the norm
• Multi-GB files are common
• Append rather than overwrite
– Random writes are rare
• Can we relax the consistency?


• an input reader
• a Map function
• a partition function
• a compare function
• a Reduce function
• an output writer
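One way to see how these six pieces fit together is a single-process word count where each piece is its own function. This is a sketch under simplifying assumptions: every function name is hypothetical, the "compare function" is modelled as a sort key, and `zlib.crc32` is used as a stable hash because Python's built-in `hash()` of strings varies between runs.

```python
import zlib
from itertools import groupby


def input_reader(text):
    """Input reader: turn raw text into (line number, line) records."""
    for line_no, line in enumerate(text.splitlines()):
        yield line_no, line


def map_fn(line_no, line):
    """Map function: emit (word, 1) for every word on the line."""
    for word in line.split():
        yield word, 1


def partition_fn(key, num_reducers):
    """Partition function: pick the reducer that receives this key."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def compare_key(pair):
    """Compare function, expressed as a sort key on the intermediate key."""
    return pair[0]


def reduce_fn(key, values):
    """Reduce function: sum the counts for one word."""
    return key, sum(values)


def output_writer(results):
    """Output writer: one 'word<TAB>count' line per result."""
    return "\n".join(f"{key}\t{count}" for key, count in results)


def run_job(text, num_reducers=2):
    partitions = [[] for _ in range(num_reducers)]
    for line_no, line in input_reader(text):
        for key, value in map_fn(line_no, line):
            partitions[partition_fn(key, num_reducers)].append((key, value))
    results = []
    for partition in partitions:
        partition.sort(key=compare_key)
        for key, group in groupby(partition, key=compare_key):
            results.append(reduce_fn(key, [value for _, value in group]))
    return output_writer(sorted(results))


report = run_job("a b a\nb c")
print(report)
```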


Map: A Higher Order Function

• F(x: int) returns r: int
• Let V be an array of integers.
• W = map(F, V)
– W[i] = F(V[i]) for all i
– i.e., apply F to every element of V
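The definition W[i] = F(V[i]) can be written out directly. In this sketch `my_map` is a hypothetical name, and squaring is an arbitrary choice for F.

```python
def F(x: int) -> int:
    return x * x


def my_map(f, v):
    """Build W element by element so that w[i] == f(v[i]) for every index i."""
    w = [None] * len(v)
    for i in range(len(v)):
        w[i] = f(v[i])
    return w


V = [1, 2, 3, 4]
W = my_map(F, V)
print(W)
```

Each W[i] depends only on V[i], so the loop iterations are independent of one another and could run in any order or in parallel.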


Map Examples in Haskell

import Data.Char (toLower)  -- needed for toLower

map (+1) [1,2,3,4,5] == [2,3,4,5,6]

map toLower "abcDEFG12!@#" == "abcdefg12!@#"

map (`mod` 3) [1..10] == [1,2,0,1,2,0,1,2,0,1]


Word Count Example

• Read text files and count how often words occur
– The input is text files
– The output is a text file
» each line: word, tab, count
• Map: produce pairs of (word, count)
• Reduce: for each word, sum up the counts


Input: I am a tiger, you are also a tiger

map (three map tasks, one per piece of the input):
   I,1  am,1  a,1
   tiger,1  you,1  are,1
   also,1  a,1  tiger,1

sorted intermediate pairs:
   a,1  a,1  also,1  am,1  are,1  I,1  tiger,1  tiger,1  you,1

reduce (two reduce tasks):
   a,2  also,1  am,1  are,1
   I,1  tiger,2  you,1

Output: a,2  also,1  am,1  are,1  I,1  tiger,2  you,1
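The same word count can be reproduced in a few lines of Python. This is a single-process sketch, not Hadoop code: `wc_map`, `wc_reduce` and `word_count` are hypothetical names, and splitting words on runs of ASCII letters is an assumption made to drop the comma after "tiger".

```python
import re
from collections import defaultdict


def wc_map(doc_name, contents):
    """Map: emit (word, 1) for every word in the document."""
    for word in re.findall(r"[A-Za-z]+", contents):
        yield word, 1


def wc_reduce(word, counts):
    """Reduce: sum all counts collected for one word."""
    return sum(counts)


def word_count(documents):
    grouped = defaultdict(list)
    for name, contents in documents.items():
        for word, one in wc_map(name, contents):
            grouped[word].append(one)
    return {word: wc_reduce(word, counts) for word, counts in grouped.items()}


counts = word_count({"doc": "I am a tiger, you are also a tiger"})
for word in sorted(counts, key=str.lower):
    print(f"{word}\t{counts[word]}")
```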


Grep Example

• Search input files for a given pattern
• Map: emits a line if the pattern is matched
• Reduce: copies results to output
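A sketch of grep in this style, run in one process. The function names, file names and file contents are hypothetical; keying the emitted lines by file name is a choice made for this example and is not stated on the slide.

```python
import re


def grep_map(filename, contents, pattern):
    """Map: emit (filename, line) for every line that matches the pattern."""
    for line in contents.splitlines():
        if re.search(pattern, line):
            yield filename, line


def grep_reduce(filename, lines):
    """Reduce: identity, it only copies the matched lines to the output."""
    return list(lines)


files = {
    "a.txt": "map task\nreduce task\nidle",
    "b.txt": "nothing here\nmap output",
}

matches = {}
for name, contents in files.items():
    for key, line in grep_map(name, contents, r"\bmap\b"):
        matches.setdefault(key, []).append(line)

result = {name: grep_reduce(name, lines) for name, lines in matches.items()}
print(result)
```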


Inverted Index Example

Generate an inverted index of words from a given set of files

Map: parses a document and emits <word, docId> pairs

Reduce: takes all pairs for a given word, sorts the docId values, and emits a <word, list(docId)> pair
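A single-process sketch of the inverted index job. The names `index_map`, `index_reduce` and `build_index` and the two sample documents are hypothetical. Removing duplicate docIds in the reducer is an added assumption so that a word repeated in one document is listed once.

```python
from collections import defaultdict


def index_map(doc_id, text):
    """Map: parse one document and emit a (word, doc_id) pair per word."""
    for word in text.lower().split():
        yield word, doc_id


def index_reduce(word, doc_ids):
    """Reduce: sort the doc ids for one word and emit (word, list of ids)."""
    return word, sorted(set(doc_ids))


def build_index(docs):
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for word, emitted_id in index_map(doc_id, text):
            grouped[word].append(emitted_id)
    return dict(index_reduce(word, ids) for word, ids in grouped.items())


docs = {2: "football players state", 1: "penn state football football"}
index = build_index(docs)
print(index)
```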


Execution on Clusters

1. Input files split (M splits)

2. Assign Master & Workers

3. Map tasks

4. Writing intermediate data to disk (R regions)

5. Intermediate data read & sort

6. Reduce tasks

7. Return


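<Key, Value> Pair

[Figure: Map takes row data as input and outputs <key, value> pairs (key1 val, key2 val, key1 val, …). After a key is selected, Reduce takes that key with all of its values (key1: val, val, …, val) as input and produces the output.]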


[Figure: input HDFS holds split 0 to split 4; three map tasks read the splits; a sort/copy/merge stage feeds two reduce tasks, which write part0 and part1 to output HDFS.]


import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.*;

public class MR {
    // Map function
    public static class Mapper … { }

    // Reduce function
    public static class Reducer … { }

    // Config and other parts of the program
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MR.class);
        conf.setMapperClass(Mapper.class);
        conf.setReducerClass(Reducer.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}


class MyMap extends MapReduceBase
        implements Mapper<INPUT_KEY, INPUT_VALUE, OUTPUT_KEY, OUTPUT_VALUE> {
    // global variables

    public void map(INPUT_KEY key, INPUT_VALUE value,
                    OutputCollector<OUTPUT_KEY, OUTPUT_VALUE> output,
                    Reporter reporter) throws IOException {
        // local variables and program
        output.collect(NewKey, NewValue);
    }
}

INPUT_KEY, INPUT_VALUE, OUTPUT_KEY and OUTPUT_VALUE mark the type slots that the slide leaves blank; replace them with the concrete key and value types of the job.


class MyRed extends MapReduceBase
        implements Reducer<INPUT_KEY, INPUT_VALUE, OUTPUT_KEY, OUTPUT_VALUE> {
    // global variables

    public void reduce(INPUT_KEY key, Iterator<INPUT_VALUE> values,
                       OutputCollector<OUTPUT_KEY, OUTPUT_VALUE> output,
                       Reporter reporter) throws IOException {
        // local variables and program
        output.collect(NewKey, NewValue);
    }
}

INPUT_KEY, INPUT_VALUE, OUTPUT_KEY and OUTPUT_VALUE mark the type slots that the slide leaves blank; the reducer's input types are the mapper's output types.


• Complete web search engine
– Nutch = Crawler + Indexer/Searcher (Lucene) + GUI
» + Plugins
» + MapReduce & Distributed FS (Hadoop)
• Java based, open source, many customizable scripts available at http://lucene.apache.org/nutch/
• Features:
– Customizable
– Extensible (e.g. extend to Solr for enhanced portability)


Data Structures used by Nutch

• Web Database or WebDB
– Mirrors the properties/structure of the web graph being crawled
• Segment
– Intermediate index
– Contains pages fetched in a single run
• Index
– Final inverted index obtained by “merging” segments (Lucene)


WebDB

• Customized graph database
• Used by Crawler only
• Persistent storage for “pages” & “links”
– Page DB: indexed by URL and hash; contains content, outlinks, fetch information & score
– Link DB: contains “source to target” links, anchor text


Crawling

• Cyclic process
– the crawler generates a set of fetchlists from the WebDB
– fetchers download the content from the Web
– the crawler updates the WebDB with new links that were found
– and then the crawler generates a new set of fetchlists
– and repeat till you reach the “depth”
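The cycle can be sketched as a loop over a toy, in-memory web. This is not Nutch code: `WEB`, `generate_fetchlist`, `fetch` and `crawl` are hypothetical, the WebDB is reduced to a dict from URL to a "fetched" flag, and each round produces a single fetchlist.

```python
# Hypothetical toy web: URL -> outlinks found on that page.
WEB = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": ["e"], "e": []}


def generate_fetchlist(webdb):
    """Every URL that is known to the WebDB but not fetched yet."""
    return sorted(url for url, fetched in webdb.items() if not fetched)


def fetch(url):
    """Stand-in for a fetcher: return the outlinks of the downloaded page."""
    return WEB.get(url, [])


def crawl(seeds, depth):
    webdb = {url: False for url in seeds}
    for _ in range(depth):
        fetchlist = generate_fetchlist(webdb)
        if not fetchlist:
            break
        for url in fetchlist:
            outlinks = fetch(url)
            webdb[url] = True
            for link in outlinks:
                # New links enter the WebDB as "not fetched yet".
                webdb.setdefault(link, False)
    return webdb


print(crawl(["a"], depth=2))
```

With depth 2 the page "d" is discovered but not yet fetched, which is the state the next cycle would start from.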


Indexing

• Iterate through all k page sets in parallel, constructing inverted index
• Creates a “searchable document” of:
– URL text
– Content text
– Incoming anchor text
• Other content types might have different document fields
– E.g., email has sender/receiver
– Any searchable field the end-user will want
• Uses Lucene text indexer


Lucene

• Open source search project
– http://lucene.apache.org
• Index & search local files
– Download lucene-2.2.0.tar.gz from http://www.apache.org/dyn/closer.cgi/lucene/java/
– Extract files
– Build an index for a directory
» java org.apache.lucene.demo.IndexFiles dir_path
– Try search at command line:
» java org.apache.lucene.demo.SearchFiles


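Lucene’s Open Architecture

[Figure: a pipeline with four stages labelled Crawling, Parsing, Indexing and Searching.
– Crawling: File System, WWW and IMAP Server are read by crawlers (FS Crawler, Larm)
– Parsing: PDF, HTML, DOC, TXT, … pass through the TXT, PDF and HTML parsers and become Lucene Documents
– Indexing: StopAnalyzer, CN/DE/ Analyzer and StandardAnalyzer feed the indexers, which build the Index
– Searching: searchers read the Index
– A “Lucene” label marks part of the pipeline]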


[Figure: an Index contains Documents; each Document contains Fields; each Field has a Name and a Value.]


• Create an Analyzer
• WhitespaceAnalyzer
– divides text at whitespace
• SimpleAnalyzer
– divides text at non-letters
– converts to lower case
• StopAnalyzer
– SimpleAnalyzer
– removes stop words
• StandardAnalyzer
– good for most European languages
– removes stop words
– converts to lower case
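The first three analyzers can be approximated in Python to see how their token streams differ. These functions only imitate the behaviour described in the list; they are not Lucene's implementation. The stop-word set is a hypothetical sample, and "letters" is narrowed to ASCII letters for brevity. StandardAnalyzer is left out because its tokenizer is more involved.

```python
import re

# Hypothetical stop list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "is"}


def whitespace_analyzer(text):
    """Divide text at whitespace, keep everything else as is."""
    return text.split()


def simple_analyzer(text):
    """Divide text at non-letters (ASCII only here) and lower-case the tokens."""
    return [token.lower() for token in re.split(r"[^A-Za-z]+", text) if token]


def stop_analyzer(text):
    """SimpleAnalyzer followed by stop-word removal."""
    return [token for token in simple_analyzer(text) if token not in STOP_WORDS]


text = "The Map-Reduce model is 40+ years old"
print(whitespace_analyzer(text))
print(simple_analyzer(text))
print(stop_analyzer(text))
```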


Inverted Index (Inverted File)

Doc 1: Penn State Football … football
Doc 2: Football players … State

Posting Table

id   word       doc     offset
1    football   Doc 1   3
                Doc 1   67
                Doc 2   1
2    penn       Doc 1   1
3    players    Doc 2   2
4    state      Doc 1   2
                Doc 2   13
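A posting table of this shape can be built by recording, for each word, the document and the word offset where it occurs. In this sketch `build_postings` is a hypothetical name, offsets count words starting at 1, and the two documents contain only the words visible on the slide. The elided parts of the documents are not reconstructed, so the offsets 67 and 13 from the table do not appear.

```python
from collections import defaultdict


def build_postings(docs):
    """Return {word: [(doc, offset), ...]} with words in alphabetical order."""
    postings = defaultdict(list)
    for doc_id, text in docs.items():
        for offset, word in enumerate(text.lower().split(), start=1):
            postings[word].append((doc_id, offset))
    return dict(sorted(postings.items()))


docs = {"Doc 1": "Penn State Football", "Doc 2": "Football players State"}
table = build_postings(docs)

for posting_id, (word, entries) in enumerate(table.items(), start=1):
    print(posting_id, word, entries)
```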


[Figure: structures consulted to answer a Query, each labelled “Constant time”:
– Term Info Index (in memory)
– Term Dictionary (random file access)
– Frequency File (random file access)
– Position File (random file access)
– Field info (in memory)]


Map/Reduce Cluster Implementation

[Figure: input files (split 0 to split 4) → M map tasks → intermediate files → R reduce tasks → output files (Output 0, Output 1)]

Several map or reduce tasks can run on a single computer

Each intermediate file is divided into R partitions, by partitioning function

Each reduce task corresponds to one partition
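The layout of the intermediate files can be sketched as follows: each of the M map tasks writes one intermediate file with R partitions, and reduce task r reads partition r from every intermediate file. All names and the sample splits are hypothetical; `zlib.crc32(key) mod R` stands in for the partitioning function, which the slide does not specify.

```python
import zlib
from collections import defaultdict


def partition(key, R):
    """Partitioning function: the same key always lands in the same partition."""
    return zlib.crc32(key.encode("utf-8")) % R


def run_map_task(split, R):
    """One map task: return its intermediate file as a list of R partitions."""
    regions = [[] for _ in range(R)]
    for word in split.split():
        regions[partition(word, R)].append((word, 1))
    return regions


def run_reduce_task(r, intermediate_files):
    """Reduce task r: read partition r of every intermediate file."""
    counts = defaultdict(int)
    for regions in intermediate_files:
        for word, n in regions[r]:
            counts[word] += n
    return dict(counts)


splits = ["a b a", "b c", "c c a"]  # M = 3 map tasks
R = 2                               # R = 2 reduce tasks

intermediate = [run_map_task(split, R) for split in splits]
outputs = [run_reduce_task(r, intermediate) for r in range(R)]

merged = {}
for output in outputs:
    merged.update(output)
print(merged)
```

Because the partition depends only on the key, all counts for one word reach the same reduce task, so no word appears in two output files.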


Execution


Hadoop Usage at Facebook

• Data warehouse running Hive
• 600 machines, 4800 cores, 2.4 PB disk
• 3200 jobs per day
• 50+ engineers have used Hadoop


Facebook Data Pipeline

[Figure: data pipeline with these components: Web Servers, Scribe Servers, Network Storage, Hadoop Cluster, Oracle RAC and MySQL; Analysts issue Hive Queries and receive Summaries.]


Facebook Job Types

• Production jobs: load data, compute statistics, detect spam, etc.
• Long experiments: machine learning, etc.
• Small ad-hoc queries: Hive jobs, sampling

GOAL: Provide fast response times for small jobs and guaranteed service levels for production jobs


Cloud Computing Scheduling

• FIFO, Fair-Sharing
• Job scheduling with “constraints”
– Dependency
– Priority-oriented
– Soft Deadline
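The difference between FIFO and fair sharing can be illustrated with a toy slot allocator. The slide only names the policies, so the rules below are simplified assumptions: FIFO hands slots to jobs in arrival order, and fair sharing hands out one slot at a time in round-robin order. The job names, task counts and slot count are hypothetical.

```python
def fifo_allocate(jobs, slots):
    """Give each job, in arrival order, as many slots as it can use."""
    allocation = {}
    for name, tasks in jobs:
        granted = min(tasks, slots)
        allocation[name] = granted
        slots -= granted
    return allocation


def fair_allocate(jobs, slots):
    """Hand out slots one at a time, round-robin over jobs that still have tasks."""
    allocation = {name: 0 for name, _ in jobs}
    remaining = dict(jobs)
    while slots > 0 and any(remaining.values()):
        for name in allocation:
            if slots > 0 and remaining[name] > 0:
                allocation[name] += 1
                remaining[name] -= 1
                slots -= 1
    return allocation


jobs = [("production", 100), ("adhoc", 4)]  # (job name, pending tasks)
fifo = fifo_allocate(jobs, slots=10)
fair = fair_allocate(jobs, slots=10)
print(fifo, fair)
```

Under FIFO the small ad-hoc job waits behind the large one; under fair sharing it gets all the slots it needs right away, which is the behaviour wanted for small jobs.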


Hive

• Developed at Facebook
• Used for majority of Facebook jobs
• “Relational database” built on Hadoop
– Maintains list of table schemas
– SQL-like query language (HQL)
– Can call Hadoop Streaming scripts from HQL
– Supports table partitioning, clustering, complex data types, some optimizations


Creating a Hive Table

CREATE TABLE page_views(viewTime INT, userid BIGINT,
                page_url STRING, referrer_url STRING,
                ip STRING COMMENT 'User IP address')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
STORED AS SEQUENCEFILE;

• Partitioning breaks table into separate files for each (dt, country) pair
Ex: /hive/page_view/dt=2008-06-08,country=US
    /hive/page_view/dt=2008-06-08,country=CA
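The effect of partitioning can be mimicked with a dict keyed by (dt, country): rows land in the bucket for their pair, and a query restricted to a range of dt values only touches the matching buckets. This is an illustration of the idea and not how Hive is implemented. The function names and sample rows are hypothetical, and the path format is copied from the example paths above.

```python
from collections import defaultdict


def partition_path(table, dt, country):
    """Path layout as written in the example paths above."""
    return f"/hive/{table}/dt={dt},country={country}"


partitions = defaultdict(list)  # (dt, country) -> rows stored in that partition


def insert(row):
    partitions[(row["dt"], row["country"])].append(row)


def scan(dt_from, dt_to):
    """Read only the partitions whose dt falls in the requested range."""
    touched = sorted(key for key in partitions if dt_from <= key[0] <= dt_to)
    rows = [row for key in touched for row in partitions[key]]
    return touched, rows


insert({"dt": "2008-06-08", "country": "US", "page_url": "/home"})
insert({"dt": "2008-06-08", "country": "CA", "page_url": "/home"})
insert({"dt": "2008-06-09", "country": "US", "page_url": "/about"})

touched, rows = scan("2008-06-08", "2008-06-08")
print([partition_path("page_view", dt, country) for dt, country in touched])
```

ISO-formatted dates compare correctly as strings, which is why the range check needs no date parsing.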


Simple Query

• Find all page views coming from xyz.com in March 2008:

SELECT page_views.*
FROM page_views
WHERE page_views.dt >= '2008-03-01'
  AND page_views.dt <= '2008-03-31'
  AND page_views.referrer_url like '%xyz.com';

• Hive only reads the partitions for 2008-03-01 through 2008-03-31 instead of scanning the entire table


Aggregation and Joins

• Count users who visited each page by gender:

SELECT pv.page_url, u.gender, COUNT(DISTINCT u.id)
FROM page_views pv JOIN user u ON (pv.userid = u.id)
WHERE pv.dt = '2008-03-03'
GROUP BY pv.page_url, u.gender;

• Sample output: