MapReduce: Simplified Data Processing on Large Clusters
Google, Inc.
Presented by Noha El-Prince
Winter 2011
Problem and Motivations
- Large data sizes
- Limited CPU power on any single machine
- Difficulties of distributed, parallel computing
MapReduce
- MapReduce is a software framework introduced by Google
- Enables automatic parallelization and distribution of large-scale computations
- Hides the details of parallelization, data distribution, load balancing, and fault tolerance
- Achieves high performance
Outline
- MapReduce: Execution Example
- Programming Model
- MapReduce: Distributed Execution
- More Examples
- Customizations on Clusters
- Refinements
- Performance Measurement
- Conclusion and Future Work
- MapReduce in Other Companies
Programming Model

[Figure: raw data flows into the MapReduce library and comes out as reduced, processed data. The Map function M takes (k, v) pairs and emits intermediate (k', v') pairs; the Reduce function R takes (k', <v'>*) and emits <k', v'>*.]
Example
Input:
- Page 1: the weather is good
- Page 2: today is good
- Page 3: good weather is good

Desired output: the frequency with which each word occurs across all pages:
(the 1), (is 3), (weather 2), (today 1), (good 4)
Input Data:
  the weather is good | today is good | good weather is good

Map tasks (M) emit intermediate data:
  (the,1) (weather,1) (is,1) (good,1) (today,1) (is,1) (good,1) (good,1) (weather,1) (is,1) (good,1)

  map(key, value):
    for each word w in value:
      emit(w, 1)

Group by key (grouped data):
  (the,[1]) (weather,[1,1]) (is,[1,1,1]) (good,[1,1,1,1]) (today,[1])

Reduce tasks (R) produce the output data:
  (the,1) (weather,2) (is,3) (good,4) (today,1)

  reduce(key, values):
    result = 0
    for each count v in values:
      result += v
    emit(key, result)
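The map, group-by-key, and reduce steps above can be sketched as a single-process Python simulation. The names `map_fn`, `reduce_fn`, and `map_reduce` are illustrative, not part of the MapReduce library:

```python
from collections import defaultdict

def map_fn(key, value):
    # key: page name, value: page text; emit (word, 1) per occurrence
    return [(w, 1) for w in value.split()]

def reduce_fn(key, values):
    # Sum the counts for one word
    return (key, sum(values))

def map_reduce(inputs):
    # Map phase
    intermediate = []
    for k, v in inputs:
        intermediate.extend(map_fn(k, v))
    # Group by key
    grouped = defaultdict(list)
    for k, v in intermediate:
        grouped[k].append(v)
    # Reduce phase
    return dict(reduce_fn(k, vs) for k, vs in grouped.items())

pages = [("page1", "the weather is good"),
         ("page2", "today is good"),
         ("page3", "good weather is good")]
print(map_reduce(pages))
# {'the': 1, 'weather': 2, 'is': 3, 'good': 4, 'today': 1}
```

In the real system the three phases run on different machines; here they run sequentially, which is enough to check the counts against the slide.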
Programming Model
- Input: a set of key/value pairs
- The programmer specifies two functions:
  - Map: map(k, v) → <k', v'>
  - Reduce: reduce(k', <v'>*) → <k', v'>*
- All v' with the same k' are reduced together
Distributed Execution Overview

[Figure: the user program forks a master and several workers. The master assigns map tasks and reduce tasks to the workers. Map workers read the input splits (Split 0, Split 1, Split 2) and write intermediate results to local disk; reduce workers perform remote reads and sort the intermediate data, then write Output File 0 and Output File 1.]
MapReduce Examples

Distributed Grep:
- Search pattern (key): "virus"
- Map: emits a line if it matches the pattern, e.g. web pages containing "A..virus" and "C..virus" produce (virus, A...) and (virus, B...)
- Reduce: identity function; passes the matching lines through, e.g. (virus, [A.., B..])
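The distributed grep example can be sketched as map and reduce functions. This is a hypothetical single-process sketch; the function names and sample documents are illustrative:

```python
import re

def map_fn(doc_id, text, pattern="virus"):
    # Emit (pattern, line) for each line that matches the search pattern.
    return [(pattern, line) for line in text.splitlines() if re.search(pattern, line)]

def reduce_fn(key, values):
    # Identity reduce: just pass the matching lines through.
    return (key, values)

docs = [("A", "A line with virus\nclean line"),
        ("B", "B also has virus")]
matches = []
for doc_id, text in docs:
    matches.extend(map_fn(doc_id, text))
print(reduce_fn("virus", [line for _, line in matches]))
# ('virus', ['A line with virus', 'B also has virus'])
```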
MapReduce Examples

Count of URL Access Frequency:
- Input: web server logs, e.g. www.cbc.com, www.cnn.com, www.bbc.com, www.cbc.com, www.cbc.com, www.bbc.com
- Map: emits (URL, 1) for each log entry; grouped by key this gives e.g. (CBC, [1,1,1]), (CNN, [1]), (BBC, [1,1])
- Reduce: sums the counts per URL: (CBC, 3), (BBC, 2), (CNN, 1)
MapReduce Examples

Reverse Web-Link Graph:
- Input: web pages, each a source URL with outgoing links (targets); e.g. www.youtube.com and www.disney.com both link to the target www.facebook.com
- Map: emits (target, source) for each link, e.g. (facebook, youtube), (facebook, disney)
- Reduce: concatenates all sources for a given target: (facebook, [youtube, disney])
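The reverse web-link graph can be sketched in the same style; the page list and URLs below are made-up sample data:

```python
from collections import defaultdict

def map_fn(source, targets):
    # Emit (target, source) for every outgoing link on the page.
    return [(t, source) for t in targets]

pages = [("youtube.com", ["facebook.com"]),
         ("disney.com", ["facebook.com", "twitter.com"])]

grouped = defaultdict(list)
for source, targets in pages:
    for target, src in map_fn(source, targets):
        grouped[target].append(src)

# Reduce: the grouped list of sources per target is the answer itself.
print(dict(grouped))
# {'facebook.com': ['youtube.com', 'disney.com'], 'twitter.com': ['disney.com']}
```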
MapReduce Examples

Term-Vector per Host:
- Input: all documents of a host (e.g. facebook)
- Map: emits (hostname, term) pairs, e.g. <facebook, word1>, <facebook, word2>, <facebook, word2>, ...
- Reduce: produces a summary of the most popular terms per host: <facebook, [word2, ...]>
MapReduce Examples

Inverted Index:
- Input: documents
- Map: emits (word, docID) pairs, e.g. <word1, docID1>, <word2, docID1>, <word3, docID2>, <word1, docID2>, <word1, docID3>
- Reduce: emits the list of documents per word: <word1, [docID1, docID2, docID3]>, <word2, [docID1]>
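The inverted index can be sketched the same way; documents and words below are made-up sample data:

```python
from collections import defaultdict

def map_fn(doc_id, text):
    # Emit (word, doc_id) once per distinct word in the document.
    return [(w, doc_id) for w in sorted(set(text.split()))]

docs = [("doc1", "big data systems"),
        ("doc2", "big fast systems"),
        ("doc3", "big ideas")]

index = defaultdict(list)
for doc_id, text in docs:
    for word, d in map_fn(doc_id, text):
        index[word].append(d)

# Reduce: the grouped posting list per word is the inverted index.
print(dict(index))
# {'big': ['doc1', 'doc2', 'doc3'], 'data': ['doc1'], 'systems': ['doc1', 'doc2'],
#  'fast': ['doc2'], 'ideas': ['doc3']}
```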
Customizations on Clusters
- Coordination
- Scheduling
- Fault Tolerance
- Task Granularity
- Backup Tasks
Customizations on Clusters

Coordination: the master data structure tracks each task's type, assigned worker, state, and file location:

Task  Worker        State        File
M     250.133.22.7  completed    Root/intFile.txt
M     250.133.22.8  in progress  Root/intFile.txt
R     250.123.23.3  idle         Root/outFile.txt
Customizations on Clusters

Scheduling: the master's scheduling policy aims to conserve network bandwidth:
1. GFS divides each file into 64 MB blocks.
2. Input data are stored on the workers' local disks (managed by GFS).
   - Locality: the same cluster is used for both data storage and data processing.
3. GFS stores multiple copies of each block (typically 3) on different machines.
Customizations on Clusters

Fault Tolerance
On worker failure:
- Detect failure via periodic heartbeats
- Re-execute completed and in-progress map tasks
- Re-execute in-progress reduce tasks
- Task completion is committed through the master
On master failure:
- Could be handled, but isn't yet (master failure is unlikely)
- The MapReduce task is aborted and the client is notified
Customizations on Clusters

Task Granularity (how are tasks divided?)
Rule of thumb: make M and R much larger than the number of worker machines
→ improves dynamic load balancing
→ speeds recovery from worker failure
Usually R is smaller than M.
Customizations on Clusters

Backup Tasks
- Problem of stragglers: machines that take a long time to complete one of the last few tasks
- When a MapReduce operation is about to complete:
  - The master schedules backup executions of the remaining tasks
  - A task is marked "complete" whenever either the primary or the backup execution completes
- Effect: dramatically shortens job completion time
Refinements
- Partitioning functions
- Skipping bad records
- Status info
- Other refinements
Refinements: Partitioning Function
- MapReduce users specify the number of reduce tasks/output files desired (R)
- For reduce, we need to ensure that records with the same intermediate key end up at the same worker
- The system uses a default partition function, e.g. hash(key) mod R (gives fairly well-balanced partitions)
- It is sometimes useful to override this
  - E.g., hash(hostname(URL key)) mod R ensures that URLs from the same host end up in the same output file
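The default and custom partition functions above can be sketched as follows; `R`, `default_partition`, and `host_partition` are illustrative names, not MapReduce API:

```python
from urllib.parse import urlparse

R = 4  # number of reduce tasks / output files

def default_partition(key):
    # Default partitioner: hash(key) mod R
    return hash(key) % R

def host_partition(url_key):
    # Custom partitioner: hash(hostname(URL)) mod R, so all URLs
    # from one host land in the same reduce partition / output file.
    return hash(urlparse(url_key).hostname) % R

a = host_partition("http://www.cbc.ca/news")
b = host_partition("http://www.cbc.ca/sports")
print(a == b)  # True: same host, same reduce partition
```

Note that Python randomizes string hashes per process, so the partition a given host maps to varies between runs but is stable within one job, which is the property the partitioner needs.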
Refinements: Skipping Bad Records
- Map/Reduce functions sometimes fail for particular inputs
- MapReduce has special treatment for 'bad' input data, i.e. input that repeatedly causes a task to crash
  - The master tracks task crashes, recognizes such situations and, after a number of failed retries, decides to skip that piece of data
- Effect: can work around bugs in third-party libraries
Refinements: Status Information
- Status pages show the progress of the computation
- Links to the standard error and output files generated by each task
- The user can:
  - Predict how long the computation will take
  - Add more resources if needed
  - See which workers have failed
- Useful for diagnosing bugs in user code
Other Refinements
- Combiner function: compression (partial merging) of intermediate data
  - Useful for saving network bandwidth
- User-defined counters
  - Periodically propagated from the worker machines to the master
  - Useful for checking the behavior of MapReduce operations (shown on the master status page)
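A combiner can be sketched as the reduce logic (here, a sum) applied locally on each map worker before the data crosses the network; `combine` and the sample pairs are illustrative:

```python
from collections import Counter

def combine(map_output):
    # map_output: list of (word, 1) pairs emitted by one map task.
    # Partially merge them locally so fewer pairs are shipped to reducers.
    return list(Counter(w for w, _ in map_output).items())

raw = [("good", 1), ("good", 1), ("weather", 1), ("good", 1)]
print(combine(raw))                       # [('good', 3), ('weather', 1)]
print(len(raw), "->", len(combine(raw)))  # 4 -> 2 pairs cross the network
```

A combiner is only safe when the reduce function is associative and commutative, as summation is here.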
Performance
Tests were run on a cluster of 1800 machines; each machine has:
- 4 GB of memory
- Dual-processor 2 GHz Xeons with Hyperthreading
- Dual 160 GB IDE disks
- A gigabit Ethernet link
- Bisection bandwidth of approximately 100-200 Gbps

Two benchmarks:
- Grep: scan 10^10 100-byte records to extract records matching a rare pattern (92K matching records)
- Sort: sort 10^10 100-byte records
Grep
- Setup: M = 15,000 (input split = 64 MB), R = 1, 1764 workers; a 3-character search pattern, found in 92,337 records (all machines assumed identical)
- 1800 machines read 1 TB of data at a peak of ~31 GB/s
- Startup overhead is significant for short jobs (the entire computation takes roughly 80 s plus about a minute of startup)
Sort
- Setup: M = 15,000 (input split = 64 MB), R = 4000, 1746 workers
- [Figure: data-transfer rates over time for (a) normal execution, (b) no backup tasks, (c) 200 tasks killed]
- (a) Normal execution beats the best reported TeraSort benchmark result of 1057 s. Thanks to the locality optimization, the input rate exceeds the shuffle and output rates; the output phase writes 2 copies of the sorted data, so the shuffle rate exceeds the output rate.
- (b) Without backup tasks, 5 stragglers increase the total completion time by 44% over normal execution.
Experience: Rewrite of the Production Indexing System
- New code is simpler and easier to understand
- MapReduce takes care of failures and slow machines
- Easy to make indexing faster by adding more machines
Conclusion & Future Work
- MapReduce has proven to be a useful abstraction
- It greatly simplifies large-scale computations
- It is fun to use: focus on the problem and let the library deal with the messy details
MapReduce Advantages/Disadvantages
Now it's easy to program for many CPUs:
- Communication management is effectively gone
  - I/O scheduling is done for us
- Fault tolerance and monitoring
  - Machine failures, suddenly-slow machines, etc. are handled
- Can be much easier to design and program
- Several (many?) MapReduce tasks can be cascaded
But it restricts the set of solvable problems:
- It might be hard to express a problem in MapReduce
- Data parallelism is key
  - The problem must be breakable into chunks of data
- MapReduce is closed-source (internal to Google) C++
  - Hadoop is an open-source Java-based rewrite
Companies using MapReduce
Amazon: Amazon Elastic MapReduce
- A web service
- Enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data
- Utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3)
- Allows you to use Hadoop with no hardware investment
- http://aws.amazon.com/elasticmapreduce/
- Amazon: building product search indices
- Facebook: processing of web logs, via both MapReduce and Hive
- IBM and Google: making large compute clusters available to higher-education and research organizations
- New York Times: large-scale image conversions
- Yahoo: uses MapReduce and Pig for web log processing, data model training, web map construction, and much, much more
- Many universities, for teaching parallel and large-data systems
And many more; see them all at http://wiki.apache.org/hadoop/PoweredBy