MapReduce - University of Southern California (cacs.usc.edu/education/cs596/08-1MapReduce.pdf)
MapReduce
Aiichiro Nakano
Collaboratory for Advanced Computing & Simulations
Dept. of Computer Science, Dept. of Physics & Astronomy, Dept. of Chemical Engineering & Materials Science, Dept. of Biological Sciences
University of Southern California
Email: [email protected]
Amazon Elastic Compute Cloud (EC2) or Hadoop@USC-HPC
Cloud Computing
MapReduce
• Parallel programming model for data-intensive applications on large clusters
  > User just implements Map() and Reduce()
• Parallel computing framework
  > Libraries take care of everything else
    - Parallelization
    - Fault tolerance
    - Data distribution
    - Load balancing
• Developed at Google
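The division of labor described above can be sketched with a toy, single-machine driver. This is only an illustration, not how Google's library or Hadoop is implemented: `run_mapreduce`, `parity_map`, and `parity_reduce` are hypothetical names, a thread pool stands in for a cluster, and fault tolerance, data distribution, and load balancing are not modeled at all.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_mapreduce(map_fn, reduce_fn, records, workers=4):
    """Toy driver (hypothetical): plays the role of the 'library' side."""
    # Map phase: apply the user's map_fn to each (key, value) record in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(lambda kv: list(map_fn(*kv)), records))

    # Shuffle phase: group intermediate values by intermediate key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)

    # Reduce phase: one call to the user's reduce_fn per distinct key.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# User code: only these two functions are application-specific.
def parity_map(name, numbers):
    # Emit <"even" or "odd", n> for every number in the record.
    for n in numbers:
        yield ("even" if n % 2 == 0 else "odd", n)


def parity_reduce(key, values):
    # Merge all values that share the same key by summing them.
    return sum(values)


result = run_mapreduce(parity_map, parity_reduce,
                       [("a", [1, 2, 3]), ("b", [4, 5, 6])])
print(result)
```

The application code never mentions threads or grouping; swapping the thread pool for a real cluster scheduler would not change `parity_map` or `parity_reduce`.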
Functional Abstraction
• Map and Reduce functions borrowed from functional programming languages (Common LISP example)
  > (mapcar '1+ '(1 2 3 4)) ⇒ (2 3 4 5)
  > (reduce '+ '(1 2 3 4)) ⇒ 10
• Map()
  > Process a key/value pair to generate intermediate key/value pairs
• Reduce()
  > Merge all intermediate values associated with the same key
cf. MPI_Allreduce()
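For readers less familiar with LISP, the same two operations can be written with Python's built-in `map` and `functools.reduce`. This is a rough counterpart of the Common LISP example above, not part of the original slides; the variable names are illustrative.

```python
from functools import reduce
from operator import add

numbers = [1, 2, 3, 4]

# Counterpart of (mapcar '1+ '(1 2 3 4)): apply a function to every element.
incremented = list(map(lambda x: x + 1, numbers))

# Counterpart of (reduce '+ '(1 2 3 4)): fold the list with a binary operator.
total = reduce(add, numbers)

print(incremented)  # [2, 3, 4, 5]
print(total)        # 10
```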
Example: Counting Words
map(String key, String value):
  // key: document name
  // value: document contents
  for each word w in value:
    EmitIntermediate(w, "1");

reduce(String key, Iterator values):
  // key: a word
  // values: a list of counts
  int result = 0;
  for each v in values:
    result += ParseInt(v);
  Emit(AsString(result));
• Map()
  > Input <filename, file text>
  > Parses file and emits <word, count> pairs
    - e.g. <"hello", 1>
• Reduce()
  > Sums values for the same key and emits <word, TotalCount>
    - e.g. <"hello", (3 5 2 7)> ⇒ <"hello", 17>
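The word-count pseudocode can be made runnable as a small sequential Python sketch. It assumes words are separated by whitespace and keeps counts as strings to mirror the pseudocode; `map_words`, `reduce_counts`, `word_count`, and the sample documents are made-up names and data, and the grouping step inside `word_count` stands in for what a real framework does between Map and Reduce.

```python
from collections import defaultdict


def map_words(filename, text):
    """Map(): emit an intermediate <word, "1"> pair for every word."""
    for word in text.split():
        yield word, "1"


def reduce_counts(word, values):
    """Reduce(): sum the counts for one word; return the total as a string."""
    result = 0
    for v in values:
        result += int(v)
    return str(result)


def word_count(documents):
    # Group intermediate values by word (done by the framework in practice).
    grouped = defaultdict(list)
    for filename, text in documents.items():
        for word, count in map_words(filename, text):
            grouped[word].append(count)
    # Call Reduce() once per distinct word.
    return {word: reduce_counts(word, values)
            for word, values in grouped.items()}


# Made-up input documents for illustration.
docs = {"a.txt": "hello world hello", "b.txt": "hello mapreduce"}
counts = word_count(docs)
print(counts)
```

Calling `reduce_counts("hello", ["3", "5", "2", "7"])` reproduces the slide's example of merging partial counts into a total of 17.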
Parallel Execution
MapReduce Resources
• Hadoop implementation
  http://hadoop.apache.org
• MapReduce tutorial
  http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
• Paper
  J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM 51(1), 107 (2008)
• Free account (Amazon Web Services)
  http://aws.amazon.com/free
Cloud Supercomputing