Apache Hadoop MapReduce

- What is it ?
- Why use it ?
- How does it work ?
- Some examples
- Big users


MapReduce – What is it ?

- Processing engine of Hadoop
- Developers create Map and Reduce jobs
- Used for big data batch processing
- Parallel processing of huge data volumes
- Fault tolerant
- Scalable

MapReduce – Why use it ?

- Your data is in the terabyte / petabyte range
- You have huge I/O
- The Hadoop framework takes care of:
  - Job and task management
  - Failures
  - Storage
  - Replication
- You just write Map and Reduce jobs
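To give a feel for what "just writing Map and Reduce jobs" can involve, here is a minimal sketch in the style of Hadoop Streaming, where the mapper and reducer exchange tab-separated key/value lines. The function names `mapper` and `reducer` and the sample input are assumptions made for illustration. The call to `sorted()` only stands in locally for the shuffle/sort that the framework performs on a real cluster.

```python
from itertools import groupby
from typing import Iterable, Iterator


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit one 'word<TAB>1' record per word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_records: Iterable[str]) -> Iterator[str]:
    """Sum the counts of consecutive records that share a key.

    Assumes the records arrive sorted by key, which is what the
    shuffle/sort step provides between map and reduce.
    """
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__":
    # Hypothetical sample input; sorted() plays the role of shuffle/sort.
    sample = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
    for record in reducer(sorted(mapper(sample))):
        print(record)
```

Everything else on the list above (job and task management, failures, storage, replication) is left to the framework, which is why the user code can stay this small.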

MapReduce – How does it work ?

Take word counting as an example, something that Google does all of the time.

MapReduce – How does it work ?

- Input data is split into shards
- Split data is mapped to key,value pairs, e.g. Bear,1
- Mapped data is shuffled/sorted by key, e.g. Bear
- Sorted data is reduced, e.g. Bear,2
- Final data is stored on HDFS
- There might be an extra map layer before the shuffle
- JobTracker controls all tasks in a job
- TaskTracker controls map and reduce tasks
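The steps above can be sketched as a small single-process simulation of the word count cycle. This is only an illustration under stated assumptions: the helper names (`split_input`, `map_shard`, `shuffle`, `reduce_counts`, `word_count`) and the sample text are hypothetical, nothing here uses the Hadoop API, and the result is returned as a dictionary instead of being stored on HDFS.

```python
from collections import defaultdict


def split_input(text, num_shards):
    """Split step: distribute the input lines over num_shards shards."""
    lines = text.splitlines()
    return [lines[i::num_shards] for i in range(num_shards)]


def map_shard(shard):
    """Map step: turn every word in the shard into a (word, 1) pair."""
    return [(word, 1) for line in shard for word in line.split()]


def shuffle(mapped_shards):
    """Shuffle/sort step: group all values by key, with keys in sorted order."""
    grouped = defaultdict(list)
    for pairs in mapped_shards:
        for key, value in pairs:
            grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_counts(grouped):
    """Reduce step: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(text, num_shards=3):
    """Run split -> map -> shuffle -> reduce in a single process."""
    shards = split_input(text, num_shards)
    # On a real cluster each shard would be mapped by a separate task.
    mapped = [map_shard(shard) for shard in shards]
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    sample = "Deer Bear River\nCar Car River\nDeer Car Bear"
    print(word_count(sample))
```

With this sample, each occurrence of Bear is first mapped to the pair (Bear, 1), the shuffle groups the two pairs under the key Bear, and the reduce step turns them into Bear, 2.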

MapReduce - Some examples

A visual example with colours to show you the cycle: Split -> Map -> Shuffle -> Reduce

MapReduce - Some examples

A visual example of MapReduce with job and task trackers added to individual map and reduce jobs.

Hadoop MapReduce – Big users

Users:

- Facebook
- Yahoo
- Amazon
- eBay
