Hortonworks Yarn Code Walk Through January 2014
![Page 1: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/1.jpg)
© Hortonworks Inc. 2013
YARN Code Overview
Ocular bleeding is no reason to stop programming!
![Page 2: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/2.jpg)
Quick Bio – Joseph Niemiec
• Hadoop user for 2+ years
• One of five authors of Apache Hadoop YARN (March 2014)
• Originally used Hadoop for location-based services
  – Destination prediction
  – Traffic analysis
  – Effects of weather at client locations on call center call types
• Pending patent in the automotive/telematics domain
• Defensive paper on M2M validation
• Started on analytics to be better at an MMORPG
![Page 3: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/3.jpg)
Agenda
• What Is YARN
• YARN Concepts & Architecture
• Code and more Code
• Q&A
![Page 4: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/4.jpg)
From Batch To Anything
HADOOP 1.0: Single Use System (Batch Apps)
• MapReduce (cluster resource management & data processing)
• HDFS (redundant, reliable storage)

HADOOP 2.0: Multi Purpose Platform (Batch, Interactive, Online, Streaming, …)
• MapReduce (data processing) and Others (data processing)
• YARN (cluster resource management)
• HDFS2 (redundant, reliable storage)
![Page 5: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/5.jpg)
Concepts
• Application
  – An application is a job submitted to the framework
  – Examples: a MapReduce job, a MoYa cluster
• Container
  – Basic unit of allocation
  – Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
    – container_0 = 2 GB, 1 CPU
    – container_1 = 1 GB, 6 CPU
  – Replaces the fixed map/reduce slots
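To make the container idea concrete, here is a minimal Java sketch of multi-resource allocation using the two container sizes from the slide. It deliberately does not use the Hadoop API: `Capability`, `fitsIn`, `subtractFrom`, and the 8 GB / 8 core node size are all assumptions made up for illustration.

```java
// Hypothetical stand-in for a container's resource vector; not a Hadoop class.
final class Capability {
    final int memoryMb;
    final int vcores;

    Capability(int memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    // True when this capability fits inside what is still free on a node.
    boolean fitsIn(Capability free) {
        return memoryMb <= free.memoryMb && vcores <= free.vcores;
    }

    // What is left of the free capacity after carving this capability out of it.
    Capability subtractFrom(Capability free) {
        return new Capability(free.memoryMb - memoryMb, free.vcores - vcores);
    }
}

class ContainerSizingSketch {
    public static void main(String[] args) {
        Capability free = new Capability(8192, 8);        // assumed node size
        Capability container0 = new Capability(2048, 1);  // container_0 from the slide
        Capability container1 = new Capability(1024, 6);  // container_1 from the slide

        // Place each container only if every resource dimension still fits.
        for (Capability c : new Capability[] {container0, container1}) {
            if (c.fitsIn(free)) {
                free = c.subtractFrom(free);
            }
        }
        System.out.println("free memory MB = " + free.memoryMb + ", free vcores = " + free.vcores);
    }
}
```

With the assumed node size, both containers fit and 5120 MB and 1 core remain. A fixed slot model could not express the second container, which is small in memory but wide in CPU.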
![Page 6: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/6.jpg)
Architecture
• Resource Manager
  – Global resource scheduler
  – Hierarchical queues
• Node Manager
  – Per-machine agent
  – Manages the life cycle of containers
  – Container resource monitoring
• Application Master
  – Per-application
  – Manages application scheduling and task execution
  – E.g. MapReduce Application Master
![Page 7: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/7.jpg)
To the code!
![Page 8: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/8.jpg)
Q&A
![Page 9: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/9.jpg)
YARN - ApplicationMaster
• ApplicationMaster
  – ApplicationSubmissionContext is the complete specification of the ApplicationMaster, provided by the Client
  – The ResourceManager is responsible for allocating and launching the ApplicationMaster container
ApplicationSubmissionContext fields: resourceRequest, containerLaunchContext, appName, queue
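As a rough illustration of what "complete specification" means, the sketch below models the four ApplicationSubmissionContext fields from the slide as a plain Java object and checks that a client has filled in all of them. `SubmissionSketch`, `SubmitSketch`, `isComplete`, the application name, and the `com.example.MyAppMaster` command are hypothetical; the real Hadoop class is not used here, and its resource and launch fields are richer objects than the flattened ones shown.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for ApplicationSubmissionContext; field names follow the slide.
final class SubmissionSketch {
    String appName;
    String queue;
    int amMemoryMb;            // stands in for resourceRequest
    int amVcores;              // stands in for resourceRequest
    List<String> amCommands;   // stands in for containerLaunchContext

    // A ResourceManager-like service can only launch the AM if every part is specified.
    boolean isComplete() {
        return appName != null && !appName.isEmpty()
            && queue != null && !queue.isEmpty()
            && amMemoryMb > 0 && amVcores > 0
            && amCommands != null && !amCommands.isEmpty();
    }
}

class SubmitSketch {
    // Client side: fill in everything needed to start the ApplicationMaster.
    static SubmissionSketch buildContext() {
        SubmissionSketch ctx = new SubmissionSketch();
        ctx.appName = "example-app";   // assumed name
        ctx.queue = "default";         // assumed queue
        ctx.amMemoryMb = 1024;
        ctx.amVcores = 1;
        // Hypothetical AM main class, used only to show the shape of a launch command.
        ctx.amCommands = Arrays.asList("java -Xmx512m com.example.MyAppMaster");
        return ctx;
    }

    public static void main(String[] args) {
        System.out.println("complete = " + buildContext().isComplete());
    }
}
```

The design point is that the client, not the ResourceManager, decides how the ApplicationMaster is started; the ResourceManager only needs a fully populated context to allocate and launch that first container.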
![Page 10: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/10.jpg)
YARN – Resource Allocation & Usage
• ContainerLaunchContext
  – The context provided by the ApplicationMaster to the NodeManager to launch the Container
  – Complete specification for a process
  – LocalResource is used to specify the container binary and dependencies
    – The NodeManager is responsible for downloading them from a shared namespace (typically HDFS)
ContainerLaunchContext fields: container, commands, environment, localResources
LocalResource fields: uri, type
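The relationship between a launch context and its local resources can be sketched in plain Java as follows. This is not the Hadoop API: `LocalResourceSketch`, `LaunchContextSketch`, `downloadPlan`, the HDFS path, and the `com.example.Worker` command are all assumptions chosen for illustration. The point is only that resources are fetched first and the commands run afterwards.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for LocalResource: where the file lives and what kind it is.
final class LocalResourceSketch {
    final String uri;
    final String type;   // e.g. "FILE" or "ARCHIVE"

    LocalResourceSketch(String uri, String type) {
        this.uri = uri;
        this.type = type;
    }
}

// Hypothetical stand-in for ContainerLaunchContext; field names follow the slide.
final class LaunchContextSketch {
    final List<String> commands = new ArrayList<>();
    final Map<String, String> environment = new LinkedHashMap<>();
    final Map<String, LocalResourceSketch> localResources = new LinkedHashMap<>();

    // What a NodeManager-like agent would fetch before running the commands:
    // one "uri -> local name" entry per resource, in insertion order.
    List<String> downloadPlan() {
        List<String> plan = new ArrayList<>();
        for (Map.Entry<String, LocalResourceSketch> e : localResources.entrySet()) {
            plan.add(e.getValue().uri + " -> " + e.getKey());
        }
        return plan;
    }
}

class LaunchSketch {
    static LaunchContextSketch buildContext() {
        LaunchContextSketch ctx = new LaunchContextSketch();
        // Assumed HDFS location of the application jar.
        ctx.localResources.put("app.jar",
            new LocalResourceSketch("hdfs:///apps/example/app.jar", "FILE"));
        ctx.environment.put("CLASSPATH", "./app.jar");
        ctx.commands.add("java com.example.Worker");   // hypothetical worker class
        return ctx;
    }

    public static void main(String[] args) {
        LaunchContextSketch ctx = buildContext();
        for (String step : ctx.downloadPlan()) {
            System.out.println("download " + step);
        }
        System.out.println("run " + ctx.commands.get(0));
    }
}
```

Keying the resources by their local name mirrors the idea that the process sees a file in its working directory, regardless of where it was stored in the shared namespace.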
![Page 11: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/11.jpg)
YARN – Resource Allocation & Usage
• ResourceRequest
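| priority | capability    | resourceName | numContainers |
|----------|---------------|--------------|---------------|
| 0        | <2gb, 1 core> | host01       | 1             |
| 0        | <2gb, 1 core> | rack0        | 1             |
| 0        | <2gb, 1 core> | *            | 1             |
| 1        | <4gb, 1 core> | *            | 1             |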
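The request table can be read as "one container at priority 0, preferably on host01, otherwise on rack0, otherwise anywhere, plus one larger container at priority 1 anywhere". The Java sketch below encodes the four rows and a toy matching rule. It is not the Hadoop API and not how a real scheduler works: `RequestRow`, `locality`, and `bestMatch` are hypothetical, and the rule "lower priority number first, then tightest locality" is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one ResourceRequest row; not the Hadoop class.
final class RequestRow {
    final int priority;
    final int memoryMb;
    final int vcores;
    final String resourceName;   // host name, rack name, or "*" for anywhere
    final int numContainers;

    RequestRow(int priority, int memoryMb, int vcores, String resourceName, int numContainers) {
        this.priority = priority;
        this.memoryMb = memoryMb;
        this.vcores = vcores;
        this.resourceName = resourceName;
        this.numContainers = numContainers;
    }
}

class RequestTableSketch {
    // The four rows from the slide.
    static List<RequestRow> slideTable() {
        return Arrays.asList(
            new RequestRow(0, 2048, 1, "host01", 1),
            new RequestRow(0, 2048, 1, "rack0", 1),
            new RequestRow(0, 2048, 1, "*", 1),
            new RequestRow(1, 4096, 1, "*", 1));
    }

    // 0 = host match, 1 = rack match, 2 = anywhere, -1 = row does not apply to this node.
    static int locality(RequestRow row, String host, String rack) {
        if (row.resourceName.equals(host)) return 0;
        if (row.resourceName.equals(rack)) return 1;
        if (row.resourceName.equals("*")) return 2;
        return -1;
    }

    // Toy rule (assumption): lowest priority number wins, ties go to the tightest locality.
    static RequestRow bestMatch(List<RequestRow> rows, String host, String rack) {
        RequestRow best = null;
        for (RequestRow row : rows) {
            int loc = locality(row, host, rack);
            if (loc < 0) continue;
            if (best == null
                    || row.priority < best.priority
                    || (row.priority == best.priority && loc < locality(best, host, rack))) {
                best = row;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        RequestRow match = bestMatch(slideTable(), "host07", "rack0");
        System.out.println("host07 on rack0 serves the row named " + match.resourceName);
    }
}
```

A free node on host01 would serve the host-level row, another node on rack0 would fall back to the rack-level row, and any other node would serve the `*` row at priority 0 before the 4 GB request at priority 1.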
![Page 12: Hortonworks Yarn Code Walk Through January 2014](https://reader035.fdocuments.us/reader035/viewer/2022070304/54c6569b4a7959d9368b459b/html5/thumbnails/12.jpg)
YARN – Resource Allocation & Usage
• Container
  – The basic unit of allocation in YARN
  – The result of the ResourceRequest, provided by the ResourceManager to the ApplicationMaster
  – A specific amount of resources (CPU, memory, etc.) on a specific machine
Container fields: containerId, resourceName, capability, tokens