Timeline Service v.2 (Hadoop Summit 2016)
Transcript of Timeline Service v.2 (Hadoop Summit 2016)
![Page 1: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/1.jpg)
(Big Data)2
How YARN Timeline Service v.2 Unlocks 360-Degree Platform Insights at Scale
Sangjin Lee @sjlee (Twitter), Li Lu (Hortonworks)
Vrushali Channapattan @vrushalivc (Twitter)
![Page 2: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/2.jpg)
Outline
• Why v.2?
• Highlights
• Developing for Timeline Service v.2
• Setting up Timeline Service v.2
• Milestones
• Demo
![Page 3: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/3.jpg)
Why v.2?
• YARN Timeline Service v 1.x
• Gained good adoption: Tez, Hive, Pig, etc.
• Keeps improving with v 1.5 APIs and storage implementation
• Still facing some fundamental challenges...
![Page 4: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/4.jpg)
Why v.2?
• Scalability and reliability challenges
• Single instance of Timeline Server
• Storage (single local LevelDB instance)
• Usability
• Flow
• Metrics and configuration as first-class citizens
• Metrics aggregation up the entity hierarchy
![Page 5: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/5.jpg)
Highlights

| v.1 | v.2 |
| --- | --- |
| Single writer/reader Timeline Server | Distributed writer/collector architecture |
| Single local LevelDB storage* | Scalable storage (HBase) |
| v.1 entity model | New v.2 entity model |
| No aggregation | Metrics aggregation |
| REST API | Richer query REST API |
![Page 6: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/6.jpg)
Architecture
• Separation of writers (“collectors”) and readers
• Distributed collectors: one collector for each app
• Dedicated RM collector for RM-generated data
• Collector discovery via RM
• Pluggable storage with HBase as default storage
![Page 7: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/7.jpg)
Distributed collectors & readers
[Diagram. Write flow: on the worker node running the AM, the AM sends app metrics/events to the app's timeline collector; NMs on worker nodes running containers send container events/metrics; the RM sends app/container events through its own timeline collector; collectors write to storage. Read flow: user queries go to a pool of timeline readers, which read from storage.]
![Page 8: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/8.jpg)
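Collector discovery
[Diagram: (1) the AM container is started on a worker node, where the NM hosts the app's timeline collector; (2) the NM reports the app id => collector address mapping to the RM in its node heartbeat; (3) the RM returns the address to the AM's timeline client in the allocate response.]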
![Page 9: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/9.jpg)
New entity model
• Flows and flow runs as parents of YARN application entities
• First-class configuration (key-value pairs)
• First-class metrics (single-value or time series)
• Designed to handle multi-cluster environment out of the box
![Page 10: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/10.jpg)
What is a flow?
• A flow is a group of YARN applications that are launched as parts of a logical app
• Oozie, Scalding, Pig, etc.
• name: “frequent_visitor_stat”
• run id: 1466097809000
• version: “b9b9068”
![Page 11: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/11.jpg)
Configuration and metrics
• Now explicit top-level attributes of entities
• Fine-grained updates and queries made possible
• “update metric A to value x”
• “query entities where config A = B”
container 1_1
metric: A = 10
metric: B = 100
config: "Foo" = "bar"
![Page 12: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/12.jpg)
Configuration and metrics
• Now explicit top-level attributes of entities
• Fine-grained updates and queries made possible
• “update metric A to value x”
• “query entities where config A = B”
container 1_1
metric: A = 50
metric: B = 100
config: "Foo" = "bar"
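The two states shown for container 1_1 (metric A going from 10 to 50 while metric B and the config stay untouched) can be sketched in plain Java. `EntitySketch` and its methods are hypothetical names made up for illustration; this is not the Hadoop `TimelineEntity` class, only a rough model of "metrics and configs as top-level attributes".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a timeline entity; not the Hadoop TimelineEntity class.
final class EntitySketch {
    final String id;
    final Map<String, Long> metrics = new LinkedHashMap<>();
    final Map<String, String> configs = new LinkedHashMap<>();

    EntitySketch(String id) {
        this.id = id;
    }

    // "update metric A to value x": only the named metric is touched.
    void updateMetric(String name, long value) {
        metrics.put(name, value);
    }

    // "query entities where config A = B": filter on a single config key.
    static List<EntitySketch> whereConfig(List<EntitySketch> all, String key, String value) {
        List<EntitySketch> matches = new ArrayList<>();
        for (EntitySketch e : all) {
            if (value.equals(e.configs.get(key))) {
                matches.add(e);
            }
        }
        return matches;
    }

    // Builds the container from the slide in its initial state.
    static EntitySketch container11() {
        EntitySketch c = new EntitySketch("container 1_1");
        c.metrics.put("A", 10L);
        c.metrics.put("B", 100L);
        c.configs.put("Foo", "bar");
        return c;
    }

    public static void main(String[] args) {
        EntitySketch c = container11();
        c.updateMetric("A", 50L);
        System.out.println(c.id + " metrics=" + c.metrics + " configs=" + c.configs);
    }
}
```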
![Page 13: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/13.jpg)
HBase Storage
• Scalable backend
• Row Key structure
• efficient range scans
• KeyPrefixRegionSplitPolicy
• Filter pushdown
• Coprocessors for flow aggregation (“readless” aggregation)
• Cell tags for metadata (application id, aggregation operation)
• Cell timestamps generated during put
• left shifted with app id added to avoid overwrites
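The last bullet (shifting the cell timestamp and adding the app id) can be illustrated roughly as follows. The multiplier of 1,000,000 and the three-digit app suffix are assumptions chosen for this sketch, not values taken from the slides, and `CellTimestampSketch` is a hypothetical name. The idea is only that two apps writing the same metric in the same millisecond end up with distinct cell timestamps.

```java
// Hypothetical sketch; the constants below are assumptions, not values from the talk.
final class CellTimestampSketch {
    static final long MULTIPLIER = 1_000_000L; // assumed decimal "left shift"
    static final long SUFFIX_RANGE = 1_000L;   // assumed: keep last 3 digits of the app sequence number

    // Shifts the millisecond timestamp left and fills the freed digits with an app-derived suffix.
    static long supplement(long timestampMillis, long appSequenceNumber) {
        return Math.multiplyExact(timestampMillis, MULTIPLIER) + (appSequenceNumber % SUFFIX_RANGE);
    }

    // Recovers the original millisecond timestamp.
    static long original(long supplemented) {
        return supplemented / MULTIPLIER;
    }

    public static void main(String[] args) {
        long now = 1466097809000L;
        System.out.println(supplement(now, 1));
        System.out.println(supplement(now, 2));
        System.out.println(original(supplement(now, 2)));
    }
}
```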
![Page 14: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/14.jpg)
Tables in HBase
• flow run
• application
• entity
• flow activity
• app to flow
![Page 15: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/15.jpg)
table: flow run
Row key:
clusterId!userName!flowName!inverted(flowRunId)
most recent flow run stored first
coprocessor enabled
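To see why inverting the flow run id puts the most recent run first, here is a minimal sketch. It assumes the inversion is `Long.MAX_VALUE - value` and renders the key as a readable string with `!` separators; the real table stores encoded bytes, and `FlowRunRowKeySketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the flow run row key ordering.
final class FlowRunRowKeySketch {
    // Assumed inversion: larger (newer) run ids map to smaller values.
    static long invert(long value) {
        return Long.MAX_VALUE - value;
    }

    // Readable stand-in for the row key; the inverted id is zero-padded
    // so that string order matches numeric order.
    static String rowKey(String cluster, String user, String flow, long runId) {
        return String.format(Locale.ROOT, "%s!%s!%s!%019d", cluster, user, flow, invert(runId));
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        keys.add(rowKey("cluster1", "user1", "frequent_visitor_stat", 1466097809000L));
        keys.add(rowKey("cluster1", "user1", "frequent_visitor_stat", 1466184209000L));
        Collections.sort(keys); // a scan returns rows in this sorted order
        keys.forEach(System.out::println);
    }
}
```

Because keys for the same cluster, user, and flow share a prefix, a prefix scan returns that flow's runs newest first without any sorting on the reader side.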
![Page 16: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/16.jpg)
table: application
Row key:
clusterId!userName!flowName!inverted(flowRunId)!AppId
applications within a flow run stored together
most recent flow run stored first
![Page 17: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/17.jpg)
table: entity
Row key:
userName!clusterId!flowName!inverted(flowRunId)!AppId!entityType!entityId
entities within an application within a flow run stored together per type
• for example, all containers within a yarn application will be stored together
pre-split table
stores information per entity run like info, relatesTo, relatedTo, events, metrics, config
![Page 18: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/18.jpg)
table: flow activity
Row key:
clusterId!inverted(TopOfTheDay)!userName!flowName
shows the flows that ran on that day
stores information per flow like number of runs, the run ids, versions
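A rough sketch of how such a key could be derived: every timestamp within a day collapses to the same "top of the day" value, so all runs of a flow on that day land in one row, and inverting it lists the most recent day first. The definition of top of the day as midnight UTC and the string rendering are assumptions for illustration; `FlowActivityKeySketch` is a hypothetical name.

```java
import java.util.Locale;

// Hypothetical sketch of the flow activity row key.
final class FlowActivityKeySketch {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Assumed definition: midnight UTC of the day containing the timestamp.
    static long topOfTheDay(long timestampMillis) {
        return timestampMillis - (timestampMillis % MILLIS_PER_DAY);
    }

    // Readable stand-in for clusterId!inverted(TopOfTheDay)!userName!flowName.
    static String rowKey(String cluster, long timestampMillis, String user, String flow) {
        long inverted = Long.MAX_VALUE - topOfTheDay(timestampMillis);
        return String.format(Locale.ROOT, "%s!%019d!%s!%s", cluster, inverted, user, flow);
    }

    public static void main(String[] args) {
        long run = 1466097809000L; // run id from the flow example
        System.out.println(topOfTheDay(run));
        System.out.println(rowKey("cluster1", run, "user1", "frequent_visitor_stat"));
    }
}
```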
![Page 19: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/19.jpg)
table: appToFlow
Row key:
clusterId!appId
- stores mapping of appId to flowName and flowRunId
![Page 20: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/20.jpg)
Metrics aggregation
• Application level
• Rolls up sub-application metrics
• Performed in real time in the collectors in memory
• Flow run level
• Rolls up app level metrics
• Performed in HBase region servers via coprocessors
• Offline aggregation (TBD)
• Rolls up on user, queue, and flow offline periodically
• Phoenix tables
[Diagram: roll-up of a “bytes” metric]
App aggregation (in collector): Container 1_1 (23) + Container 1_2 (135) → App1: 158; Container 2_1 (50) → App2: 50; Container 3_1 (64) → App3: 64
Flow aggregation (in HBase): App1 + App2 → flow1: 208; App3 → flow2: 64
Offline aggregation: flow1 + flow2 → user1: 272, queue1: 272
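The arithmetic of the roll-up example can be reproduced with a small sketch. It only shows the summation up the hierarchy using the "bytes" values from the diagram; it does not model where each level runs (collector, coprocessor, offline job), and `MetricRollupSketch` with its helpers is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: sums a metric from children into their parents, level by level.
final class MetricRollupSketch {
    // Adds each child's value to the total of its parent.
    static Map<String, Long> rollUp(Map<String, Long> childValues, Map<String, String> parentOf) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : childValues.entrySet()) {
            totals.merge(parentOf.get(e.getKey()), e.getValue(), Long::sum);
        }
        return totals;
    }

    // Container -> app level ("bytes" values from the diagram).
    static Map<String, Long> appTotals() {
        Map<String, Long> containers = new LinkedHashMap<>();
        containers.put("Container 1_1", 23L);
        containers.put("Container 1_2", 135L);
        containers.put("Container 2_1", 50L);
        containers.put("Container 3_1", 64L);
        Map<String, String> appOf = new LinkedHashMap<>();
        appOf.put("Container 1_1", "App1");
        appOf.put("Container 1_2", "App1");
        appOf.put("Container 2_1", "App2");
        appOf.put("Container 3_1", "App3");
        return rollUp(containers, appOf);
    }

    // App -> flow level.
    static Map<String, Long> flowTotals() {
        Map<String, String> flowOf = new LinkedHashMap<>();
        flowOf.put("App1", "flow1");
        flowOf.put("App2", "flow1");
        flowOf.put("App3", "flow2");
        return rollUp(appTotals(), flowOf);
    }

    // Flow -> user level.
    static Map<String, Long> userTotals() {
        Map<String, String> userOf = new LinkedHashMap<>();
        userOf.put("flow1", "user1");
        userOf.put("flow2", "user1");
        return rollUp(flowTotals(), userOf);
    }

    public static void main(String[] args) {
        System.out.println(appTotals());
        System.out.println(flowTotals());
        System.out.println(userTotals());
    }
}
```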
![Page 21: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/21.jpg)
FlowRun Aggregation
via the HBase Coprocessor
App Metrics
Cells in
HBase
FlowRun Metric Sum
![Page 22: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/22.jpg)
App Metrics
Cells in
HBase
FlowRun Metric Sum
FlowRun Aggregation
via the HBase Coprocessor
![Page 23: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/23.jpg)
Reader REST API: paths
• URLs under /ws/v2/timeline
• Canonical REST style URLs: /ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_name/runs/run_id
• Path elements may be omitted if they can be inferred
• flow context can be inferred by app id
• default cluster is assumed if cluster is omitted
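As a small illustration of the canonical path and of omitting the cluster element, the sketch below assembles a flow run path from its parts. `ReaderPathSketch` is a hypothetical helper, and it assumes the names are already URL-safe (no escaping is done).

```java
// Hypothetical helper that assembles a reader path; names are assumed to be URL-safe.
final class ReaderPathSketch {
    static final String BASE = "/ws/v2/timeline";

    // Pass null for cluster to omit the clusters/... element (default cluster assumed by the reader).
    static String flowRunPath(String cluster, String user, String flow, long runId) {
        StringBuilder path = new StringBuilder(BASE);
        if (cluster != null) {
            path.append("/clusters/").append(cluster);
        }
        path.append("/users/").append(user)
            .append("/flows/").append(flow)
            .append("/runs/").append(runId);
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(flowRunPath("cluster1", "user1", "frequent_visitor_stat", 1466097809000L));
        System.out.println(flowRunPath(null, "user1", "frequent_visitor_stat", 1466097809000L));
    }
}
```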
![Page 24: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/24.jpg)
Reader REST API: query params
• limit, createdTimeStart, createdTimeEnd: constrain the entities
• fields (ALL | EVENTS | INFO | CONFIGS | METRICS | RELATES_TO | IS_RELATED_TO): limit the contents to return
• metricsToRetrieve, confsToRetrieve: further limit the contents to return
• metricsLimit: limits the number of values in a time series
![Page 25: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/25.jpg)
Reader REST API: query params
• relatesTo, isRelatedTo: filters by association
• *Filters: filters by info, config, metric, event, …
• Supports complex filters including operators
• metricFilter=(((metric1 eq 50) AND (metric2 gt 40)) OR (metric1 lt 20))
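The boolean structure of the example filter can be spelled out as a plain Java predicate. This is only a client-side illustration of what the expression means; the actual filtering is done by the reader, and `MetricFilterSketch` and its helpers are hypothetical names. A metric that is absent is assumed not to match.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of the filter ((metric1 eq 50) AND (metric2 gt 40)) OR (metric1 lt 20).
final class MetricFilterSketch {
    static Predicate<Map<String, Long>> eq(String name, long value) {
        return m -> m.containsKey(name) && m.get(name) == value;
    }

    static Predicate<Map<String, Long>> gt(String name, long value) {
        return m -> m.containsKey(name) && m.get(name) > value;
    }

    static Predicate<Map<String, Long>> lt(String name, long value) {
        return m -> m.containsKey(name) && m.get(name) < value;
    }

    // a.and(b).or(c) evaluates as (a AND b) OR c, matching the parenthesization above.
    static final Predicate<Map<String, Long>> FILTER =
        eq("metric1", 50).and(gt("metric2", 40)).or(lt("metric1", 20));

    // Convenience wrapper for an entity that carries both metrics.
    static boolean matches(long metric1, long metric2) {
        Map<String, Long> metrics = new HashMap<>();
        metrics.put("metric1", metric1);
        metrics.put("metric2", metric2);
        return FILTER.test(metrics);
    }

    public static void main(String[] args) {
        System.out.println(matches(50, 41));
        System.out.println(matches(50, 40));
        System.out.println(matches(10, 0));
    }
}
```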
![Page 26: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/26.jpg)
Developing: TimelineClient
In your application master:

    // create TimelineClient v.2 style
    TimelineClient client = TimelineClient.createTimelineClient(appId);
    client.init(conf);
    client.start();

    // bind it to AM/RM client to receive the collector address
    amRMClient.registerTimelineClient(client);

    // create and write timeline entities
    TimelineEntity entity = new TimelineEntity();
    client.putEntities(entity);

    // when the app is complete, stop the timeline client
    client.stop();
![Page 27: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/27.jpg)
Developing: Flow context
In your app submitter:

    ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();

    // set the flow context as YARN application tags
    Set<String> tags = new HashSet<>();
    tags.add(TimelineUtils.generateFlowNameTag("distributed grep"));
    tags.add(TimelineUtils.generateFlowVersionTag(
        "3df8b0d6100530080d2e0decf9e528e57c42a90a"));
    tags.add(TimelineUtils.generateFlowRunIdTag(System.currentTimeMillis()));

    appContext.setApplicationTags(tags);
![Page 28: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/28.jpg)
Setting up Timeline Service v.2
• Set up the HBase cluster (1.1.x)
• Add the timeline service jar to HBase
• Install the flow run coprocessor
• Create tables via TimelineSchemaCreator utility
• Configure the YARN cluster
• Enable Timeline Service v.2
• Add hbase-site.xml for the timeline collector and readers
• Start the timeline reader daemon
![Page 29: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/29.jpg)
Milestone 1 ("Alpha 1")
• Merge discussion (YARN-2928) in progress as we speak!
✓ Complete end-to-end read/write flow
✓ Real time application and flow aggregation
✓ New entity model
✓ HBase Storage
✓ Rich REST API
✓ Integration with Distributed Shell and MapReduce
✓ YARN generic events and system metrics
![Page 30: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/30.jpg)
Milestones - Future
• Milestone 2 (“Alpha 2”)
• Integration with new YARN UI
• Integration with more frameworks
• Beta
• Freeze API and storage schema
• Security
• Collectors as containers
• Storage fault tolerance
• Production-ready
• Migration-ready
![Page 31: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/31.jpg)
Demo
![Page 32: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/32.jpg)
Contributors
• Li Lu, Junping Du, Vinod Kumar Vavilapalli (Hortonworks)
• Varun Saxena, Naganarasimha G. R. (Huawei)
• Sangjin Lee, Vrushali Channapattan, Joep Rottinghuis (Twitter)
• Zhijie Shen (now at Facebook)
• The HBase and Phoenix community!
![Page 33: Timeline Service v.2 (Hadoop Summit 2016)](https://reader031.fdocuments.us/reader031/viewer/2022030314/58890c221a28ab4a5c8b4f0f/html5/thumbnails/33.jpg)
Thank you!