From Rivulets to Rivers: Elastic Stream Processing in Heron

From Rivulets to Rivers:Elastic Stream Processing in Heron

Bill Graham , Twitter - @billgrahamAshvin Agrawal, MicrosoftAvrilia Floratou, Microsoft

Prediction is very difficult, especially if it’s about the future.- Nils Bohr

We cannot direct the wind, but we can adjust the sails.- Dolly Parton

Outline

Heron Overview

Elastic Scaling Opportunities

Current Implementation

Work in Progress – Auto-scaling

A realtime, distributed, fault-tolerant stream processing engine.

About Heron Developed by Twitter in 2014 Open sourced in May 2016 Storm API compatible Isolation at all levels:

- Topology- Container- Task (process-based)

At least once, at most once semantics Backpressure Low resource overhead (< 10%)

Logical Topology

Spout 1

Spout 2

Bolt 1

Bolt 2

Bolt 3

Bolt 4

Bolt 5

Physical Execution

Spout 1

Spout 2

Bolt 1

Bolt 2

Bolt 3

Bolt 4

Bolt 5

Packing Plan How to distribute instances onto containers? IPacking.pack()

Topology SubmissionContainer 2

Container 0

Stream Manager

S1 S2 B3

B4 B5 B6

TopologyMaster

HeronClient

HeronScheduler

Container 3

Stream Manager

S1 B2 B3

B4 B5 B6

Container 1

Stream Manager

S1 S2 B3

B4 B5 B6

heron submit

PackingPlan

Instances RegisterStream Manager RegistersData Flows

Containers AllocatedProcesses Initialize

Data Rate Variations

Parallelism Challenges

Anticipating component parallelism is difficult

Changing parallelism is costly - O(hour)

- code change, review, merge, build, kill, submit

Tuning for load spikes or valleys is manual - O(day)

Under-provisioning leads to back pressure leads to support costs

Over-provisioning is the norm

Over-provisioning

CPU Used

CPU Requested

Elastic Scaling Opportunity

Reduce administration cost

Reduce support cost

Reduce hardware cost

Provide better SLA

User Tasks Heron System Tasks

Ordinary Topology Management Process

Kill Topology

Submit Topology

Create Packing

AcquireResource

Monitor / Estimate

Build State

Start Topology

Install Topology

Time Consuming Tasks

ReleasesResources

Low-cost Topology “update”

32 2 34 4

User Tasks Heron System Tasks

Optimized Topology Scale-up Process

KillTopology

Submit Topology

Create Packing

AcquireResources

Monitor / Estimate

Build State

Start Topology

Install Topology

Update Topology

PauseTopology

Un-PauseTopology

Add / Reduce

ResourcesPrepare

Components

heron “update” …

Minimizes Disruption

Aggressively Prunes Containers

Aims to Maintain Uniform Component Distribution

$ heron update my_cluster/user/dev MyTopology \--component-parallelism=bolt1:20 \--component-parallelism=bolt2:40

Available in 0.14.5

Execution Time O(mins)

Customizable Through IRepacking.repack()

Current Limitations

Automated state transition not yet supported

- Component scaling event notification : IUpdatable.update()

- Example: KafkaSpout queue partition mappings

Fields group routing might change

- Workaround: pause topology > cache flush interval before scaling

Algorithmic Auto-Scaling

User Tasks Heron System TasksUser Tasks Heron System Tasks

Algorithmic Auto-Scaling …

Submit Topology

Create Packing

AcquireResources

Monitor / Estimate

Build State

Start Topology

Install Topology

Update Topology

PauseTopology

Un-PauseTopology

Add / Reduce

ResourcesPrepare

Components

Auto-Scaling

Heron should automatically identify

variations in the incoming load and

react to them.

Dhalion periodically observes the state of the topology and determines

whether resources should be scaled up or

Heron uses Dhalion to adjust to external shocks.

Dhalion is a framework that provides

self-regulating capabilities to Heron and will be

open-sourced in the near future.

Using Dhalion to Auto-Scale

Pending Packets Detector

Backpressure Detector

Processing Rate Skew Detector

Resource Underprovisioning

Diagnoser

Data Skew Diagnoser

Slow Instances Diagnoser

Resolver Invocation

Diagnosis

Symptom Detection Diagnosis Generation

Bolt Scale Down

Resolver

Bolt Scale Up Resolver

Restart Instances Resolver

Resource Overprovisioning

Diagnoser

Data Skew Resolver

Resolution

Metrics

Dhalion’s scales up and down the topology resources as needed while still keeping the topology in a steady state where backpressure is not observed

Initial Results

0 7 14 21 28 35 42 49 56 63 70 77 84 91 98 105 112 119

1.40Spout Splitter Bolt

Time (in minutes)

put Scale

Scale Up

S3Dhalion is able to adjust the

topology resources on-the-fly when workload spikes occur.

Our policy eventually reaches a healthy state where

backpressure is not observed and the overall throughput is

maximized.

Future Plans

Use Dhalion to enforce throughput and latency

SLOsand to auto-tune Heron

topologies.

Open-source Dhalion and the auto-scaling

policy as part of Heron.

Combine scaling with stateful stream

processing.

Get Involved

http://github.com/twitter/heron

http://heronstreaming.io

@heronstreaming

Up Next

Anomaly detection in real-time data streams using Heron

Arun Kejariwal, Machine ZoneKarthik Ramasamy, Twitter

Questions?

From Rivulets to Rivers: Elastic Stream Processing in Heron

Internet

Transcript of From Rivulets to Rivers: Elastic Stream Processing in Heron

Datalogic Heron d130

Heron Resources · 6/8/2016 · Heron Resources Heron Resources LimitedLimited Community Consultation Committee June 2016 4 Heron Resources Activities Completed Feasibility Study

The Heron Mapping Client

Heron Tower 2

Blue Heron Biotechnology

Rivulets of the Absolute Prf10 - 1106 Design...xii xiii Rivulets of the Absolute Blessings From the Master own proprioceptive language and learn to rely upon the inner guru. This is

Heron Family

Heron Instruments - Maintenance Guide for H.01L and Sm.Oil Oieonpro.americommerce.com/manuals/Heron-InterfaceManual.pdf · Heron Instruments - Maintenance Guide for H.01L and Sm.Oil

Blue Heron Chapter of the Sumi-e Society of America Blue ... Heron Dec. 2015 newsletter.pdf · Blue Heron Chapter of the Sumi-e Society of America Blue Heron News ... Goodman Twenty

Bushcamp Company African Darter Reed Cormorant Hamerkop Goliath Heron. Grey Heron Black-headed Heron . Great White Egret . Cattle Egret . Squacco Heron..... Green-backed Heron. Dwarf

Blue heron 1

enn with - I T.A.K.E. (Un) confitakeunconf.com/.../Automate-all-things-AWS-with-Ansible.pdfAWS WITH ANSIBLE elastic elastic elastic iii elastic elastic elastic elastic elastic elastic

HERON PARK BAPTIST CHURCH

BLUE HERON NEWSLETTER - Home - Blue Heron School

Taitu Heron - Globalization, Neoliberalism,

CDC – The Heron

Twitter Heron

Blue Heron Tavern Pyrography

Heron Pool Hoist_2015

Heron Bay Singapore