
CC-BY 4.0. This is the author's preprint version of the camera-ready article. A full version of this paper is published in the proceedings of the 14th ACM International Conference on Distributed and Event-Based Systems (DEBS 2020).


Triggerflow: Trigger-based Orchestration of Serverless Workflows

Pedro García López, IBM Watson Research, New York, USA ([email protected])

Aitor Arjona, Universitat Rovira i Virgili, Tarragona, Spain ([email protected])

Josep Sampé, Universitat Rovira i Virgili, Tarragona, Spain ([email protected])

Aleksander Slominski, IBM Watson Research, New York, USA ([email protected])

Lionel Villard, IBM Watson Research, New York, USA ([email protected])

ABSTRACT

As more applications are being moved to the Cloud thanks to serverless computing, it is increasingly necessary to support native life cycle execution of those applications in the data center.

But existing systems either focus on short-running workflows (like IBM Composer or Amazon Express Workflows) or impose considerable overheads for synchronizing massively parallel jobs (Azure Durable Functions, Amazon Step Functions, Google Cloud Composer). None of them are open systems enabling extensible interception and optimization of custom workflows.

We present Triggerflow: an extensible Trigger-based Orchestration architecture for serverless workflows built on top of Knative Eventing and Kubernetes technologies. We demonstrate that Triggerflow is a novel serverless building block capable of constructing different reactive schedulers (State Machines, Directed Acyclic Graphs, Workflow as code). We also validate that it can support high-volume event processing workloads, auto-scale on demand, and transparently optimize scientific workflows.

KEYWORDS

event-based, orchestration, serverless

1 INTRODUCTION

Serverless Function as a Service (FaaS) is becoming a very popular programming model in the cloud thanks to its simplicity, billing model, and inherent elasticity. The FaaS programming model is considered event-based, since functions are activated (triggered) in response to specific Cloud Events (like a state change in a disaggregated object store such as Amazon S3).

The FaaS model has also proven ideally suited (PyWren [14], ExCamera [9]) for executing embarrassingly parallel computing tasks. But both PyWren and ExCamera required their own ad-hoc external orchestration services to synchronize the parallel executions of functions. For example, when the PyWren client launches a map job with N functions, it waits and polls Amazon S3 until all the results are received in the S3 bucket. ExCamera also relied on an external Rendezvous server to synchronize the parallel executions.

Lambda creator Tim Wagner recently outlined [24] that Cloud providers must offer new serverless building blocks to applications. In particular, he foresees new services like fine-grained, low-latency orchestration, execution data flows, and the ability to customize code and data at scale to support the emerging data-intensive applications over Serverless Functions.

The reality is that existing serverless orchestration systems are not designed for long-running data analytics tasks [3, 18]. Either they are focused on short-running, highly interactive workflows (Amazon Express Workflows, IBM Composer) or impose considerable overheads for synchronizing massively parallel jobs (Azure Durable Functions, Amazon Step Functions, Google Cloud Composer).

We present Triggerflow, a novel building block for composing event-based services. As more applications are moved to the Cloud, this service will enable controlling the life cycle of those applications in a reactive and extensible way. The flexibility of the system can also be used to transparently optimize the execution of tasks in reaction to events.

The major contributions of this paper are the following:

(1) We present a Rich Trigger framework following an Event-Condition-Action (ECA) architecture that is extensible at all levels (Event Sources and programmable Conditions and Actions). Our architecture ensures that composite event detection and event routing mechanisms are mediated by reactive event-based middleware.

(2) We demonstrate Triggerflow's extensibility and universality by creating atop it a state machine workflow scheduler, a DAG engine, an imperative Workflow as Code (using event sourcing) scheduler, and an integration with an external scheduler like PyWren. We also validate the performance and overhead of our scheduling solutions compared to existing Cloud serverless orchestration systems like Amazon Step Functions, Amazon Express Workflows, Azure Durable Functions, and IBM Composer.

(3) We finally propose a generic implementation of our model over standard CNCF Cloud technologies like Kubernetes, Knative Eventing, and CloudEvents. We validate that our system can support high-volume event processing workloads, auto-scale on demand, and transparently optimize scientific workflows. The project is available as open source in [1].

arXiv:2006.08654v2 [cs.DC] 17 Jun 2020


2 RELATED WORK

FaaS is based on the event-driven programming model. In fact, many event-driven abstractions like triggers, Event Condition Action (ECA), and even composite event detection were already inspired by the veteran Active Database Systems [21].

Event-based triggering has also been extensively employed in the past to provide reactive coordination of distributed systems [11, 20]. Event-based mechanisms and triggers have also been used extensively [4, 6, 10, 17] to build workflow and orchestration systems. The ECA model, including triggers and rules, fits nicely to define the transitions of finite state machines representing workflows. In [7], they propose to use synchronous aggregation triggers to coordinate massively parallel data processing jobs.

An interesting related work is [17]. They leverage composite subscriptions in content-based publish/subscribe systems to provide decentralized event-based workflow management. Their PADRES system supports parallelization, alternation, sequence, and repetition compositions thanks to content-based subscriptions in a Composite Subscription Language.

More recently, a relevant article [22] has surveyed the intersections of the Complex Event Processing (CEP) and Business Process Management (BPM) communities. They clearly present the existing challenges to combining both models and describe recent efforts in this area. We note that our paper is in line with their challenge "Executing business processes via CEP rules", and our novelty here is our serverless reactive and extensible architecture.

In serverless settings, the most relevant related work aiming to provide reactive orchestration of serverless functions is the Serverless trilemma [2] from IBM. In their paper, the authors advocate for reactive run-time support for function orchestration, and present a solution for sequential compositions on top of Apache OpenWhisk.

A plethora of academic works propose different so-called serverless orchestration systems like [5, 8, 13, 15, 19, 23]. However, most of them rely on non-serverless services like VMs or dedicated resources, or they use functions-calling-functions patterns that complicate their architectures and fault tolerance. None of them offer extensible trigger abstractions to build different schedulers.

All Cloud providers now offer cloud orchestration and function composition services like IBM Composer, Amazon Step Functions, Azure Durable Functions, or Google Cloud Composer.

The IBM Composer service is in principle designed for short-running synchronous composition of serverless functions. IBM Composer generates a state machine representation of the workflow to be executed with IBM Cloud Functions. It can represent sequences, conditional branching, loops, parallel, and map tasks. However, fork/join synchronization (map, parallel) blocks on an external user-provided Redis service, limiting its applicability to short-running tasks.

Amazon offers two main services: Amazon Step Functions (ASF) and Amazon Step Functions Express Workflows (ASFE). The Amazon States Language (based on JSON) permits modeling task transitions, choices, waits, parallel, and maps in a standard way. ASF is a fault-tolerant managed service designed to support long-running workflows, whereas ASFE is designed for short-running (less than five minutes) highly intensive workloads with relaxed fault tolerance.

Microsoft's Azure Durable Functions (ADF) represents workflows as code using C# or JavaScript, leveraging async/await constructs and using event sourcing to replay workflows that have been suspended. ADF does not support map jobs explicitly, and only includes a Task.WhenAll abstraction enabling fork/join patterns for a group of asynchronous tasks.

Google offers the Google Cloud Composer service, leveraging a managed Apache Airflow cluster. Airflow represents workflows as a DAG (Directed Acyclic Graph) coded in Python, so it cannot support cycles. It is not ideally suited for parallel jobs or high-volume workflows, and it is not designed for orchestrating serverless functions.

Two previous papers [3, 18] have compared public FaaS orchestration services for coordinating massively parallel workloads. In those studies, IBM Composer offered the fastest performance and reduced overheads to execute map jobs, whereas ASF and ADF imposed considerable overheads. We will also show in this paper how ASFE obtains good performance for parallel workloads.

None of the existing cloud orchestration services offers an open and extensible trigger-based API enabling the creation of custom workflow engines. We demonstrate in this paper that we can use Triggerflow to implement existing models like ASF or Airflow DAGs. Triggerflow is not just another scheduler, but a reactive meta-tool to build reactive schedulers leveraging Knative standard technologies.

2.1 Cloud Event Routing and Knative Eventing

Event-based architectures are gaining relevance in Cloud providers as a unifying infrastructure for heterogeneous cloud services and applications. Event services participate in the entire cloud control loop, from event production in event sources, to event detection using monitoring services, to event logging and data analytics of existing event workflows, and finally to service orchestration and event reaction thanks to appropriate filtering mechanisms.

The trend is to create cloud event routers: specialized rule-based multi-tenant services capable of filtering events and triggering selected targets in the Cloud in response to them. Amazon offers EventBridge, Azure offers EventGrid, and Google and IBM are investing in the open Knative Eventing project and the CNCF CloudEvents standard.

The Knative project was created to provide a streamlined serverless-like experience for developers using Kubernetes. It contains a set of high-level abstractions related to scalable functions (Knative Serving) and event processing (Knative Eventing) that allow the description of asynchronous, decoupled, event-driven applications built out of event sources, sinks, channels, brokers, triggers, filters, sequences, etc.

The goal of Knative is to allow developers to build cloud-native event-driven serverless applications on those abstractions. The value of Knative is to encapsulate well-tested best practices in high-level abstractions that are native to Kubernetes: custom resource definitions (CRDs) for new custom resources (CRs) such as event sources. These abstractions allow developers to describe event-driven application components with late binding to the underlying (possibly multiple) messaging and eventing systems, like Apache Kafka and NATS among others.


Triggerflow aims to leverage existing event routing technology (Knative Eventing) to enable extensible trigger-based orchestration of serverless workflows. Triggerflow includes advanced abstractions not present in Knative Eventing, like dynamic triggers, trigger interception, custom filters, termination events, and a shared context, among others. Some of these novel services may be adopted in the future by event routing services to make it easier to compose, stream, and orchestrate tasks.

3 TRIGGERFLOW ARCHITECTURE

Figure 1 shows an overall diagram of the Triggerflow architecture. The Trigger service follows an extensible Event-Condition-Action architecture. The service can receive events from different Event Sources in the Cloud (Kafka, RabbitMQ, Object Storage, timers). It can execute different types of Actions (containers, Functions, VMs). And it also enables the creation of custom filters or Conditions from third parties. The Trigger service also provides a shared persistent context repository providing durability and fault tolerance.

Figure 1 also shows the basic API exposed by Triggerflow: createWorkflow initializes the context for a given workflow, addTrigger adds a new trigger (including event, conditions, actions, and context), addEventSource permits the creation of new event sources, and getState obtains the current state associated with a given trigger or workflow.
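To make the API concrete, the following is a minimal sketch of a Python client driving it over HTTP. The TriggerflowClient class, the endpoint paths, and the JSON payloads are illustrative assumptions, not the project's actual SDK:

import requests

class TriggerflowClient:
    # Hypothetical thin client for the Triggerflow RESTful API.
    def __init__(self, endpoint):
        self.endpoint = endpoint.rstrip('/')

    def create_workflow(self, workflow_id):
        # Initializes the shared context for a given workflow.
        return requests.post(f'{self.endpoint}/workflow',
                             json={'id': workflow_id}).json()

    def add_event_source(self, workflow_id, spec):
        # Registers a new event source (e.g., a Kafka topic) for the workflow.
        return requests.post(f'{self.endpoint}/workflow/{workflow_id}/eventsource',
                             json=spec).json()

    def add_trigger(self, workflow_id, event, condition, action, context=None):
        # Adds an ECA trigger: run `action` when `condition` holds over `event`.
        return requests.post(f'{self.endpoint}/workflow/{workflow_id}/trigger',
                             json={'event': event, 'condition': condition,
                                   'action': action, 'context': context or {}}).json()

    def get_state(self, workflow_id):
        # Obtains the current state associated with the workflow.
        return requests.get(f'{self.endpoint}/workflow/{workflow_id}/state').json()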

Different applications and schedulers can benefit from serverless awakening and rich triggering by using this API to build different orchestration services, like Airflow-style DAGs, ASF state machines, or Workflow as Code clients like PyWren.

3.1 Design goals

Let us establish a number of design goals that must be supported by the proposed architecture:

(1) Support for Heterogeneous Workflows: The main idea is to build a generic building block for different types of schedulers. The system should support enterprise workflows based on Finite State Machines, Directed Acyclic Graphs, and Workflow as Code systems.

(2) Extensibility and Computational Reflection: The system must be extensible enough to support the creation of novel workflow systems with special requirements, like specialized scientific workflows. The system must support introspection and interception mechanisms enabling the monitoring and optimization of existing workflows.

(3) Serverless design: The system must be reactive, and only execute logic in response to events, like state transitions. Serverless design also entails pay-per-use, flexible scaling, and dependability.

(4) Performance: The system should support high-volume workloads like data analytics pipelines with numerous parallel tasks. The system should exhibit low overheads for both short-running and long-running workflows.

3.2 Trigger service

Our proposal is to design a purely event-driven and reactive architecture for workflow orchestration. Like previous works [4, 6, 10], we also propose to handle state transitions using event-based triggering mechanisms. The novelty of our approach relies precisely on the aforementioned design goals: support for heterogeneous workflows, extensibility, serverless design, and performance for high-volume workloads.

We follow an Event Condition Action architecture in which triggers (active rules) define which action must be launched in response to Events, or to Conditions evaluated over one or more Events. The system must be extensible at all levels: Events, Conditions, and Actions. Let us introduce some definitions for our event-based orchestration model.

Definition 1. Workflow: We can represent a workflow as a Finite State Machine (FSM) given by a 6-tuple M = (Σ_in, Ctx, S, s, F, δ), where:

(1) Σ_in: the set of input events
(2) Ctx: the set of context variables
(3) S: the set of states, which map to Actions in the ECA model
(4) s: the initial state, linked to an initial event
(5) F: the end state, linked to a final Termination event
(6) δ: the state-transition function δ: S × Σ_in → S, based on the ECA triggers

Definition 2. Trigger (δ): can be defined as the state-transition function. A trigger is a 4-tuple (Event, Context, Condition, Action) that moves the machine from one state to the next when the condition on input events holds. In this case, the trigger launches the appropriate action, which corresponds to the next state. Each action will in turn fire events that may be captured by another trigger. Triggers can be transient and dynamic (activated on demand), or persistent if they remain always active. Its components are:

• Event: Events are the atomic pieces of information that drive flows in Cloud applications. We rely on the standard CNCF CloudEvents version 1.0 specification to represent events. To match an event to its trigger, the subject and type fields of a CloudEvent are used. We use the subject field to match the event to its corresponding trigger, and the type field to describe the type of the event. Termination and failure events use this type field to notify success or failure.

• Context: The context is a fault-tolerant key-value data structure that contains the state of the trigger during its lifetime. It is also used to introspect the current trigger deployment, to modify the state of other triggers, or to dynamically activate/deactivate triggers.

• Condition: Conditions are active rules (user-defined code) that filter events to decide if they match in order to launch the corresponding action. Conditions evaluate rules over primitive (single) events or over composite (group) events. Composite event information, like counters, may be stored in the Context. Conditions produce a boolean result that represents whether the trigger has to be fired or not.

• Action: Actions are the computations (user-defined code) launched in response to matching Conditions in a trigger. An Action can be a serverless function, a VM, or a container. When it is executed, we consider that the trigger has been fired.
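To make these definitions concrete, here is a minimal sketch (ours, not the paper's actual implementation) of a trigger as a 4-tuple evaluated against CloudEvents-style dictionaries, including a composite join condition that keeps its counter in the Context:

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Trigger:
    # Illustrative sketch of Definition 2; the class itself is hypothetical.
    event_subject: str                            # matched against CloudEvent 'subject'
    context: dict = field(default_factory=dict)   # fault-tolerant key-value state
    condition: Callable[[dict, dict], bool] = lambda evt, ctx: True
    action: Callable[[dict, dict], None] = lambda evt, ctx: None

    def process(self, event: dict):
        # Fire the action only if the event matches and the condition holds.
        if event.get('subject') == self.event_subject and \
           self.condition(event, self.context):
            self.action(event, self.context)

def join_n(n):
    # Composite (aggregate) condition: holds once n events have arrived,
    # keeping the counter in the trigger Context.
    def condition(evt, ctx):
        ctx['count'] = ctx.get('count', 0) + 1
        return ctx['count'] == n
    return condition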


[Figure 1: Triggerflow Architecture. The diagram shows the Trigger service worker exposing a RESTful API (createWorkflow, addEventSource, addTrigger, getContext) backed by persistent storage; event source subscribers (Kafka, Redis Streams, SQS) feeding the worker's event sink; the event-trigger processing loop (match the activation event with a trigger, evaluate its Condition over the Trigger Context, run the Action if true, update state/checkpoint, commit the event); and the interfaces built on top (Google Composer-like DAGs, Amazon Step Functions, PyWren imperative) running user-defined containers or default Python functions.]

Definition 3. Mapping workflows to triggers: A workflow can be mapped to a set of triggers (∆) which contains all state transitions (δ triggers) in the State Machine.

We will show in the next sections how different workflows (Amazon Step Functions) and Directed Acyclic Graphs (Apache Airflow) can be transformed into a set of triggers (∆), which is the information needed by the Trigger service to orchestrate them. For example, to transform a DAG into triggers, a trigger is added for every edge (workflow transition) of the graph. In a DAG, every node has its own unique ID, so the termination event from a task will contain its ID as subject, to fire the trigger that handles its termination and invokes the next step in the workflow.

Definition 4. Substitution principle: A Workflow must behave like an Action with respect to triggering (initialization) and finalization (Termination event). This homogeneous treatment of Workflows and Actions permits nested workflow composition and iteration.

Definition 5. Dynamic trigger interception: Any trigger can be intercepted dynamically and transparently to execute a desired action. Interception code is itself executed with triggers. It must be possible to intercept triggers by condition identifier or by trigger identifier. The condition identifier represents each existing condition in Triggerflow, for example a map condition that aggregates all events in a parallel invocation. The trigger identifier represents the unique ID that each trigger receives on creation.

We can introspect workflows, triggers, conditions, and actions using the Context. And we can intercept any trigger in the system in a transparent way using the Rich Trigger API. This opens the system to customizing code and data in a very granular way.

4 PROTOTYPE IMPLEMENTATION

We have developed two different implementations of Triggerflow: one over Knative, which follows a push-based mechanism to pass the events from the event source to the appropriate worker, and another one using Kubernetes Event-driven Autoscaling (KEDA), where the worker follows a pull-based mechanism to retrieve the events directly from the event source. We created the prototypes on top of the IBM Cloud infrastructure, leveraging the services in its catalog to deploy the different components of our architecture. These components are the following:

• A front-end RESTful API, where a user connects to interact with Triggerflow.
• A database, responsible for storing workflow information, such as triggers, context, etc.
• A controller, responsible for creating the workflow workers in Kubernetes.
• The workflow workers (TF-Workers hereafter), responsible for processing the events by checking the triggers' conditions and applying the actions.

In our implementation, each workflow has its own TF-Worker. In other words, the scalability of the system is provided at workflow level and not at TF-Worker level. In the validation (Sec. 6), we demonstrate how each TF-Worker provides a sufficient event ingestion rate to process large numbers of events per second.

In our system, the events are logically grouped in what we call workflows. The workflow abstraction is useful, for example, to differentiate and isolate the events from multiple workflows, allowing a common context to be shared among the (related) events.

4.1 Deployment on Knative

We mainly benefit from the Knative auto-scaler component in Knative Serving and the routing/filtering service in Knative Eventing.

Any serverless reactive architecture requires a managed multi-tenant component that is constantly running, monitoring event sources, and only launching actions in response to specific events. In this way, the tenant only pays for the execution of actions in response to events, and not for the constant monitoring of event services. For example, in OpenWhisk, when we create a trigger for a Function (like an Object Storage trigger), the system is in charge of monitoring the event source and only launching the function in response to events.


[Figure 2: Prototype deployment on KEDA. The diagram shows the RESTful API, the Triggerflow Controller, the KEDA Controller, the database, the per-workflow TF-Workers, and event sources (AWS SQS, GCP Pub/Sub, Apache Kafka, Redis Streams), connected by the numbered interactions (1)-(7) described in the text.]


In Knative Eventing, each tenant will have an Event Source that receives all events they are interested in (and have access to). We register a Knative Eventing trigger for each workflow in the system. The filtering capabilities of Knative Eventing's triggers permit routing the events of a workflow to the appropriate TF-Worker (Condition).

Each workflow event is tagged with a unique workflow identifier. We have created a customized functions runtime, which generates function termination events that include the selected workflow identifier and sends them to the desired message broker. If Triggerflow must receive events from services that do not include this workflow ID, a generic filtering service will match conditions to the incoming event (like "all events of this object storage bucket belong to this workflow"), tag the event, and route it to the tenant's Event Source.

As each event contains a unique identifier per workflow, it is easy for Knative Eventing to route the event to the selected TF-Worker. The TF-Worker is then launched by Knative Serving to process the event, but it will also scale to zero if no more events are produced in a period. This ensures the serverless scale-to-zero and pay-as-you-go qualities of our Triggerflow service. The TF-Worker accesses workflow state in the Context persistent store, which is also used for checkpointing and fault tolerance.

Regarding fault tolerance, Knative Eventing guarantees "at least once" message delivery, and automatic detection and restart of failed workers. If a TF-Worker fails, the persistent Context will restore the state in a consistent manner after the failure. The persistent Context is also used for stateful Conditions, like aggregation fork-join triggers that perform composite event detection and event counting.

4.2 Deployment on KEDA

One of the hardest problems in event-driven applications is dealing with reliability and scalability. Event systems may receive events as soon as they are created ("push"), or they may process them when they are ready ("pull" or "poll"), and in both cases they need to deal with capacity limits and error handling. Knative is very well suited for push-based scaling, as it can auto-scale based on incoming HTTP requests containing events. Kubernetes Event-driven Autoscaling (KEDA) is currently the best option for configurable, event-based, pull-based scaling.

We have also implemented Triggerflow entirely on top of Kubernetes using the KEDA project [16]. KEDA offers configurable pull-based event queue monitoring and reactive, scalable instantiation of Kubernetes containers. KEDA also offers configurable auto-scaling mechanisms to scale up, or down to zero.

In this case, the Triggerflow Controller integrates KEDA for the monitoring of Event Sources and for launching the appropriate TF-Workers, scaling them to zero when necessary. It is also possible to configure different parameters in KEDA, like the queue polling interval, the passivation interval, and the event-count scaling threshold. Different types of workflows may require different configuration parameters.

The advantage here is that, unlike in Knative Eventing, our TF-Workers connect directly to the message broker (Kafka, Redis Streams) using the native protocol of the broker. This permits handling more events per second in a single pod. As we demonstrate in the validation, this allows us to handle intensive workloads from scientific workflows coordinating parallel jobs over thousands of serverless functions.

Figure 2 shows a high-level perspective of our implementation using KEDA. In this deployment, Triggerflow works as follows: through the client, a user must first create an empty workflow in the Triggerflow registry, referencing the event source that this workflow will use. Then, the user can start adding triggers to it (1). All the information is persisted in the database (for example, Redis) (2). Immediately after creating the workflow, the front-end API communicates with the Triggerflow Controller (3), deployed as a single stateless pod container (service) in Kubernetes, to create the auto-scalable TF-Worker in KEDA (4). From this moment, KEDA is responsible for scaling the TF-Workers up and down (5). In KEDA, as stated above, the TF-Worker is responsible for communicating directly with the event source (6) to pull the incoming events. Finally, TF-Workers periodically interact with the database (7) to keep the local cache of available triggers updated, and to store the context (checkpointing) for fault-tolerance purposes.

Regarding fault tolerance, we also guarantee "at least once" message delivery and the restart of failed workers. In this case, the TF-Worker uses batching to commit groups of events in the Kafka Event Source once they have been correctly processed. If the TF-Worker fails, Kafka will simply resend the non-committed events to the TF-Worker, thus ensuring message delivery.
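A minimal sketch of this at-least-once consumption pattern with the kafka-python client follows; the topic and group names are illustrative, and process_event stands in for the worker's trigger-matching logic, which is not part of this sketch:

from kafka import KafkaConsumer

def process_event(event_bytes):
    # Placeholder for trigger matching and action execution (not shown).
    pass

consumer = KafkaConsumer('workflow-events',
                         bootstrap_servers='kafka:9092',
                         group_id='tf-worker-workflow-1',
                         enable_auto_commit=False)  # commit only after processing

while True:
    batch = consumer.poll(timeout_ms=1000)
    for tp, records in batch.items():
        for record in records:
            process_event(record.value)
    if batch:
        # Commit the whole batch; if the worker crashes before this point,
        # Kafka redelivers the uncommitted events (at-least-once delivery).
        consumer.commit()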

In our Redis implementation, we use Redis both as the event broker (Redis Streams) and as the persistent store (for the Context and events). Again, if the TF-Worker fails, all events remain in the event store, so it will continue with the non-processed events.

If the Knative Eventing and KEDA communities converge in the next months, we will be able to deploy Triggerflow directly on top of one unified event router technology. It is also possible that some building blocks of Triggerflow could be moved into the Knative Eventing kernel. For example, the Knative Eventing community is now considering more advanced filtering mechanisms (complex event processing). In that case, our TF-Worker could delegate many tasks to the underlying event router.


[Figure 3: Triggers that connect the tasks of an example DAG. Trigger 1 fires on the $init event and invokes Task1 asynchronously; on Task1's termination event (T1), the workflow branches into Task2 and a map of Task3 instances; Trigger 4 is a fan-in activated by events T2 and T3, with the condition of receiving one termination event from T2 and N termination events from T3 before ending the workflow.]

5 USE CASES

To demonstrate the flexibility that can be achieved using triggers with programmable conditions and actions, we have implemented three different workflow models that use Triggerflow as the underlying serverless and scalable workflow orchestrator.

5.1 Directed Acyclic Graphs

When a workflow is described as a Directed Acyclic Graph (DAG), the vertices of the graph represent the tasks of the workflow and the edges represent the dependencies between the tasks. The fact that a DAG has no cycles implies that there are no cyclic dependencies, which would be impossible to fulfill.

The orchestration platforms that rely on DAGs for their workflow description, such as Apache Airflow, handle the dependencies between tasks with their downstream relatives attribute, i.e., upon completion of a task execution, these orchestrators look up which tasks have to be executed after the completed task.

However, from a trigger-based orchestration perspective, it is more compelling to know which tasks have to be executed before a certain one, i.e., what the dependencies of every task are: their upstream relatives. With this information, we can register a trigger that activates a task's execution when all termination events from its upstream relatives are present.

To orchestrate a workflow defined as a DAG with triggers, we define a trigger for every edge of the DAG:

• As activation events of the trigger, we register the IDs of the tasks that have to be completed before the task that the edge points to (its upstream relatives).
• As condition, we count the number of events the trigger has to aggregate before executing the next task (i.e., a join of a map execution).
• As action, we register the actual task to be executed, ideally an asynchronous task such as an invocation of a serverless function.

To orchestrate a workflow in this way, it is assumed that after an asynchronous task completes, it produces a termination event containing its ID, activating the trigger that manages the task execution that follows it.
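As a sketch of this mapping, under the assumption of a hypothetical add_trigger API, an invoke_async callable, and a dag object exposing its tasks and their upstream relatives (none of these are the actual Triggerflow names), a trigger with a counting join condition is registered per task with incoming edges:

def deploy_dag(dag, add_trigger, invoke_async):
    # One trigger per task with upstream dependencies; the condition is a
    # join that counts the termination events of the upstream relatives.
    for task in dag.tasks:
        upstream = dag.upstream_of(task)   # task IDs that must finish first
        if not upstream:
            continue                       # root tasks are fired by $init
        expected = len(upstream)
        add_trigger(
            activation_events=upstream,    # upstream termination events
            condition=lambda evt, ctx, n=expected: (
                ctx.update(count=ctx.get('count', 0) + 1) or ctx['count'] == n),
            action=lambda evt, ctx, t=task: invoke_async(t))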

To handle a map-join trigger condition, before actually making the invocation requests, we use the context introspection feature from the activated trigger action to dynamically modify the condition of the trigger that will aggregate the events, setting the specific number of functions to be joined. This is needed when the iterator we map onto has a variable length that depends on the workflow execution.

Furthermore, this approach gives us the opportunity to handle errors during a workflow's runtime. Special triggers can be added that activate when a task fails, so that the trigger action can handle the task's error and halt the workflow execution until the error is solved. After error resolution (retry, skip, or try-catch logic), the workflow's execution can be resumed by activating the corresponding trigger that would have been executed in the first place, as if there had not been an error.

The DAGs interface implementation is inspired by Airflow's extensible DAG definition based on the Operator abstraction. According to Airflow's core ideas, an Operator describes the actual work logic that is carried out by a task. Airflow offers a wide variety of operators to work with out of the box, and it can be extended through the implementation of plugins. This approach is well suited to Triggerflow's architecture, thanks to its flexible programmatic trigger actions and conditions.

To illustrate this approach, Figure 3 depicts how a simple DAG with call async, maps, and branches is orchestrated using triggers.

5.2 State Machines and Nested Workflows

Amazon Step Functions bases its workflow description on a state machine defined by a declarative JSON object using the Amazon States Language DSL.

Similarly to Airflow's DAGs, a state machine definition in Amazon States Language (ASL) only specifies, for each state, the next state to execute. However, from a trigger perspective, we need to figure out which states must be executed before a given one, so that we can add a trigger that fires upon a state's completion and executes the next one. Therefore, there will be a trigger for every state transition that handles the state machine flow logic.

Nevertheless, a distinctive feature of ASL is that a state can be a sub-state machine. For instance, the map and parallel primitives map and branch to an entire state machine, rather than to a single task as in the DAG interface. To manage this feature, we need a special event that is produced when a state machine ends. For map and branch joins, we then join those sub-state machines instead of single tasks. To do so, we identify each sub-state machine with a unique tag in the scope of the execution. By doing so, we also comply with the substitution principle of the serverless trilemma.

To produce state machine termination events, we need to activate triggers from within a trigger action/condition function, since state machine joining is detected there. To do so, the worker's internal event sink buffer was made accessible through the context object, so that a trigger action/condition function can produce the events that activate the necessary subsequent triggers.

In an Amazon Step Functions execution, states can transfer their output to the input of the following state. To reproduce this functionality, we use the Context of the workflow, so that the output of a state can be saved in the trigger's context and accessed by other triggers.



If we consider a state machine to be itself a state, we can seamlessly compose ASL definitions inside other state machines with their triggers and connections. Amazon Step Functions, however, is more limited in terms of task extensibility, since we are given a closed set of state types. We explain here how these are processed with triggers:

• Task and Pass states: These state types carry out the actual workflow computational logic; the rest of the state types only manage the state machine flow. The Task state relies on the asynchronously invoked Lambda to signal the next trigger upon its termination, whereas the Pass state signals its termination event itself.

• Choice state: The Choice state type defines a set of possible outcomes that execute depending on some basic boolean logic that can compare numbers, timestamps, and strings. The trigger approach for this state is simple: for each possible outcome, apply the condition defined in the Choice state to the condition field of the trigger that handles that state's execution (see the sketch after this list).

• Parallel state: This state type defines a set of sub-state machines that run in parallel. In this case, we iterate over each sub-state machine and collect their IDs. Finally, we add a trigger that is activated whenever any of those sub-state machines ends, but whose action is only executed once it has been signaled by every sub-state machine.

• Map state: Similarly to the Parallel state type, this state defines a single sub-state machine that executes in parallel for every element of an iterable input data structure. Before executing the sub-state machines, we first add a trigger that, during its action execution, checks the length of the iterable object (which is the number of parallel state machines, unknown until execution) and registers it in the context of the trigger that handles the sub-state machines' termination, stating how many of them it should wait for.

• Wait state: The Wait state type waits for a certain number of seconds, or until a timestamp is reached, before continuing. It can be implemented by delegating the production of the trigger's activation event to an external time-based scheduler.

• Fail and Succeed states: The Fail and Succeed states stop the execution of the state machine and determine whether it executed successfully or failed. They can be implemented by assigning special actions to their triggers that end the execution of the workflow.
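As an illustration of the Choice state mapping, the following sketch compiles an ASL Choice rule into a trigger condition. Only a few ASL comparison operators are shown; the rule layout follows the Amazon States Language JSON format, while choice_condition itself is our illustrative helper:

import operator

ASL_OPS = {'NumericEquals': operator.eq,
           'NumericGreaterThan': operator.gt,
           'StringEquals': operator.eq}

def choice_condition(rule):
    # Example rule: {"Variable": "$.temperature",
    #                "NumericGreaterThan": 30, "Next": "CoolDown"}
    var = rule['Variable'].lstrip('$.')
    op_name = next(k for k in rule if k in ASL_OPS)
    compare, reference = ASL_OPS[op_name], rule[op_name]
    def condition(event, ctx):
        # The previous state's output travels in the event payload.
        return compare(event['data'][var], reference)
    return condition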

Figure 4 depicts how an ASF state machine is orchestrated with triggers.

5.3 Workflow as Code and Event Sourcing

The trigger service is also useful to reactively invoke an external scheduler upon state changes caused by some condition. For example, Workflow as Code systems like PyWren or Azure Durable Functions represent state transitions as asynchronous function calls (async/await) inside code written in Python or C#. Asynchronous invocations and futures in PyWren, or async/await calls in Azure Durable Functions, simplify code so developers can write synchronous-looking code that suspends and continues when events arrive.

[Figure 4: Triggers representation of an ASF state machine. The example starts with $init → RunFirst, fans out through a Map state (MapTask) and a Choice between Outcome1 and Outcome2, runs parallel sub-state machines (StateMachine0 to StateMachine3) with branch, fan-out/fan-in, and branch-join transitions, and terminates with $end.]


The model supported by Azure Durable Functions is reactive and event-based, and it relies on event sourcing to restore the function to its current state. We can use dynamic triggers to support external schedulers like Durable Functions that suspend their execution until the next event arrives. For example, let us look at this PyWren code:

import pywren_ibm_cloud as pywren

def my_function(x):
    return x + 7

pw = pywren.ibm_cf_executor()
future = pw.call_async(my_function, 3)
res = future.result()
futures = pw.map(my_function, range(res))
print(pywren.get_result(futures))

In this code, the functions call_async and map are used to invoke one or many functions. PyWren code like this normally executes in the client, for instance in a notebook, which is usually adequate for short-running workflows. But what if we want to execute a long-running workflow with PyWren in a reactive way? The solution is to run this PyWren code in Triggerflow, reacting to events. Here, prior to performing any invocation, PyWren can register the appropriate triggers, for example:

call_async(my_function, 3): inside this call we dynamically register a function termination trigger.

map(my_function, range(res)): inside this call we dynamically register an aggregate trigger for all functions in the map.
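A sketch of what such a map could do internally is shown below; add_trigger, invoke, and resume are assumed stand-ins for Triggerflow's trigger registration, the asynchronous function invoker, and the orchestrator re-activation hook, not the actual PyWren internals:

import uuid

def traced_map(func, iterable, add_trigger, invoke, resume):
    map_id = str(uuid.uuid4())   # unique subject for this map's events
    items = list(iterable)
    n = len(items)
    # Aggregate (join) trigger: fires once all n termination events arrive.
    add_trigger(
        activation_event=map_id,
        condition=lambda evt, ctx: (
            ctx.update(count=ctx.get('count', 0) + 1) or ctx['count'] == n),
        action=lambda evt, ctx: resume(ctx))
    for item in items:
        invoke(func, item, termination_subject=map_id)
    # The orchestrator can now suspend; the trigger will awaken it later.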


[Figure 5: Life cycle of an event sourcing-enabled workflow as code with IBM-PyWren as external scheduler. On each call_async()/map(), the code checks whether the functions were already invoked; if not, it adds a dynamic trigger (with a join condition for maps), performs the asynchronous invocation, and stops execution. When the termination events fire the trigger, its action replays the event-sourced code in a serverless function to continue the execution.]

After registering the trigger for each function, the function can be invoked and the orchestrator function can decide to suspend itself. It will later be activated when the trigger fires.

To ensure that this PyWren code can be restarted and continue from the last point, we use event sourcing. When the orchestrator code is launched, an event sourcing action re-runs the code, acquiring the results of functions from termination events. It is then able to continue from the last point.
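The following sketch captures the essence of that replay, under the assumption of an event_log dictionary of already-received termination events and an invoke callable; none of these names come from the actual implementation:

class SuspendOrchestrator(Exception):
    # Raised to suspend the workflow-as-code until a trigger awakens it.
    pass

def replayed_call(call_id, event_log, invoke, *args):
    # Replay: if the termination event was already sourced, return its
    # result immediately instead of re-invoking the function.
    if call_id in event_log:
        return event_log[call_id]['result']
    invoke(call_id, *args)               # first pass: really launch the function
    raise SuspendOrchestrator(call_id)   # then suspend until the event arrives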

In our system prototype, event sourcing is implemented in two different ways: with a native and with an external scheduler.

In the native scheduler, the orchestration code is executed inside a Triggerflow Action. Our Triggerflow system thus allows uploading the entire orchestration code as an action that interacts with triggers in the system. When Triggerflow detects events that match a trigger, it awakens the native action. This code then relies on event sourcing to catch up with the correct state before continuing the execution. In the native scheduler, the events can be retrieved efficiently from the context, accelerating the replay process. If no events are received in a period, the action is scaled to zero. This guarantees reactive execution of event-sourced code.

In the external scheduler, we use IBM-PyWren [12], where the orchestration code runs in an external system, like a Cloud Function. Then, thanks to our Triggerflow service, the function can stop its execution each time it invokes, for example, a map(), recovering its state (event sourcing) when it is awakened by our TF-Worker once all map() function activations have finished their execution. Moreover, using our event sourcing version of PyWren requires no changes to the user's code. This means that the code is completely portable between the local machine and the Cloud, so users can decide where to run their PyWren workflows without any modification. The life cycle of a workflow using an external scheduler can be seen in Figure 5.

6 VALIDATION

Our experimental testbed consists of 5 client machines with 4 CPUs and 16 GB RAM. On the server side, we deploy Triggerflow on a Kubernetes installation (v1.17.3) in a rack of 5 Dell PowerEdge R430 machines (2 Intel Xeon E5-2620 v4 CPUs @ 2.10 GHz, 8 cores/CPU, 32 logical processors) with 16 GB RAM. All of these machines, including the clients, are connected via a 10 GbE network and run Ubuntu Server 19.04. For the experiments we use Kafka 2.4.0 (Scala 2.13), Redis 5.0.7, KEDA 1.3.0, and Knative 0.12.0.

6.1 Load test

The load test objective is to demonstrate that our system can support high-volume event processing workloads in an efficient way. This is mandatory if we want to support the execution of high-performance scientific workflows.

For the first experiment, we measure how many events per second can be processed by a worker that filters events from a message broker like Kafka or Redis Streams. Tables 1 and 2 show the time to process 200K events in a container using different CPU resources (0.25, 0.5, 1, and 2 cores). Noop means that the worker does not perform any operation on the event. Join refers to aggregated filters that process joins for different map jobs with 2000 functions each. As we can see, the performance numbers show that the system can process thousands of events per second.

The second experiment consists of measuring the actual resource usage (CPU and memory) of a 1-core worker using Redis, by injecting different numbers of events per second (from 1K e/s to 12K e/s). Figure 6 shows that, with a constant memory footprint, the CPU resource can cope with an increasing number of events per second.

Table 1: Maximum number of processed events/second using Redis Streams

Cores  Events  Noop (s)  Noop (e/s)  Join (s)  Join (e/s)
0.25   200K    56.09     3565        59.83     3342
0.5    200K    28.03     7135        30.25     6611
1      200K    14.17     14114       14.56     13736
2      200K    11.48     17421       12.02     16638

Table 2: Maximum number of processed events/second using Kafka

Cores  Events  Noop (s)  Noop (e/s)  Join (s)  Join (e/s)
0.25   200K    43.89     4556        49.30     4056
0.5    200K    18.01     11104       23.99     8336
1      200K    9.34      21413       11.31     17683
2      200K    5.68      35211       7.56      26450

6.2 Auto-scaling

In this case, the objective is to demonstrate that TF-Workers can scale up and down based on the currently active workflows. We show here that our Triggerflow implementation on top of Kubernetes and KEDA can auto-scale on demand based on the number of events received by different workflows.

For this experiment, we use the entire testbed described above, and set the TF-Worker to use 0.5 CPUs and 256 MB of RAM. The test consists of 100 synthetic workflows that send events during some arbitrary seconds, pause the workflow for a while (simulating a long-running action), then resume sending events, and finally stop the workflow. The test works as follows: it first starts 50 workflows at a constant rate of 2 workflows per second; after 100 seconds it starts another 50 workflows at a rate of 3 workflows per second; and finally, after 70 seconds, it starts 15 more workflows, also at a rate of 3 workflows per second.


[Figure 6: Resource utilization depending on the incoming number of events/second (1-core worker with Redis). The plot shows CPU usage growing with the event rate (up to 12K events/s) while the memory footprint remains constant.]


The results are depicted in Figure 7. It shows how the TF-Workers scale up when the workflows start to send events, and scale down, even to zero (seconds 210 and 250), when the active workflows do not produce any events due to a long-running action. We can see how Triggerflow leverages the KEDA auto-scaler to activate or passivate workflows. Triggerflow automatically provides fault tolerance, event persistence, and context and state recovery each time a workflow is resumed.

6.3 Completion time and overhead

The validation in this section demonstrates that Triggerflow shows comparable overhead to public Cloud orchestration systems. We must be fair here: we are comparing an implementation of Triggerflow over dedicated and idle resources in our rack against public multi-tenant cloud services that may be used by thousands of users. The objective is not to claim that our system is better than them, but only to demonstrate that we can reach comparable overhead and performance. Furthermore, most cloud orchestration systems are not designed for highly concurrent and parallel jobs, which can limit their performance in those scenarios.

We evaluate the run-time overhead of Amazon's, IBM's, and Microsoft's orchestration services. We consider as overhead all the time spent outside the functions being composed, which is easy to measure in all platforms. For a sequential composition g of n functions, g = f_1 ∘ f_2 ∘ ... ∘ f_n, it is just:

overhead(g) = exec_time(g) − Σ_{i=1}^{n} exec_time(f_i).

It is important to note that our overhead definition includes the delays between function invocations, and the execution time of the orchestration function (for IBM Composer and ADF) or the delays between state transitions (for ASF). In the case of Triggerflow, the overhead depends on all the services in the architecture, i.e., the latency to access Kafka or Redis, the latency to invoke functions in IBM CF, etc.
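In code form, the metric reduces to a subtraction over measured timings; a trivial sketch:

    def overhead(total_time, function_times):
        # Wall-clock time of the whole composition minus the time spent
        # inside the composed functions themselves.
        return total_time - sum(function_times)

    # Example: a 5-function sequence of 3 s sleeps measured at 16.1 s
    # overall yields 16.1 - 15.0 = 1.1 s of orchestration overhead.
    print(overhead(16.1, [3.0] * 5))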

For all the tests, we use a single TF-Worker with 0.5 CPU cores and 64 MB of RAM, and we list only the results obtained when functions are in a warm state. This implies that we do not consider the cold start of spawning the function containers and VMs. Our focus is on measuring the overhead of running function compositions.

Figure 7: TF-Worker auto-scaling test using KEDA. [Plot: active worker pods and active workflows over 500 s.]

All the tests are repeated 10 times. The results displayed are the median of those 10 samples, with the standard deviation as error intervals. Measurements were taken in March 2020. For IBM Cloud Functions (IBM CF) and AWS Lambda executions, we use the Python 3.8 runtime. The exception is Azure, which does not currently support Python for ADF, only C#. The orchestration functions are implemented in the default language available in each platform: Node.js for IBM Composer and C# for ADF. ASF orchestration is specified in Amazon States Language (a JSON-based format) using the console editor.

For the sequential workflows, we quantify the overhead of sequential compositions of length n in {5, 10, 20, 40, 80}. For simplicity, all the functions in a sequence are the same: a function that sleeps for 3 s and then returns. For the parallel workflows, we define a workflow with a single parallel stage composed of n parallel instances of the same task, with n ranging from 5 to 320 and doubling each time. This task has a fixed duration of 20 seconds. Consequently, any execution of the experiment should ideally last 20 seconds, irrespective of n or the environment; in an ideal system with no overhead, the execution time of the n concurrent tasks would match that of a single task. We therefore compute the overhead of the orchestration system by subtracting the fixed duration of a single task, namely 20 seconds, from the total execution time.
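The benchmark tasks themselves are deliberately trivial; a sketch of the two function bodies (hypothetical names) used for the sequential and parallel experiments:

    import time

    def sequential_task(event):
        # One step of the n-function sequence: sleep 3 s, then return.
        time.sleep(3)
        return {'done': True}

    def parallel_task(event):
        # One of the n parallel instances: fixed 20 s duration, so a
        # zero-overhead run of the whole stage would also last 20 s.
        time.sleep(20)
        return {'done': True}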

6.3.1 DAGs and State Machines. For the DAG and State Machine use cases, we evaluate our DAG engine interface against IBM Composer, AWS Step Functions, AWS Step Functions Express, and Azure Durable Functions. It is important to state that these results are exactly the same we would obtain for the State Machine implementation over Triggerflow: sequences and parallel jobs in state machines and DAGs use the same triggers.

Sequential workflows. The resulting overheads are shown in Figure 8. In general, Triggerflow's overhead is higher than in the other orchestration systems. In this case, almost all of the overhead comes from the IBM Cloud Functions invocation latency through its public API, which is about 0.13 s; multiplied by 80 functions, it adds up to approximately 10 seconds of overhead. Amazon Step Functions, however, can use internal trigger protocols rather than the public API, which explains its lower invocation latency. In addition, using Express Workflows does not appear to provide a considerable speed improvement over regular ASF for sequential workloads, so they are probably not worth the extra cost for this kind of job.


Figure 8: DAG overhead comparison for sequences. [Plot: overhead (ms, log scale) vs. number of sequential functions (5-80) for Triggerflow (IBM CF + Redis), AWS Step Functions Express, AWS Step Functions, IBM Composer, and Azure Durable Functions.]

Figure 9: DAG overhead comparison for parallel workflows. [Plot: overhead (ms, log scale) vs. number of parallel functions (5-320) for the same systems.]

IBM Composer is the fastest for sequences, but with the drawback of its limit of only 50 transitions per composition. Finally, Azure Durable Functions presents competitive overheads, although it is quite unstable for short sequences. This is probably because ADF is designed and optimized for long-running sequential workloads.

Parallel workflows. For small compositions (5 to 10 functions), Figure 9 shows that Triggerflow and AWS Step Functions yield similar overheads, though both are outperformed by Express Workflows. Express Workflows has a wider error range, however, while regular Step Functions, Triggerflow, and IBM Composer are more stable. Express Workflows performs similarly regardless of the number of parallel functions until around 80, where its performance drops drastically and the overhead skyrockets for no apparent reason. From 80 functions upwards, Express Workflows and IBM Composer have similar overheads.

From 80 parallel functions upwards, we also see that Triggerflow has the lowest overhead, showing that event-driven function composition is indeed well suited for joining parallel functions.

Azure Durable Functions yields the worst results for small-sized function joins and is considerably unstable. However, it becomes equivalent to the other orchestration systems when joining a larger number of concurrent functions.

6.3.2 Workflow as Code and Event Sourcing. The objective here is to evaluate the Workflow as Code and event sourcing overheads in Triggerflow compared to Azure Durable Functions. We compare both sequential and parallel constructs.

For the event sourcing use case, we evaluate both the external scheduler (IBM-PyWren) and the native scheduler (Triggerflow action). On the one hand, we measure and compare the performance of our modified version of IBM-PyWren for Triggerflow against the original version of IBM-PyWren (external scheduler). Here we evaluate four different scenarios: (1) the original IBM-PyWren, which uses IBM Cloud Object Storage (COS) to store the events and results; (2) the modified version of IBM-PyWren for Triggerflow that stores the results in COS (the original IBM-PyWren behavior) but sends the termination events through a Redis Stream; (3) the Triggerflow IBM-PyWren that sends the events and results through a Kafka topic; and (4) the Triggerflow IBM-PyWren that sends the events and results through a Redis Stream.

On the other hand, we evaluate the native Triggerflow event sourcing scheduler, where the orchestration code is executed as part of the trigger action. In this case, we compare the results against the Azure Durable Functions (ADF) service, which is the only FaaS workflow orchestration service that employs an event sourcing technique to execute workflows.

Sequential workflows. Figure 10 shows how the overhead evolves as the length of the sequence increases. The overhead added by both the native and external schedulers grows linearly with the number of functions in the sequence. As we can see, the results are very stable, meaning that the behavior is implementation-related and not a resource problem.

For the external scheduler, we see comparable performance between the original IBM-PyWren and our modified version for Triggerflow. Overhead evolves similarly in all scenarios. PyWren has to serialize and upload the function and the data to COS before executing it, creating overhead common to all scenarios. The remaining overhead comes from where and how the events are retrieved to recover the state of the execution (event sourcing). This means that the event source service (either COS, Kafka, or Redis) has a direct impact on these results. For example, the main drawback of using COS in both the original (1) and Triggerflow (2) versions of IBM-PyWren is that they have to download the results from COS individually. This substantially increases the total time needed to execute a workflow, since at each step it has to retrieve all the previous events; for a workflow with n steps, IBM-PyWren performs a total of n(n + 1)/2 requests. In contrast, in the scenarios where IBM-PyWren does not use COS and stores the events in a Kafka topic (3) or a Redis Stream (4), it needs only one request to retrieve all the events at each step, and thus only n requests in total to complete the execution of a workflow. Comparing scenarios 2 and 3, we see better performance with a Redis Stream than with a Kafka topic. This is mainly caused by the Kafka client library, which adds a fixed overhead of 0.25 s each time the orchestration function is awakened and creates a consumer, so using a Kafka topic as the event store adds a fixed overhead of n * 0.25 seconds.
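The quadratic request count with COS follows directly from replaying the full history at every step; a small sketch makes the difference explicit (the two functions are illustrative, not PyWren code):

    def requests_with_cos(n):
        # At step k the orchestrator re-reads events 1..k, one object each.
        return sum(k for k in range(1, n + 1))   # n(n + 1) / 2

    def requests_with_stream(n):
        # A Kafka topic or Redis Stream returns the whole history in a
        # single ranged read, so each of the n steps costs one request.
        return n

    print(requests_with_cos(80))     # 3240 requests against COS
    print(requests_with_stream(80))  # 80 requests against Kafka/Redis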

For the Triggerflow native scheduler, it is important to note that the functions are already deployed in the cloud (in contrast with PyWren, which has to serialize and upload them each time). Moreover, the orchestration code is executed within the TF-Worker, which keeps all the events loaded in memory.


Figure 10: Event sourcing overhead comparison for sequences. PyWren vs. TF-PyWren on the left; Triggerflow vs. Azure Durable Functions on the right.

Figure 11: Event sourcing overhead comparison for parallel workflows. PyWren vs. TF-PyWren on the left; Triggerflow vs. Azure Durable Functions on the right.

Since the TF-Worker keeps all the events in memory, it does not need to retrieve them from the event source (Kafka, Redis) at each step. Compared to ADF, we obtain similar overhead. As stated in the previous section, the overhead comes mainly from the fact that invoking an action in the IBM CF service takes around 0.13 s. This means that, for a workflow of n steps, Triggerflow has a fixed overhead of n * 0.13 seconds when using IBM CF.
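To make the native scheduler concrete, the following is a minimal sketch of workflow-as-code execution via event sourcing with an in-memory event log, as described above; the names (`replay`, `Suspend`, `invoke`) are illustrative, not Triggerflow's actual API.

    class Suspend(Exception):
        # The workflow reached a step whose result is not yet recorded.
        pass

    def replay(workflow, event_log, invoke):
        # Re-run the orchestration code from the top, feeding recorded
        # results to completed steps and suspending at the first new one.
        step = 0

        def call(fn, *args):
            nonlocal step
            i, step = step, step + 1
            if i < len(event_log):
                return event_log[i]   # completed step: replay its result
            invoke(fn, *args)         # new step: fire the async invocation
            raise Suspend()

        try:
            return ('finished', workflow(call))
        except Suspend:
            return ('suspended', None)

    # Orchestration code is plain imperative Python; the TF-Worker calls
    # replay() again each time a termination event is appended to the log.
    def my_workflow(call):
        a = call('f1', 1)
        b = call('f2', a)
        return call('f3', b)

Because the log lives in the TF-Worker's memory, each replay costs no round trips to Kafka or Redis, unlike the external-scheduler scenarios above.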

Parallel workflows. For this experiment, we evaluate the same scenarios described above. The results are depicted in Figure 11. In this case, for the external scheduler, the original IBM-PyWren and the Triggerflow IBM-PyWren version also have similar overheads, with scenario 4 (which uses Redis as the event store) being the best approach. In the Kafka scenario (3), the 0.25 s overhead described above is negligible, since in this experiment the orchestration function is awakened only once. The main difference in performance between scenarios 1 and 2 is that the original IBM-PyWren runs all the time and polls for the results as they are produced. In contrast, in the Triggerflow version of IBM-PyWren that uses COS (2), the TF-Worker first waits for all activations to finish before awakening the orchestration function, which then has to retrieve all the events and results from COS. Finally, with the native scheduler, Triggerflow is faster than ADF for parallel workflows.

Figure 12: Scientific workflow execution progression over time, with an intended system failure at the 20th second. [Plot: events processed over 60 s for the Triggerflow DAG worker, its respawned worker, and IBM-PyWren.]

6.4 Scientific Workflows

We adapted a geospatial scientific workflow, originally implemented with IBM-PyWren, to work with our DAGs interface. The objective of the workflow is to compute the evapotranspiration, and thus the water consumption of the crops, of a partitioned geographical region using the Penman-Monteith equation. Due to the nature of the workflow, and despite the optimizations applied, the workflow's execution time is similar to that obtained with IBM-PyWren. The main difference lies in the workflow programming model: DAGs are more geared towards dissecting the workflow into independent tasks and their dependencies, while PyWren opts for a map-reduce model. An important point in favor of Triggerflow is the automatic and transparent fault tolerance provided by the event source and the persistent trigger storage. Figure 12 depicts the progression of a run of the scientific workflow, using Kafka as the event source and Redis for the trigger storage. To test the system's fault tolerance, we intentionally stopped the execution of the Triggerflow worker and of the IBM-PyWren execution at the 20th second of the workflow execution. Triggerflow rapidly recovers the trigger context from the database and the uncommitted events from the event source, and finishes the execution correctly. In contrast, IBM-PyWren stops and loses the state of the workflow, having to re-execute the entire workflow, wasting time and resources.
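As a rough illustration of the recovery path, assuming the kafka-python and redis clients and a hypothetical key layout (the real Triggerflow code differs):

    import json
    import redis
    from kafka import KafkaConsumer

    def respawn_worker(workflow_id):
        # 1. Recover the persisted trigger context (counters, DAG state)
        #    from the trigger storage.
        db = redis.Redis()
        context = {k.decode(): json.loads(v)
                   for k, v in db.hgetall(f'triggers:{workflow_id}').items()}

        # 2. Rejoin the workflow's topic with the same consumer group:
        #    Kafka redelivers every event not yet committed by the
        #    failed worker.
        consumer = KafkaConsumer(workflow_id,
                                 group_id=f'tf-worker-{workflow_id}',
                                 enable_auto_commit=False)
        for event in consumer:
            process_triggers(context, event)  # match conditions, run actions
            consumer.commit()                 # commit only after processing

    def process_triggers(context, event):
        # Placeholder for condition evaluation and action execution.
        pass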

To demonstrate Triggerflow's ability to introspect triggers through its Rich Trigger API, we have also implemented a service over the DAGs interface that automatically and transparently pre-warms function containers on IBM Cloud Functions to increase efficiency and overall parallelism, reduce the total execution time, and mitigate the effects of straggler functions in workflows that require high performance and throughput. Figure 13 shows its effects. Thanks to Triggerflow's interception mechanisms, we can also transparently apply other optimizations, such as data pre-fetching, to scientific workflows.
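A minimal sketch of such a pre-warming service, assuming a trigger action that runs shortly before a wide stage becomes runnable; `invoke_noop` and the context layout are illustrative assumptions, not the actual Rich Trigger API:

    from concurrent.futures import ThreadPoolExecutor

    def prewarm_action(context, event):
        # Introspect the DAG stored in the trigger context to estimate
        # how many containers the next stage will need.
        n = context['next_stage']['parallelism']

        # Fire n no-op invocations so the platform allocates warm
        # containers before the real tasks arrive, mitigating cold-start
        # stragglers.
        with ThreadPoolExecutor(max_workers=n) as pool:
            for _ in range(n):
                pool.submit(invoke_noop, context['next_stage']['function'])

    def invoke_noop(function_name):
        # Placeholder: call the function with a flag it recognizes as
        # 'warm up and return immediately'.
        pass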

7 CONCLUSIONS

We have presented in this paper Triggerflow: a novel building block for controlling the life cycle of Cloud applications. As more applications are moved to the Cloud, our system makes it possible to encode their execution flow as reactive triggers in an extensible way. The novelty of our approach relies on four key aspects: serverless design, extensibility, support for heterogeneous workflows, and performance for high-volume workloads.


Figure 13: (a) Parallelism and total execution time in a sequential workflow that increases its map size and execution time at every step. (b) Parallelism and total execution time of a single map task with high concurrency. [Plots: number of concurrent functions over time, warm vs. cold.]

Triggerflow can become an extensible control plane for deploying reactive applications in the Cloud. We implemented and validated on top of it different orchestration systems based on State Machines (ASF), Directed Acyclic Graphs (Airflow), and Workflow as Code (PyWren).

Finally, the major limitations of Triggerflow as an event-based orchestration system are debuggability and developer experience. As an open-source project, Triggerflow would clearly benefit from tools and user interfaces that simplify the overall observability and life-cycle support of the system.

ACKNOWLEDGMENTS

This work has been partially supported by the EU Horizon 2020 programme under grant agreement No 825184.

REFERENCES

[1] [n.d.]. Triggerflow. https://github.com/triggerflow/triggerflow.
[2] Ioana Baldini, Perry Cheng, Stephen J. Fink, Nick Mitchell, Vinod Muthusamy, Rodric Rabbah, Philippe Suter, and Olivier Tardieu. 2017. The Serverless Trilemma: Function Composition for Serverless Computing. In Proceedings of the 2017 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward! 2017). 89-103.
[3] Daniel Barcelona-Pons, Pedro García-López, Álvaro Ruiz, Amanda Gómez-Gómez, Gerard París, and Marc Sánchez-Artigas. 2019. FaaS Orchestration of Parallel Workloads. In Proceedings of the 5th International Workshop on Serverless Computing (WOSC '19). ACM, New York, NY, USA, 25-30. https://doi.org/10.1145/3366623.3368137
[4] Walter Binder, Ion Constantinescu, and Boi Faltings. 2006. Decentralized orchestration of composite web services. In 2006 IEEE International Conference on Web Services (ICWS '06). IEEE, 869-876.
[5] Benjamin Carver, Jingyuan Zhang, Ao Wang, and Yue Cheng. 2019. In Search of a Fast and Efficient Serverless DAG Engine. arXiv preprint arXiv:1910.05896 (2019).
[6] Wei Chen, Jun Wei, Guoquan Wu, and Xiaoqiang Qiao. 2008. Developing a concurrent service orchestration engine based on event-driven architecture. In OTM Confederated International Conferences "On the Move to Meaningful Internet Systems". Springer, 675-690.
[7] Dong Dai, Yong Chen, Dries Kimpe, and Rob Ross. 2018. Trigger-based Incremental Data Processing with Unified Sync and Async Model. IEEE Transactions on Cloud Computing (2018).
[8] Sadjad Fouladi, Francisco Romero, Dan Iter, Qian Li, Shuvo Chatterjee, Christos Kozyrakis, Matei Zaharia, and Keith Winstein. 2019. From Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Functional Containers. In 2019 USENIX Annual Technical Conference (USENIX ATC 19). USENIX Association, Renton, WA, 475-488. https://www.usenix.org/conference/atc19/presentation/fouladi
[9] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, fast and slow: Low-latency video processing using thousands of tiny threads. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17). 363-376.
[10] Andreas Geppert and Dimitrios Tombros. 1998. Event-based distributed workflow execution with EVE. In Middleware '98. Springer, 427-442.
[11] Sangjin Han and Sylvia Ratnasamy. 2013. Large-scale computation not at the cost of expressiveness. In 14th Workshop on Hot Topics in Operating Systems.
[12] IBM. [n.d.]. IBM PyWren. https://github.com/pywren/pywren-ibm-cloud.
[13] Abhinav Jangda, Donald Pinckney, Yuriy Brun, and Arjun Guha. 2019. Formal foundations of serverless computing. Proceedings of the ACM on Programming Languages 3, OOPSLA (2019), 1-26.
[14] Eric Jonas, Qifan Pu, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the cloud: Distributed computing for the 99%. In Proceedings of the 2017 Symposium on Cloud Computing. ACM, 445-451.
[15] Shannon Joyner, Michael MacCoss, Christina Delimitrou, and Hakim Weatherspoon. 2020. Ripple: A Practical Declarative Programming Framework for Serverless Compute. arXiv preprint arXiv:2001.00222 (2020).
[16] KEDA. [n.d.]. Kubernetes-based event-driven autoscaling. https://keda.sh/.
[17] Guoli Li and Hans-Arno Jacobsen. 2005. Composite subscriptions in content-based publish/subscribe systems. In ACM/IFIP/USENIX International Conference on Distributed Systems Platforms and Open Distributed Processing. Springer, 249-269.
[18] Pedro García-López, Marc Sánchez-Artigas, Gerard París, Daniel Barcelona-Pons, Álvaro Ruiz Ollobarren, and David Arroyo Pinto. 2018. Comparison of FaaS orchestration systems. In 2018 IEEE/ACM International Conference on Utility and Cloud Computing Companion (UCC Companion). IEEE, 148-153.
[19] Maciej Malawski, Adam Gajek, Adam Zima, Bartosz Balis, and Kamil Figiela. In press. Serverless execution of scientific workflows: Experiments with HyperFlow, AWS Lambda and Google Cloud Functions. Future Generation Computer Systems.
[20] Christopher Mitchell, Russell Power, and Jinyang Li. 2012. Oolong: asynchronous distributed applications made easy. In Proceedings of the Asia-Pacific Workshop on Systems. ACM, 11.
[21] Norman W. Paton and Oscar Díaz. 1999. Active database systems. ACM Computing Surveys (CSUR) 31, 1 (1999), 63-103.
[22] Pnina Soffer, Annika Hinze, Agnes Koschmider, Holger Ziekow, Claudio Di Ciccio, Boris Koldehofe, Oliver Kopp, Arno Jacobsen, Jan Sürmeli, and Wei Song. 2019. From event streams to process models and back: Challenges and opportunities. Information Systems 81 (2019), 181-200.
[23] Erwin van Eyk, Johannes Grohmann, Simon Eismann, André Bauer, Laurens Versluis, Lucian Toader, Norbert Schmitt, Nikolas Herbst, Cristina Abad, and Alexandru Iosup. 2019. The SPEC-RG Reference Architecture for FaaS: From Microservices and Containers to Serverless Platforms. IEEE Internet Computing (2019).
[24] Tim Wagner. [n.d.]. The Serverless Supercomputer. https://read.acloud.guru/https-medium-com-timawagner-the-serverless-supercomputer-555e93bbfa08.
