7 Use Cases in 7 Minutes Each : The Power of Workflows and Automation (SVC101) | AWS re:Invent 2013
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Amazon Simple Workflow
7 use cases : 7 minutes each
Sunjay Pandey, Amazon Simple Workflow
November 14, 2013
What is the Amazon SWF service?
Coordinate, track, and audit programs with multiple steps on multiple machines
Booking travel to Vegas…
A multi-step process:
Search for flights (Web, mobile, SMS …)
Select seats
Provide credit card
Confirm inventory
Confirm trip
Update inventory
Send confirmation
Update seat inventory (Batch)
Reliable, Scalable,
Auditable?
How to handle these challenges
Amazon SWF provides …
• Control engine: manage state, route tasks, distribute load
• APIs: HTTP, task processing instructions
• Programming framework: convenient libraries
Key Amazon SWF concepts
Activity workers
• Discrete steps in your application, processors
Workflow (deciders)
• Control logic for workflows: execution order, retry policies, timer logic, etc.
• Decision tasks
[Diagram: the travel-booking steps, annotated with the labels "Workflow (decider)" and "Activity"]
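A minimal Python sketch of the split between a decider and an activity worker, using the travel-booking steps as activities. It is an in-memory simulation under stated assumptions, not the SWF API: `STEPS`, `MAX_ATTEMPTS`, `decide`, `run_activity`, and `run_workflow` are hypothetical names, and the history is a plain list of tuples rather than SWF history events.

```python
# In-memory sketch only: no AWS calls; every name here is hypothetical.
STEPS = ["search_flights", "select_seats", "provide_credit_card",
         "confirm_inventory", "confirm_trip", "update_inventory",
         "send_confirmation"]
MAX_ATTEMPTS = 3  # assumed retry policy, chosen for illustration


def decide(history):
    """Decider: pure control logic. Reads the history, returns the next decision."""
    completed = [step for step, status in history if status == "completed"]
    if len(completed) == len(STEPS):
        return ("complete_workflow", None)
    next_step = STEPS[len(completed)]
    failures = sum(1 for step, status in history
                   if step == next_step and status == "failed")
    if failures >= MAX_ATTEMPTS:
        return ("fail_workflow", next_step)
    return ("schedule_activity", next_step)


def run_activity(step, flaky_once):
    """Activity worker: performs one discrete step and reports the outcome."""
    if step in flaky_once:
        flaky_once.discard(step)  # simulate a failure on the first attempt only
        return "failed"
    return "completed"


def run_workflow(flaky_once=()):
    """Alternate between deciding and executing until the decider stops."""
    history = []
    flaky = set(flaky_once)
    while True:
        decision, step = decide(history)
        if decision != "schedule_activity":
            return decision, history
        history.append((step, run_activity(step, flaky)))
```

Because `decide` works only from the recorded history, the retry after a failed step needs no state held on the worker; the history doubles as the audit trail.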
SWF customers are diverse
Private, venture-backed, publicly traded
Java, Ruby, C#, PHP, Python, on premises, fully AWS
Media, e-biz, transportation, entertainment, space exploration, life sciences, non-profit, gaming, …
Many use cases
What they have in common…
Multiple steps
Multiple machines
Higher potential for failures
Need resilience, scale, auditability
Customers
Thousands of clients
100s of millions of pieces of content
100s of millions of unique visitors per month
10s of billions of pageviews per month
Java-based workflows
Engineering offices: Austin, San Francisco, New York
Content moderation as a service
Content Moderation System (CMS1)
Humans read every piece of content, twice!
Content coded with tags: product flaw, shipping issues …
[Diagram labels: Customer review submission; a series of boxes each labeled "Step 1"; Review; Approve/Reject; Publish & customer]
Content Moderation System (CMS1)
Design challenges
• Latent
• Rigid state machine
• Multi-cluster environment
• Balancing, scheduling, and admin were hard
• Acquiring a new company to integrate didn’t help things
Content Moderation System (CMS2)
Moderation as a service
• One service, many types of content moderated
• Fully de-coupled from other systems
• Real-time, rich prioritization of tasks
• Control logic flexible, built with SWF Flow framework (AWS Java SDK)
• Execution managed by SWF
Moderation Workflow
50k-150k Parallel executions
3-5 activities
2-3 days duration
SWF, AWS architecture brought
Benefits
• Efficiency
Moderation time (submission – publish) down
35% since launching CMS2.
• Flexibility
Started with a few activities, was able to scale the logic as we grew
4 Million+ Lessons Taught in 12 years
In 2012 > 25% of all golf lessons in US
Next Competitor < 1%
Strategic Partners: Sports Illustrated, GolfSmith
PHP-based workflows
Austin San Francisco New York Engineering offices
User generated video processing
Video capture (phone)
Upload
Pre-process
Telestrate (coach)
Account updated
Post processing
Web lesson
A Golf Pro in Your Pocket ..
• Partnership with Sports Illustrated
• Mobile App = virtual golf lessons
My Pro to Go
Pre-process
Account updated
Telestrate (coach)
Before Simple Workflow & AWS
Design challenges
• Traditional datacenter
• Limitations: scale, dynamic growth, capital costs
• Single workflow & web server
• Patchwork of PHP scripts
• No logging, monitoring, or reporting
• Lesson processing pre-AWS (2011)
UBUNTU APACHE WEBSERVER
UBUNTU MYSQL SERVER
TWO UBUNTU FFMPEG TRANSCODERS
FTP
Environment challenged to produce 88,740 lesson videos per month.
Lesson storage on local system was 1TB. Regular purges of data had to be done due to space limitations.
FIBER CHANNEL SAN
CORPORATE EXCHANGE SERVER
• Lesson processing on AWS (2013)
LOAD BALANCED EC2 APACHE WEBSERVERS
SIMPLE WORKFLOW SERVICE: All uploads from centers open a SWF workflow, allowing all stages to be tracked and errors to be automatically corrected or staff to be alerted.
PAIR OF SWF DECIDERS
SIMPLE STORAGE SERVICE (S3): All lesson and marketing resources now stored on Amazon Simple Storage Service (S3). We now hold multiple versions of videos, store 100TB of data with no fear of any limits.
8-20 SWF workers are deployed based on load. We regularly produce 655,000 lesson videos per month.
Easy to scale and manage. Developers able to quickly and easily spin up clones of the production servers.
CLOUDWATCH: All systems monitored by custom alarms, immediately notifying techs before emergencies happen.
RELATIONAL DATABASE SERVICE: As a managed service, we don’t worry about patching or maintenance. Point in time recovery and the ability to quickly spawn clones aid development.
CLOUDFRONT CDN: Mass market videos now distributed via content delivery network.
SIMPLE EMAIL SERVICE: Automated emails go through Amazon Simple Email Service (SES), increasing deliverability and decreasing load on Exchange.
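The slide says 8-20 SWF workers are deployed based on load but does not say how the count is chosen. One possible rule, sketched in Python under assumptions: size the fleet in proportion to the backlog and clamp it to the stated range. `VIDEOS_PER_WORKER` and `desired_workers` are hypothetical, and the capacity figure is invented for illustration.

```python
import math

MIN_WORKERS, MAX_WORKERS = 8, 20   # bounds taken from the slide
VIDEOS_PER_WORKER = 50             # assumed capacity, for illustration only


def desired_workers(pending_videos):
    """Clamp a backlog-proportional worker count into the 8-20 range."""
    needed = math.ceil(pending_videos / VIDEOS_PER_WORKER)
    return max(MIN_WORKERS, min(MAX_WORKERS, needed))
```

For example, `desired_workers(600)` asks for 12 workers, while an empty backlog still keeps the minimum of 8 running.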
SWF, AWS architecture brought
Benefits
• Scalability
30,000+ videos per day (30, 50, 120, ++ fps)
• Flexibility
Decouple logic (decider), limited by creativity
Loyalty marketing as a service
Connect digital offers and in-store payments
Serves: merchants, developers, publishers
Payment network < > developer apps
Ruby-based workflows
Post transaction processing
Cardspring platform connects …
Connect online behavior with in-store purchases
• Developers create card-linked apps
• Merchants choose to run campaigns “in app”
• Customers pay through POS terminals
• API receives real-time webhooks about credit card transactions
• Post-transaction data updated for merchants, developers, customers
Before Amazon SWF
Previous starting point
• Worker machines process batch file jobs + async work outside of API requests
• Running custom job processor framework + many ad hoc scripts
• Triggered via SNS messages, cron, & jobs controller
Before architecture diagram
[Diagram labels: Jenkins; API; SNS; ELB; three workers]
After architecture diagram
[Diagram labels: Jenkins; API; SWF; three workers; three deciders]
After Amazon SWF
Simpler, resilient architecture
• All scripts/processors consolidated into 30 distributed workflows
• Implement priorities via different task lists
• Easy to scale
• Easy to restart
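A rough Python sketch of the "priorities via different task lists" idea: each priority level gets its own named task list, each worker polls exactly one list, and assigning more workers to a list gives it more throughput. This is an in-memory model, not the SWF API; the task list names, the `urgent` flag, and all function names are hypothetical.

```python
from collections import deque

# Hypothetical task list names; in this sketch each is an in-memory queue.
TASK_LISTS = {"realtime": deque(), "batch": deque()}


def task_list_for(job):
    """Route a job to a task list by its (assumed) 'urgent' flag."""
    return "realtime" if job.get("urgent") else "batch"


def submit(job):
    TASK_LISTS[task_list_for(job)].append(job)


def poll(task_list_name):
    """A worker polls exactly one task list; returns None when it is empty."""
    queue = TASK_LISTS[task_list_name]
    return queue.popleft() if queue else None


def drain(worker_assignments):
    """Each round, every worker polls its own list once. Returns processing order."""
    processed = []
    while any(TASK_LISTS[name] for name in worker_assignments):
        for name in worker_assignments:
            job = poll(name)
            if job is not None:
                processed.append(job["id"])
    return processed
```

With `drain(["realtime", "realtime", "batch"])`, two of the three workers serve the urgent list, so urgent jobs are never stuck behind a long batch backlog.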
Amazon SWF brought
Benefits
• Resilient
Restart in the case of failure
• Efficient consolidation, Easy Management
30 automated workflows do all the processing
CRM + ERP for non-profits, associations
Mid-to-large sized associations
100+ customers
C#/.NET based workflows
Photo Credit: flickr/hectorir
Complex order processing
CRM 360° Suite
Fully featured with asynchronous needs
• Full ERP: membership, financial, fundraising, e-marketing portal, API platform, etc.
• File processing/Amazon S3 uploads
• Amazon CloudSearch encoding
• Long-running processes like membership approvals
Before Amazon SWF
Windows, Quartz.net
• Unreliable, no replacement
• Poor performance
MSMQ (Microsoft Message Queue)
• Better throughput, but queues don’t know how to run tasks
• Impossible to scale remote queues
• Still had to use Quartz.net for recurring tasks
After Amazon SWF
Scalable, manageable architecture
• Replaced old system with ≈ 24 workflows
• Encouraged good design – broke up code into “activities”
• Replaced recurring activities with “always on” workflows that woke up on scheduled timers
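The "always on" pattern in the last bullet can be sketched as a decider that alternates between sleeping on a timer and running the recurring job. The Python below is a simulation under assumptions: the event and decision names follow SWF's event and decision types, but the dictionaries are simplified and do not match the real API payloads, and `INTERVAL_SECONDS`, `recurring_job`, `decide`, and `simulate` are hypothetical.

```python
INTERVAL_SECONDS = 3600  # assumed period for the recurring job


def decide(last_event_type):
    """Return the next decision for an always-on workflow, given the latest event."""
    if last_event_type in ("WorkflowExecutionStarted", "ActivityTaskCompleted"):
        # Go back to sleep until the next scheduled run.
        return {"decisionType": "StartTimer", "seconds": INTERVAL_SECONDS}
    if last_event_type == "TimerFired":
        # Woken up by the timer: run the recurring activity.
        return {"decisionType": "ScheduleActivityTask", "activity": "recurring_job"}
    raise ValueError(f"unhandled event type: {last_event_type}")


def simulate(cycles):
    """Drive the decider through a number of timer cycles; return decision types."""
    trace = []
    event = "WorkflowExecutionStarted"
    for _ in range(cycles * 2):
        decision = decide(event)
        trace.append(decision["decisionType"])
        # In this simulation every timer fires and every activity succeeds.
        event = ("TimerFired" if decision["decisionType"] == "StartTimer"
                 else "ActivityTaskCompleted")
    return trace
```

The sketch ignores failures and the growth of the execution history, both of which a long-lived workflow would need to handle.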
Post SWF architecture
SWF benefits
• Stability
System never crashes
• Replay, retry-ability
30 automated workflows do all the processing
Customers: Reebok, Jansport, Nine West,
Solutions: Shoppable images, Dynamic
imaging, Visual product customization
Java-based workflows
Chicago San Francisco New York
High-performance image processing
Before Amazon SWF
Scalability challenges
• 1 product = 1000s of images
• Generate images by rendering 3D models
• Poor performance, scalability
After architecture
• SWF brings parallelism to both the 3D rendering + post-processing
• Image processing time dropped 30 – 80%
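A small Python sketch of the fan-out/fan-in shape described above, with both stages spread across a pool of workers. Threads stand in for the separate processing machines, and `render`, `post_process`, and `process_product` are hypothetical stand-ins, not Fluid's actual pipeline.

```python
from concurrent.futures import ThreadPoolExecutor


def render(view):
    """Stand-in for rendering one image from a 3D model (hypothetical)."""
    return f"render:{view}"


def post_process(image):
    """Stand-in for post-processing one rendered image (hypothetical)."""
    return image.replace("render:", "final:")


def process_product(views, max_workers=8):
    """Fan out both stages across workers, then gather results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        rendered = list(pool.map(render, views))
        return list(pool.map(post_process, rendered))
```

Since one product means thousands of images, the wall-clock time of each stage in this shape depends on the number of workers rather than on the number of images alone.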
SWF benefits
• Performance
Image processing times dropped 30 – 80%
• Costs optimized
Processing machines auto scaled as necessary
ERP for technology service providers: cloud
and on-premise. Datacenters on 4 continents
5,500+ Customers with 80,000+ end users
Trouble ticketing, time/billing, calendar/syncing
for scheduling, deployments
Back-office Automation: IT Services
Before SWF – customer-specific
[Diagram labels: Corporate data center; Processing (A-M); Calendaring (A-M); Processing (N-Z); Calendaring (N-Z); Ticketing; Workflow System; Installation System]
• Services were manually partitioned amongst customers
• Server failure resulted in customer downtime
• Server utilization was uneven – activity varied by customer
• Adding capacity was a complex process
• Logs were stored on each machine
After SWF
• Identical worker servers
• Centralized logging to Amazon S3
[Diagram labels: Corporate datacenter; Email Processing; Calendaring; Installation System; Workflow System; "Services are Workers"; Amazon Web Services; Flow Framework; Amazon SWF; Centralized Logging to Amazon S3]
• Java and Flow Framework used for deciders
• .NET services act as SWF workers
• Geo-diverse workflow coordination
100K+ workflows daily with 1.5 Million+ activities, decisions
SWF benefits
• Costs/utilization
Consistent architecture/servers for all customers, server utilization more balanced
• Simplified management, reliability
Single control panel with visibility into all services. Split complex workflows across many identical worker servers without having to maintain local state.
Multi-step, long-running user enrollment
Ridesharing/virtual carpool service
80% of users are repeat customers
Average ride < $10
100,000+ users
Python-based workflows
Enrolling private citizens
Multi-step workflows
• Headshot > Car photo > Photo of license > Consent to background check > Online training, Mentoring session (driver ridealong)
• Processes long running, can “fail” at any step – queues, scheduled processes would have been brittle
• Built from the beginning on Amazon SWF
SWF Architecture
Workflow Workers (Decider)
Amazon SWF
Activity Workers
Many, many thousands of applications per day with 1 decider and 1 activity worker
Driver enrollment process, status checking
Application updates; reminders via email and SMS ask drivers to move forward in their applications
SWF benefits
• Auditability
State of user’s application consistent
Reminder emails not duplicated
• Flexibility
Application steps can be non-linear
Users complete different steps, different times
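A Python sketch of how non-linear steps and non-duplicated reminders could fall out of keeping all state in one place. It is an illustration under assumptions, not Lyft's implementation: the step names are adapted from the enrollment steps listed earlier, and `remaining_steps`, `next_action`, and `reminders_sent` are hypothetical.

```python
REQUIRED_STEPS = ("headshot", "car_photo", "license_photo",
                  "background_check_consent", "online_training",
                  "mentoring_session")


def remaining_steps(completed):
    """Steps still outstanding, regardless of the order the others were done in."""
    done = set(completed)
    return [step for step in REQUIRED_STEPS if step not in done]


def next_action(completed, reminders_sent):
    """Decide what to do for one applicant from recorded state alone.

    `reminders_sent` is the set of steps a reminder has already gone out for,
    so the same reminder is never issued twice.
    """
    remaining = remaining_steps(completed)
    if not remaining:
        return ("approve", None)
    for step in remaining:
        if step not in reminders_sent:
            return ("send_reminder", step)
    return ("wait", None)
```

An applicant who finishes online training before uploading a headshot is handled the same way as one who goes in order, because only the set of completed steps matters.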
7 of many use cases (Amazon SWF)
Content moderation: BazaarVoice
Video processing: GolfTEC
Post transaction processing: Cardspring
Order processing: Membersuite
Image processing: Fluid
Business/IT automation: Connectwise
Partner enrollment: Lyft
Amazon SWF benefits
• Resilience, reliability
• Flexibility
• Scalability
• Performance
• Simplification
• Costs, efficiency