20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each...

52
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Solutions Architect, Amazon Web Services Japan July 2016, LinuxCon+ContainerCon Japan Using Mesos Schedulers with Amazon EC2 Container Service Ryosuke Iwanaga

Transcript of 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each...

Page 1: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Solutions Architect, Amazon Web Services Japan

July 2016, LinuxCon+ContainerCon Japan

Using Mesos Schedulerswith Amazon EC2 Container ServiceRyosuke Iwanaga

Page 2: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Agenda

• What's the scheduler for Amazon ECS• Introduction of Apache Mesos and its schedulers• How the ECS Scheduler Driver works• Demo• Case studies for custom schedulers

Page 3: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Amazon EC2 Container Service (ECS)

Container Managementat Any Scale

Flexible ContainerPlacement

Integrationwith the AWS Platform

Page 4: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Components of Amazon ECS

TaskActual containers running on Instances

Task DefinitionDefinition of containers and environment for task

ClusterFleet of EC2 instances on which tasks run

ManagerManage cluster resource and state of tasks

SchedulerPlace tasks considering cluster status

AgentCoordinate EC2 instances and Manager

Page 5: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

How Amazon ECS runs Task Scheduler

ManagerCluster

Task Definition

Task

Agent

Page 6: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Schedulers

Page 7: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

What is a Scheduler?

• Determine desired state

• Check against current state

• Perform action

Page 8: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Optimistic shared state model

Page 9: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Source: eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf

Page 10: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Amazon ECS: Scheduling• Each scheduler periodically queries the current cluster state

Copy of cluster stateScheduler A Scheduler B

Cluster

Page 11: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Amazon ECS: Scheduling• Each scheduler allocates tasks on the cluster• Each scheduler updates the current cluster state

Run a taskRun a task

Page 12: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Amazon ECS: Scheduling• If the resource is already claimed, the request will be rejected

Run a task on the same resource=> Transactional

Page 13: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Amazon ECS: Scheduling

• Shared state optimistic scheduling• All schedulers can see the current cluster state at all times

Page 14: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Key points for ECS Schedulers and Manager

Good isolation between Schedulers and ManagerSingle resource manager – Multiple schedulersScalable & faster

ECS built-in SchedulersRun Task: Batch jobService: Long running application

Page 15: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Sometime you need Custom schedulers

Different resource, for specific task

Priority Queue

Want to have custom auto scaling

=> ECS allows you to add any custom schedulers in parallel

Page 16: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Apache Mesos

http://mesos.apache.org/

Page 17: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

What's Apache Mesos?

Started at UC Berkeley

High utilization for diverse frameworks; Hadoop, MPI, etc.

Resource Offering

Page 18: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Mesos Architecture

http://mesos.apache.org/documentation/latest/architecture/

Page 19: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Resource Offering

http://mesos.apache.org/documentation/latest/architecture/

Page 20: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Mesos Schedulers

Marathon: for Long running application

Chronos: for Batch job with dependency

Apache Aurora: for Both long running and batch, as well as quota, DSL, etc.

Page 21: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

How Marathon works

MasterSlave

Task

Marathon

Mesos Scheduler Driver

Page 22: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

ECS Scheduler Driver

https://github.com/awslabs/ecs-mesos-scheduler-driver

Page 23: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

ECS Scheduler Driver

Driver which can talk to ECS from any Mesos Scheduler

Written in Java

Without Mesos cluster

Control ECS cluster by Marathon, Chronos, etc.

Page 24: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

ECS Scheduler Driver with Marathon

ManagerCluster

Task

Agent

Marathon

ECS Scheduler Driver

AWS SDK for Java

Amazon ECS

Page 25: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Inside of ECS Scheduler Driver

To create resource offers:List / DescribeContainerInstances

To launch/kill tasks:Start / StopTask

To poll task status:List / DescribeTasks

AWS SDK for Java

Page 26: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

How to use it

Clone the repository and build it to JAR file

Modify the scheduler to use ECSSchedulerDriver class instead of MesosSchedulerDriver

Or, override MesosSchedulerDriver class at runtime to ECSSchedulerDriver by bytecode enhancement

Page 27: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Use case

Same functionality between AWS and On-premise

Easy to get leverage from Third party schedulerse.g. Marathon, Chronos, Aurora, etc.

Page 28: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Demo

Page 29: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Demo: ECS Scheduler Driver with Marathon

ManagerCluster

Task

Agent

Marathon

ECS Scheduler Driver

Amazon ECS

Marathon LB

Page 30: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Write your scheduler for ECS

https://aws.amazon.com/blogs/compute/how-to-create-a-custom-scheduler-for-amazon-ecs/

Page 31: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Example: Start Task on the least task running instance

def getInstanceArns(clusterName):

containerInstancesArns = []

# Get instances in the cluster

response = ecs.list_container_instances(cluster=clusterName)

containerInstancesArns.extend(response['containerInstanceArns'])

# If there are more instances, keep retrieving them

while response.get('nextToken', None) is not None:

response = ecs.list_container_instances(

cluster=clusterName,

nextToken=response['nextToken']

)

containerInstancesArns.extend(response['containerInstanceArns'])

return containerInstancesArns

Page 32: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Example: Start Task on the least task running instance

def startTask(clusterName, taskDefinition):

startOn = []

# Describe all instances in the ECS cluster

containerInstancesArns = getInstanceArns(clusterName)

response = ecs.describe_container_instances(

cluster=clusterName,

containerInstances=containerInstancesArns

)

containerInstances = response['containerInstances']

Page 33: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Example: Start Task on the least task running instance

# Sort instances by number of running tasks

sortedContainerInstances = sorted(

containerInstances,

key=lambda containerInstances: containerInstances['runningTasksCount']

)

# Get the instance with the least number of tasks

startOn.append(sortedContainerInstances[0]['containerInstanceArn'])

# Start a new task

response = ecs.start_task(

cluster=clusterName,

taskDefinition=taskDefinition,

containerInstances=startOn,

startedBy='LeastTasksScheduler'

)

Page 34: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Case studiesof custom schedulers

Page 35: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Amazon Personalization

Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEhttps://blogs.aws.amazon.com/bigdata/post/TxGEL8IJ0CAXTK/Generating-Recommendations-at-Amazon-Scale-with-Apache-Spark-and-Amazon-DSSTNE

Page 36: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Recommendation for products at Amazon scale

CPU for data analytics, preparation=> Apache Spark on Amazon EMR

GPU for train and predict neural network model=> Amazon DSSTNE on Amazon ECS

Notebook for common entry point=> Zeppelin on Amazon EMR

Page 37: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Architecture overview

Page 38: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Using ECS Scheduler Driver

To support task queue for GPU jobs

Current ECS does not have the native queue system

Leverage Mesos schedulers to get this easily

Page 39: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Coursera

http://www.slideshare.net/AmazonWebServices/cmp406-amazon-ecs-at-coursera-a-generalpurpose-microservicehttps://www.youtube.com/watch?v=a45J6xAGUvA

Page 40: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster
Page 41: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

What We Built: Iguazú

Marissa Strniste (https://www.flickr.com/photos/mstrniste/5999464924) CC-BY-2.0

• Batch Job Scheduler for Amazon ECS• Immediately• Deferred (run once at X time)• Scheduled recurring (cron-like)

• Programmatically accessible internally viaour standard APIs and clients

• Named for Iguazú falls• World’s largest waterfall by volume• We hope Iguazú handles a similar volume of jobs

Page 42: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Iguazú Frontend

Iguazú Scheduler

Iguazú Backend

Iguazú: Architecture

CassandraServices Services

Iguazú Admin

ECS Workers

SQS

ECS API

Devs

Users

Page 43: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Iguazú: Developer / Ops User Interface

Page 44: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Programming Assignments at Coursera

Page 45: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

What We Built: GrID

Patrick Hoesly (https://www.flickr.com/photos/zooboing/5665221326/) CC-BY-2.0

• Service + architecture for gradingprogramming assignments

• Builds on Amazon ECS and Iguazú

• Named for Tron’s “digital frontier”• Backronym: Grading Inside Docker

Page 46: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

High-level GrID Architecture

Learners

GrID

Iguazú

S3 Bucket

ECS API

Grading Machines

VPC Firewalls

Production Acct GrID Grading Account

Page 47: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

One more thing…

Page 48: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Collect Container Logs via awslogs

Amazon CloudWatch Logs

Amazon CloudWatch Logs

Amazon CloudWatch Logs

Amazon CloudWatch Logs

Amazon S3

Amazon Kinesis

AWS Lambda

Amazon Elasticsearch Service

Amazon ECS Store

Stream

Process

Search

Page 49: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Collect Container Logs via awslogs

• Supported Docker Logging Drivers• json-file, syslog, journald, gelf, fluentd, splunk, awslogs• stdout/stderr outputs are automatically sent by the driver

• awslogs sends logs to Amazon CloudWatch Logs• Log group for a specific service• Log stream for each container

• Amazon CloudWatch Logs => Other services• Search, Filter, Export to Amazon S3, Send to Amazon Kinesis, AWS

Lambda, Amazon Elasticsearch Service

Page 50: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Search logs with Kibana on Amazon ES

Page 51: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Conclusion

Amazon ECS is managed cluster management service

ECS exposes all cluster state so that any type of schedulers can easily control it in parallel

via List / Describe * APIs

ECS Scheduler Driver is one of reference implementations

Page 52: 20160715 Using Mesos Schedulers with Amazon …...2016/07/14  · Amazon ECS: Scheduling • Each scheduler allocates tasks on the cluster • Each scheduler updates the current cluster

Thank you!