AWS re:Invent 2016: Best practices for running enterprise workloads on AWS (ENT213)

Post on 16-Apr-2017


© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Amit Sharma, AWS

Anthony Nicholls, Jharrod LaFon, Craig Bruce, OpenEye Scientific Software, Inc.

November 30, 2016

ENT213

Best Practices for Running Enterprise Workloads on AWS

Agenda

1. AWS and Life Sciences

2. OpenEye Scientific

   1. Use case

   2. Demo

   3. Learning

3. Q&A

Why AWS?

Largest pace of innovation

Partner and customer ecosystem

Longest industry experience

Discovery

Manufacturing and Distribution

Development

Marketing and Sales

Computational chemistry

Collaboration

Genomics

Pharmacovigilance

Pharmacokinetics

Clinical Trials Management

Supplier collaboration

Quality management

Processing analytics

Digital marketing

Online storefronts

Content distribution

Security is foundational at AWS

Architected to be one of the most flexible and secure cloud computing environments available today

You retain ownership of your IP and content – AWS does not have access

You control region(s) where your data is stored

You can build end-to-end compliance, including HIPAA compliance

AWS data centers always “on”; robust connectivity and bandwidth

Ongoing audit and assurance program

Industry certifications

Security: A shared responsibility

AWS secures the infrastructure so you can secure your data

The AWS Cloud Improves Your Compliance Posture

Controllable Infrastructure, Repeatable Testing, Automatic Traceability

Creating the Nimble Life Sciences Enterprise

Bring agility to your business

Add efficiency throughout the value chain

Analytics to tackle any business problem

Collaborate globally throughout your organization

What to Expect from the Session

1) An appreciation of the computational problems faced by the pharmaceutical industry

2) To learn how Orion, our cloud-native platform, uses AWS to address these problems

3) To see how generalizations of our approach can apply to your organization

Outline

1) The Modern Pharmaceutical Industry

2) OpenEye: Why We Think We (and AWS) Can Help

3) Problem Solving with AWS and Orion

4) An Orion Demo

5) Orion, Under the Hood

6) Lessons Learned & Generalizations


The Modern Pharmaceutical Industry

The Inverse Moore’s Law of Pharmaceuticals

“Classic” Law of Diminishing Returns

Texas Oil Production

New Technology

Where is the new technology for Pharma?


OpenEye: Why We Think We (and AWS) Can Help

OpenEye Scientific Software: Inc. 1997, Santa Fe

2016: 50 employees in Santa Fe, Boston, Cologne, Tokyo

• First-in-class software for Molecular Modeling / CADD:

• Large Scale Virtual Screening, Cheminformatics, Software Toolkits

• Trusted brand for Science & Computer Science

• Deployed to: 19/20 Top Big Pharma

• Deep knowledge of institutional problems

Computational Problems

1) Multidisciplinary “silos”

- Data, knowledge, methods

2) Computation and data scaling

3) Retaining data context

4) Security vs. need for collaborations vs.

10^6, 10^9, 10^12


Problem Solving with AWS and Orion

Problems addressed by Orion and AWS

1) Easy authoring, publishing and versioning solutions

2) Automation & scaling

3) Data sharing via “change & notify” paradigm

4) Unlimited data & contexts

5) Collaborative workspaces

6) AWS Security


Democratization of computation


Jharrod LaFon, Chief Cloud Engineer

An Orion Demo

@jharrodlafon

Overview

1) Example problem solved with Orion & AWS

2) Capture of science and information necessary for collaborative, automated, scalable, scientific workflows


Problem: Finding new molecules (drugs)

• Start with a known ‘active’ molecule (ligand): one with desirable biological properties

• Find biologically similar molecules from a database, but with different chemical structure

• Known as ‘Ligand Based Lead Discovery’


Problem: Finding new molecules (drugs)

[Figure: patented molecule, 3D overlay, patentable molecule]

DEMO

Example Virtual Screening Workflow Pfizer Intends to Deploy in Orion

• Currently possible to do all of this with in-house resources. Very manual, very time consuming (both setup and calculation)

• Goal is to reduce this to setup of under 1 minute and run time of under 15 minutes…as easy as a substructure search

Pfizer Proprietary – Not for Distribution

Ligand Based Lead Discovery Workflow

[Figure: workflow diagram. Stages: Query; eMolecules (5M compounds); Similarity search (2D fingerprint-based); FastROCS (GPU shape-based); ROCS (shape & color); EON (shape & electrostatics); internal docking algorithm (docking); final hitlist. Resource labels on the stages: AWS Storage, AWS GPUs, AWS CPUs, AWS CPUs and Storage.]
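One stage of the workflow above is a 2D fingerprint-based similarity search. The slides do not say which similarity measure is used; the sketch below assumes the Tanimoto coefficient, a common choice for bit fingerprints, and models each fingerprint as a set of "on" bit positions. The names `tanimoto`, `similarity_search`, and the toy library are hypothetical and are not part of any OpenEye toolkit.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modelled as the set of bit positions that are switched on.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data, invented for illustration only.
query = frozenset({1, 2, 3, 4})
library = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 3, 9}),
    "mol_c": frozenset({7, 8}),
}
hits = similarity_search(query, library, threshold=0.5)
```

In a funnel like the one on the slide, a cheap filter of this kind would typically run first so that the more expensive shape and electrostatics stages only see the surviving candidates.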

Orion Workflows with Floe on AWS

• Composed of small, reusable components (Cubes)

• Cubes are defined by a few lines of Python

• Runs on automated Docker container infrastructure in AWS

• Built-in parallelism, scales to 1000s of CPUs
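The idea of composing a workflow from small, reusable components and running records through it in parallel can be sketched in plain Python. This is not the Floe API, which the slides do not show; `Cube`, `add_length`, `flag_small`, and `run_workflow` are hypothetical names, and a local thread pool stands in for the container infrastructure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

# Hypothetical stand-in for a "cube": a small function that transforms
# one record and returns the updated record.
Cube = Callable[[dict], dict]


def add_length(record: dict) -> dict:
    """Annotate a record with the length of its SMILES string."""
    return {**record, "length": len(record["smiles"])}


def flag_small(record: dict) -> dict:
    """Mark records whose SMILES string is shorter than 10 characters."""
    return {**record, "small": record["length"] < 10}


def run_workflow(
    cubes: List[Cube], records: Iterable[dict], workers: int = 4
) -> List[dict]:
    """Apply the cubes in order to each record, records in parallel."""

    def process(record: dict) -> dict:
        for cube in cubes:
            record = cube(record)
        return record

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even though work overlaps.
        return list(pool.map(process, records))


molecules = [{"smiles": "CCO"}, {"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"}]
results = run_workflow([add_length, flag_small], molecules)
```

Because each cube only sees one record at a time, the same components can be reordered or reused in other workflows, and the degree of parallelism becomes a scheduling decision rather than something the cube author has to code.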

Workflow Lifecycle

• An expert designs and builds the workflow

• Once ready, the workflow is published so that others may use it

• Built-in scheduler automates & scales all necessary infrastructure

DEMO

Merck: Compute and Storage to Support Protein Design

Goal: Protein interaction to design or optimize

Enumeration: Large virtual library of synthetically accessible amino acids (AWS Storage, AWS Compute)

• Tasks continuously mining from literature and patents

• Tasks continuously mining from internal data sources

• AWS Push Alerts for new reagents of interest

Filter and Rank: Mixed QSAR + empirical interaction filtering and ranking (AWS Compute)

Evaluate: Molecular dynamics stability evaluation; free energy perturbation energetic evaluation (AWS Compute, AWS Storage++)

Protein design Floe using a mix of OpenEye, third party and in house methods

Orion Analysis Tools

Analysis and decision making done in the context of aggregated project knowledge

Aggregate


Craig Bruce, Head of DevOps

Orion, Under the Hood

@craigbruce

Architecture Diagram

VPC

Direct Connect

Management tools

CloudFormation

• Blueprint for all your AWS resources

• Reproducible

• Updatable

• Sharable
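To make the "blueprint" idea concrete, here is a minimal sketch that builds a CloudFormation template as a Python dictionary and serializes it to JSON. The stack contents (a single versioned S3 bucket) and the names `build_template` and `DataBucket` are assumptions chosen for illustration; the slides do not show Orion's actual templates.

```python
import json


def build_template(bucket_logical_id: str = "DataBucket") -> dict:
    """Return a minimal CloudFormation template describing one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative stack with one versioned S3 bucket",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
        "Outputs": {
            # Ref on a bucket resource resolves to the bucket name.
            "BucketName": {"Value": {"Ref": bucket_logical_id}},
        },
    }


# sort_keys gives a stable text form, which keeps diffs in version
# control readable when the template changes.
template_json = json.dumps(build_template(), indent=2, sort_keys=True)
```

Keeping the template as text under version control is what makes a stack reproducible and sharable: the same file can be used to create the stack in another account or to review a proposed update before it is applied.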

Compute & Application Services

Compute

EC2

• Multiple instance types

• Including GPUs

• Spot

• Auto Scaling groups

• CloudWatch

• Metrics with alarms
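As one possible illustration of pairing CloudWatch alarms with an Auto Scaling group, the sketch below assembles the parameters for a high-CPU alarm. The thresholds, the group name, and the placeholder policy ARN are assumptions, not values from the talk, and `cpu_alarm_params` is a hypothetical helper.

```python
def cpu_alarm_params(
    group_name: str, scale_out_policy_arn: str, threshold: float = 80.0
) -> dict:
    """Build parameters for a CloudWatch alarm on average group CPU."""
    return {
        "AlarmName": f"{group_name}-high-cpu",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        # Aggregate the metric across every instance in the group.
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group_name}],
        "Statistic": "Average",
        "Period": 300,  # seconds per evaluation window
        "EvaluationPeriods": 2,  # two consecutive breaches fire the alarm
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # Action to trigger, e.g. the ARN of a scale-out policy.
        "AlarmActions": [scale_out_policy_arn],
    }


# Placeholder ARN for illustration; a real one comes from the scaling policy.
params = cpu_alarm_params("workers", "arn:aws:autoscaling:example-policy")
```

With credentials configured, a dictionary like this could be passed to Boto3 as `boto3.client("cloudwatch").put_metric_alarm(**params)`. Building the parameters in a separate function keeps them easy to inspect and test without touching an AWS account.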


Storage & Databases


S3

• Large datasets

• Unlimited storage

• Archiving

RDS

• Amazon Aurora

ElastiCache

• Redis

• In-memory data store

Developer Tools

CodeDeploy & CodePipeline

• Automate deployments

• Integrated with our CI/CD solution

• Deploys to a real stack

• Runs Browser tests via

Development

• Boto (Boto3)

• Configuration management

Plus many others

• IAM

• KMS (EBS, RDS & S3)

• Lambda

• SES

• Route 53

• CloudTrail

• Evaluating others (ACM, ECS)

Hosting options

OpenEye Hosted

• OpenEye account/VPC

• No access to customer datacenter

• Multi-tenancy

• EC2 shared tenancy

• OpenEye administration and support

Customer Hosted

• Customer account/VPC

• Access to your datacenters

• Single tenancy

• EC2 shared/dedicated/host tenancy

• Customer administration and support

A platform requires this many pieces

• A small team could not have built Orion without AWS

• AWS services are continually more enterprise-friendly

• Great individually, very powerful together

• Enables startup agility at scale, every day:

• Hundreds of deployments

• Thousands of workflows

• Millions of messages

• Always automate

6) Lessons Learned

CloudTrail, CloudFormation, CodeDeploy, EC2

6) Generalizations

Automation and Scaling

[Figure: quadrant chart. Axes: Automation (Low to High) and Scaling (Low to High). Labels: Expert, Guru, Novice, High-functioning Novice, Empowering, Productive Organization.]

[Figure: quadrant chart. Axes: Accessibility (Low to High) and Scale (Low to High). Labels: Central resource, Individuals, Silos, Community, Enterprise, Productive Organization.]

Workflow Design

[Figure: quadrant chart. Axes: Ease-of-use (Low to High) and Power (Low to High). Labels: Hackers, Gurus, Community, Revolution, Innovation, Best Practices, Ownership.]

Confluence & Synergy

1) Compute

2) Analyze

3) Share

4) Develop

Change Your World

p.s. Come change the World with us!

Visit our booth #2235 to try Orion

Hiring:

1) Cloud Engineers

2) DevOps Engineers

3) Python experts

4) Scientific programmers

Thank you!

Remember to complete your evaluations!