Agenda

© 2014 IBM Corporation Platform Computing 1 IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud February 25, 2014


Page 1: Agenda

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

February 25, 2014

Page 2: Agenda

Agenda

Mapping clients needs to cloud technologies

Addressing your pain points

Introducing IBM Platform Computing Cloud Service

Product features and benefits

Use cases

Performance benchmarks

Page 3: Agenda

HPC cloud characteristics and economics are different than general-purpose computing

• High-end hardware and special purpose devices (e.g. GPUs) are typically used to supply the needed processing, memory, network, and storage capabilities

• The performance requirements of technical computing and service-oriented workloads mean that performance may suffer in a virtualized cloud environment, especially when latency or I/O is a constraint

• HPC cluster/grid utilization is usually in the 70-90% range, removing a major potential advantage of a public cloud service provider for stable workload volumes
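
The utilization point in the last bullet can be made concrete with a little arithmetic. The sketch below is illustrative only: the per-core-hour prices are hypothetical placeholders, not figures from this presentation, and the function names are invented for the example. It assumes owned capacity is paid for whether or not it is used, while on-demand capacity is paid for only while in use.

```python
def cost_per_used_core_hour(cost_per_provisioned_core_hour: float,
                            utilization: float) -> float:
    """Effective cost of each core-hour actually used on owned hardware."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return cost_per_provisioned_core_hour / utilization


def break_even_utilization(owned_cost: float, on_demand_cost: float) -> float:
    """Utilization above which owned capacity is cheaper per used core-hour."""
    return owned_cost / on_demand_cost


# Hypothetical prices (USD per core-hour), chosen only for illustration.
OWNED = 0.05
ON_DEMAND = 0.10

if __name__ == "__main__":
    for u in (0.3, 0.7, 0.9):
        owned = cost_per_used_core_hour(OWNED, u)
        print(f"utilization {u:.0%}: owned ${owned:.3f} "
              f"vs on-demand ${ON_DEMAND:.3f} per used core-hour")
    print(f"break-even utilization: {break_even_utilization(OWNED, ON_DEMAND):.0%}")
```

With these placeholder prices, owned hardware is cheaper per used core-hour at any utilization above 50%, which is why a cluster already running at 70-90% leaves little room for savings on stable workload volumes.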

HPC Workloads Recommended for Private Cloud

HPC Workloads with Best Potential for Virtualized Public & Hybrid Cloud

Primary HPC Workloads

Page 4: Agenda

IBM’s HPC cloud strategy provides a flexible approach to address a variety of client needs

• Private Clouds: Evolve existing infrastructure to HPC Cloud to enhance responsiveness, flexibility, and cost effectiveness.
• Hybrid Clouds: Enable integrated approach to improve HPC cost and capability
• Public Clouds: Access additional HPC capacity with variable cost model

60% (unlabeled figure on the slide)

Based on HPC Cloud’s potential impact, organizations are evolving their infrastructures to enable private cloud deployments, exploring hybrid clouds, and considering public clouds.

Page 5: Agenda

Are you experiencing any of these pain points?

• Unable to meet business objectives (delay to market, etc.)
• Existing resources insufficient to meet peak compute demand
  – Long run times on existing cluster or grid
  – No access to local technical computing resources (workstation users)
• Technical resources expensive and time consuming to acquire
• The skills/staff to architect and manage a technical computing infrastructure can be difficult to acquire

[Chart: Financial Services, Planned Daily Cycle (24 x 365)]

[Chart: Life Sciences, Planned Project (April to June)]

Page 6: Agenda

IBM Platform Computing Cloud Service: Making the cloud work for you

Build
• Complete, ready to run clusters in the cloud
• Add additional capacity in hours instead of months

Manage
• Seamless workload management, on-premise and in the cloud
• Transparent user experience

Support
• 24X7 cloud operation support
• Access to technical computing expertise when you need it

Protect
• Data encryption, dedicated physical machines and network
• Security through physical isolation

Complete, end to end dynamic cloud solution

Page 7: Agenda

Ready to use Platform LSF & Platform Symphony clusters in the cloud

IBM Platform Computing Cloud Service (SaaS)

IBM Platform LSF

IBM Platform Symphony

SoftLayer, an IBM Company (Infrastructure)

24X7 CloudOps Support

Client and ISV Applications

Page 8: Agenda

Dedicated physical and virtual machine infrastructure as a service

• 13+ data centers
• 17 network PoPs
• Global private network
• Bare metal and virtual machines

190,000+ servers
21,000+ customers
22,000,000+ domains

Page 9: Agenda

Ready to use Platform LSF & Platform Symphony clusters in the cloud

Differentiator: Workload I/O intensity
Rating scale: low intensity workloads to high intensity workloads
IBM advantages:
• SoftLayer’s architecture outperforms equivalent AWS instances by >50% for high I/O workloads

Differentiator: Control (APIs, hardware / network configurability)
Rating scale: low to high degree of control and customization
IBM advantages:
• SoftLayer offers hundreds of hardware configurations vs. 14 for AWS
• ~2,000 APIs for SoftLayer vs. ~60 for AWS and none for RAX

Differentiator: Integrated platform of multiple architectures
Rating scale: single platform, seamless integration
IBM advantages:
• Unified integration & control panel for multiple cloud architectures
• RAX requires paid bridge, different control interfaces

Providers rated: AWS, IBM, RAX

Page 10: Agenda

Non-shared physical machines for added security and performance

• Dedicated and isolated compute environment

• All machine instances are dedicated to the client

• Each cluster is isolated on a VLAN

• Only the VPN gateway has an addressable interface

• All customer data at rest is encrypted on shared file systems

• When machine instances are decommissioned, the disks are scrubbed using DoD-approved methods

Page 11: Agenda

Optimal performance for technical computing apps

Industrial Manufacturing Benchmark – Structural Mechanics

EDA Benchmark (IBM-MESA)

Note: Benchmark results were obtained by IBM and have not yet been externally audited or validated.

Page 12: Agenda

Run and supported by dedicated, 24X7 HPC Cloud Operations Team

CloudOps functions
• Pre-provisioning: Provide guidance to client on how to enable VPN, multi-cluster settings & security settings on the client on-premise environment
• One time setup testing: Extensive testing of the cluster prior to release to the client
• Extensive testing of the cluster on every event of flex-up prior to release to the client
• Email alerts prior to flex-down & cluster shutdown operations
• Email alerts in case of any overage (compute hours, download bandwidth)
• Provide billing details of monthly usage including overage details
• Provide support under IBM SLA by experts highly experienced in Platform Computing products

Value: quality, peace of mind & minimum disruption to business
• Extensive quality checks ensure minimum loss of usage hours & disruptions
• Proactive alerts ensure that in-progress critical jobs are not killed in case of flex-down, cluster shutdowns and overages
• Highly trained & experienced support ensures smooth on-boarding and minimizes disruptions

Page 13: Agenda

Industry-leading workload management

• 20 years managing distributed scale-out systems with 2000+ customers in many industries

• High performance workload management combined with intelligent resource scheduling engine

• Unmatched scalability (small clusters to global grids) and production-proven reliability

• Heterogeneous – manages System x and Power plus 3rd party systems, virtual and bare metal, accelerators / GPU, cloud, etc.

• Shared services for both compute and data intensive workloads

• Integrated solutions with vertical reference architectures

23 of 30 largest commercial enterprises

Over 5M CPUs under management

60% of top financial services companies

Page 14: Agenda

IBM Platform LSF Overview

Powerful workload management for demanding, distributed and mission-critical high performance computing environments.

Key Capabilities
• Powerful
  - Policy and resource-aware scheduling
  - Resource consolidation for optimal performance
  - Advanced self-management
• Flexible
  - Heterogeneous platform support
  - Policy-driven automation
  - CLI, web services, APIs
• Scalable
  - Thousands of concurrent users and jobs
  - Virtualized pool of shared resources
  - Flexible control, multiple policies

Client Benefits
• Optimal utilization: reduced infrastructure cost
• Robust capabilities: improved productivity
• High throughput: faster time to results

Page 15: Agenda

IBM Platform Symphony Overview

Low-latency grid management platform for distributed computing and analytics with sophisticated resource sharing

Key Capabilities
• Accelerates service-oriented applications
• Extreme app scalability and throughput with very low latency
• Compute and data-intensive applications on a single platform
• Sophisticated, hierarchical resource sharing
• Open and flexible: choice of OS, frameworks and languages

Client Benefits
• Increase performance and analytic result quality
• Reduce IT costs: increase utilization, simplify application onboarding, reduce administration costs

Low Latency / High throughput: Sub-millisecond, 17,000 tasks per second
Large Scale: 10k cores per application, 40k cores per grid
Efficient shared services
Heterogeneous & Open: Linux, Windows, AIX, C/C++, C#, Java, Excel, Python, R

Page 16: Agenda

Use case 1 – hybrid cluster

The problem
• Existing resources cannot meet peak demand
• Resources are expensive and time consuming to acquire
• Skills to architect and manage clusters are difficult to find
• Fixed or reduced budgets
• On-premise constraints in space, cooling and power

The solution
• Fully functioning IBM Platform LSF or Symphony clusters are provisioned on the SoftLayer cloud and connected to the on-premise cluster, expanding capacity as needed
• Leverage MultiCluster capability for managed forwarding of jobs from on-premise cluster to off-premise cluster

The value
• Access to additional compute capacity on a temporary basis as needed

• Near-zero wait times

• Reduce costs by paying for only what is used

• Pay for additional capacity as an operating expense

• Fully supported, end-to-end solution, from the on-premise to the on-cloud clusters

• Expected and reliable performance from running technical computing workloads on physical machines

• Transparent access to cloud resources, the end user experience does not change
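
The idea of forwarding jobs from the on-premise cluster to the cloud cluster when local capacity runs out can be illustrated with a toy routing rule. This is only a sketch under stated assumptions: it is not the LSF MultiCluster algorithm or its configuration, and `ClusterState`, `route_job` and the `max_pending_local` threshold are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Snapshot of the on-premise cluster (hypothetical fields)."""
    free_slots: int
    pending_jobs: int


def route_job(job_slots: int, local: ClusterState, max_pending_local: int) -> str:
    """Return "local" or "cloud" for a newly submitted job (toy policy)."""
    if job_slots <= local.free_slots:
        return "local"  # the job can start on-premise right away
    if local.pending_jobs < max_pending_local:
        return "local"  # backlog is still small, so queue on-premise
    return "cloud"      # backlog is too long, so forward to the cloud cluster


if __name__ == "__main__":
    busy = ClusterState(free_slots=0, pending_jobs=25)
    print(route_job(job_slots=4, local=busy, max_pending_local=10))
```

Because the decision is made by the scheduler rather than the user, the submission command stays the same wherever the job runs, which is the "transparent access" the slide describes.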

Page 17: Agenda

Use case 2 – stand-alone cluster in the cloud

The problem
• New and emerging need for technical computing
• Skills to architect and manage clusters are difficult to find
• Resources are expensive and time consuming to acquire
• Inconsistent demand does not justify the investment

The solution
• Fully functioning Platform LSF and Symphony clusters are provisioned on the SoftLayer cloud providing resources as needed

The value
• Market-leading Platform LSF and Platform Symphony software
• Access to technical computing resources on a temporary basis without the need to acquire, install and configure the infrastructure and cluster software
• Keep costs low by paying for only what is used
• Pay for capacity as an operating expense
• Fully supported solution
• Expected and reliable performance from running workloads on physical machines

Page 18: Agenda

Is IBM Platform Computing Cloud Service a good fit for you?

Business pain points
• Are you experiencing lost profit due to missed deadlines?
• Do you experience pressure to convert your compute environment capital expense to operational expense?
• Have you ever missed a deadline or delayed a project because technical computing resource procurement took too long?

Technology pain points
• Do your users ever scale back their analyses to lower fidelity or less accuracy in order to fit them into the local compute environment or to a time window?
• Do you regularly, occasionally, or permanently have fewer resources (CPUs, disk, memory, etc.) than you would like to have to service the users’ compute demand?
• Do you experience a large variance in compute resource utilization?
• Have you reached, or will you reach, the capacity of your datacenter(s), and do you need a plan to grow beyond that capacity?
• Are your customers asking you for cloud licenses for Platform LSF or Platform Symphony?

Page 19: Agenda

IBM Platform Computing Cloud Service: Making the Cloud Work for You

Unmatched Expertise: Analytics, Technical Computing, Software, Services and ISV Partnerships

IBM Hybrid Cloud

Consolidation: Supporting heterogeneous IBM and non-IBM infrastructure

Cloud Leadership: Expertise from Client Engagements

powered by

On SmartCloud

Unmatched Capabilities: Policy-driven Workload Management

On Premise

Software & Systems

Page 20: Agenda

Thank You

Page 21: Agenda

SoftLayer and Amazon EC2 Products tested

Name         IaaS Provider           CPU Cores  Memory (GB)  Disk Space (GB)  Physical / Virtual  Hourly Rate (USD)
SL PM        SoftLayer               16         64           1000[1]          Physical            $1.85[2]
SL VM        SoftLayer               8          8            500[3]           Virtual             $0.88
SL PM (ded)  SoftLayer               16         64           1000[1]          Physical            $3.83[5]
EC2 CC2      Amazon EC2 (CC2)        32         60.5         3360             Virtual             $2.40[4]
EC2 2XL      Amazon EC2 (c1.xlarge)  8          7            840              Virtual             $0.58

• SL Physical Machine: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
• SL Physical Machine (dedicated): Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
• SL Virtual Machine: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
• Amazon CCI2: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
• Amazon 2XL: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
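
One way to compare the listed configurations on a common footing is hourly rate per core. The sketch below simply divides the hourly rate by the core count taken from the table; it ignores the footnote markers (whose text is not part of this transcript) as well as differences in memory, disk and clock speed, so it is a rough normalization rather than a conclusion of the presentation.

```python
# (CPU cores, hourly rate in USD) as listed in the products-tested table.
# Footnote markers such as [2] and [5] are omitted here.
SYSTEMS = {
    "SL PM": (16, 1.85),
    "SL VM": (8, 0.88),
    "SL PM (ded)": (16, 3.83),
    "EC2 CC2": (32, 2.40),
    "EC2 2XL": (8, 0.58),
}


def hourly_rate_per_core(cores: int, hourly_rate: float) -> float:
    """Hourly rate divided by the number of CPU cores."""
    return hourly_rate / cores


RATE_PER_CORE = {
    name: hourly_rate_per_core(cores, rate)
    for name, (cores, rate) in SYSTEMS.items()
}

if __name__ == "__main__":
    # Print the systems from cheapest to most expensive per core-hour.
    for name, rate in sorted(RATE_PER_CORE.items(), key=lambda kv: kv[1]):
        print(f"{name:12s} ${rate:.4f} per core-hour")
```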

Page 22: Agenda

Memory Bandwidth

[Chart: STREAM (higher is better). Series: COPY, SCALE, ADD, TRIAD. Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM (ded).]

[Chart: STREAM Price Performance (higher is better). Series: COPY, SCALE, ADD, TRIAD. Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM (ded).]

Page 23: Agenda

CPU Performance

[Chart: SuperPI (lower is better). y-axis: Elapsed Time. Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM (ded).]

[Chart: SuperPI Price-Performance (higher is better). y-axis: throughput per dollar. Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM (ded).]

Page 24: Agenda

Network Bandwidth

[Chart: openMPI bandwidth. x-axis: Message Size (Bytes); y-axis: Bandwidth (Mbits/s). Series: SL VM, EC2 2XL, EC2 CCI2, SL PM, SL PM Dedicated.]

Page 25: Agenda

Network Latency

[Chart: openMPI Latency (lower is better), MPI on 2 nodes. Systems: SL VM, EC2 2XL, EC2 CCI2, SL PM, SL PM (ded).]

Page 26: Agenda

Input / Output Performance

[Chart: I/O Bandwidth - WRITE (higher is better). x-axis: I/O file size (factor of memory size); y-axis: kB/sec. Series: SL VM, EC2 2XL, EC2 CCI2, SL PM, SL PM Ded.]

[Chart: I/O Bandwidth - READ (higher is better). x-axis: I/O file size (factor of memory size); y-axis: kB/sec. Series: SL VM, EC2 CCI2, EC2 2XL, SL PM, SL PM Ded.]

Page 27: Agenda

Software Compilation

[Chart: Software Compile Performance (lower is better). y-axis: Elapsed Time (s). Systems: SL VM, SL PM, EC2 2XL, EC2 CCI, SL PM Ded.]

[Chart: Software Compile Price-Performance (higher is better). y-axis: Runs / $. Systems: SL VM, SL PM, EC2 2XL, EC2 CCI, SL PM Ded.]

Page 28: Agenda

Life Science (BWA)

Life Sciences Benchmark (BWA): elapsed time and price performance (lower is better for both)

System       Elapsed time (sec)  $ / run
SL PM (ded)  20846.481           22.2057
SL PM        26509.368           7.7912
SL VM        25897.44            6.3305
EC2 CCI2     22442.7             14.9618
EC2 2XL      37491               6.0402
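
The "$ / run" values appear to follow from the elapsed times and the hourly rates in the products-tested table. The sketch below assumes cost per run is elapsed hours multiplied by the hourly rate; that formula is an inference, not something the slides state. It reproduces the published figures for SL VM, EC2 CCI2 and EC2 2XL. The two physical-machine figures are not reproduced from the listed rates, which carry footnotes whose text is missing from this transcript, so they are left out.

```python
SECONDS_PER_HOUR = 3600.0


def cost_per_run(elapsed_seconds: float, hourly_rate_usd: float) -> float:
    """Assumed formula: elapsed hours multiplied by the hourly rate."""
    return elapsed_seconds / SECONDS_PER_HOUR * hourly_rate_usd


# (elapsed seconds from this slide, hourly rate in USD from the products table)
RUNS = {
    "SL VM": (25897.44, 0.88),
    "EC2 CCI2": (22442.7, 2.40),
    "EC2 2XL": (37491.0, 0.58),
}

if __name__ == "__main__":
    for name, (elapsed, rate) in RUNS.items():
        print(f"{name:9s} ${cost_per_run(elapsed, rate):.4f} per run")
```

Note that the fastest of these three systems (EC2 CCI2) is also the most expensive per run, which is why the deck reports elapsed time and price performance separately.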

Page 29: Agenda

EDA Benchmark (IBM-MESA)

[Chart: EDA - IBM Mesa (lower is better). y-axis: Elapsed Time (sec). Systems: SL PM (ded), SL PM, SL VM, EC2 2XL, EC2 CCI2.]

[Chart: EDA - IBM Mesa - Price-Performance (higher is better). y-axis: Runs / $. Systems: SL PM (ded), SL PM, SL VM, EC2 2XL, EC2 CCI2.]

Page 30: Agenda

Provisioning Time

[Chart: Provisioning Time (sec) (lower is better). Systems: SL PM, SL VM, EC2 CCI2, EC2 2XL, SL PM Ded.]

Page 31: Agenda

Industrial Manufacturing – Structural Mechanics

[Charts, four panels: One Node - S4D, One Node - S6, Two Nodes - S4D, Two Nodes - S6. x-axis: CPUs; y-axis: Speedup (relative to EC2 2XL). Series: SL PM, EC2 CCI2, SL VM, EC2 2XL, SL PM (ded).]
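
The y-axis on these panels is speedup relative to EC2 2XL. The slides do not say how the baseline run was chosen, so the sketch below assumes the usual definition: the baseline elapsed time divided by the elapsed time of the run being compared. The timings in it are hypothetical placeholders, not measurements from the benchmark.

```python
def speedup(baseline_seconds: float, seconds: float) -> float:
    """Speedup of a run relative to a baseline run (assumed definition)."""
    if baseline_seconds <= 0 or seconds <= 0:
        raise ValueError("elapsed times must be positive")
    return baseline_seconds / seconds


# Hypothetical elapsed times in seconds, for illustration only.
BASELINE_SECONDS = 1200.0
TIMINGS = {2: 640.0, 4: 330.0, 8: 180.0}  # CPUs -> elapsed seconds

if __name__ == "__main__":
    for cpus, seconds in TIMINGS.items():
        print(f"{cpus} CPUs: speedup {speedup(BASELINE_SECONDS, seconds):.2f}")
```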

Page 32: Agenda

Industrial Manufacturing – CFD

[Chart: OpenFoam Speedup Backplane (higher is better). x-axis: # cores; y-axis: Speedup (relative to EC2 2XL). Series: SL PM (ded), SL PM, SL VM, EC2 CCI2, EC2 2XL.]

[Chart: OpenFoam Speedup Ethernet (higher is better). x-axis: # cores; y-axis: Speedup (relative to EC2 2XL). Series: SL PM (ded), SL PM, SL VM, EC2 CCI2, EC2 2XL.]