Jason Stowe Condor Week 2009 April 22nd, 2009. Coming to Condor Week since 2005. Started as a User.

Page 1:

Jason Stowe

Condor Week 2009

April 22nd, 2009

Page 2:

Coming to Condor Week since 2005. Started as a User

Page 3:

Users hunger for features

Page 4:

AccountingGroups (2004/2005)
Configuration w/Pipes (2005/2006)
GroupResourcesUsed (2006/2007)
Condor in Cloud (2007/2008)
Resource Weights (2008/2009)

Based upon customer requests

Page 5:

Focus on software development for managing Condor at any scale, and provide services that complement the technology

Page 6:

Universities, Fortune 500s, Government Labs, Small/Medium Businesses that use Condor

Page 7:

Users like Condor because...
It’s open, it works, flexible, (corporations) no lock-in

API/Operating System, and...

Page 8:

The Community

Page 9:

Today, let’s talk about a few challenges, solutions

Page 10:

War Story #1: Compute & Data

Page 11:

Whenever you find or solve a computation problem, you discover a data problem.

Page 12:

“Dark” or Latent, Unused Storage on any OS/Device

Page 13:

Empty space dispersed across machines in unusable sizes

Page 14:

“We need more filer space, but we have empty space on all our machines.”

Page 15:

So we looked at Hadoop

Page 16:

New type of storage: Aggregated or “Cloud” Storage

Page 17:

Block Store Architecture

Page 18:

But how do we use it?

Page 19:

1.5 years ago: It works well to access it in Java, but what about mounting?

Page 20:

So we tried WebDAV

Page 21:

Next up, open source FUSE driver

Page 22:

Need: Windows/Linux, Reliable, Large Files, scalable, and Read/Write

Page 23:

Page 24:

Mountable drivers: Linux (FUSE) / Windows (IFS)

Page 25:

CloudFS Architecture

Page 26:

When we rolled it out...

Page 27:

Customers Asked for Surprising Features

HTTP/REST Protocols similar to Amazon S3
Reasons:

Installing mountable driver across servers/workstations prohibitive

Want similar interface to various cloud storage providers => Internal Cloud

FTP Interface – Because it is simple!

Page 28:

Status Today

Page 29:

Mountable Multi-platform Drivers. Linux: SUSE 10, RHEL/CentOS 4&5, Windows 2k3+, OSX 10.3+

Page 30:

Encryption to avoid snooping sensitive data

Page 31:

Data Nodes built on Java: Linux, Windows, OSX, Solaris

Page 32:

RESTful Storage Service & FTP interface

Page 33:

Management interface for controlling storage features

(Integrating with CycleServer)

Page 34:

Looking forward to condor_hadoop!

Page 35:

War Story #2: Cloud Calculations

Page 36:

Condor users
Peak vs. Median usage

Problem

Page 37:

Need for compute power comes up suddenly

Page 38:

Condor Users hunger for resources

Page 39:

Condor users balance “We need more servers for big runs” and “Our servers are 40% utilized”

Page 40:

Many ways to solve this problem using EC2

Page 41:

Use cases do exist for adding nodes to a local Condor pool using Amazon EC2

Page 42:

We favored entire pools in cloud

Page 43:

Data Scheduling, Performance issues

Page 44:

Run workflows faster using resources you could never buy...

Page 45:

can test CycleServer at a scale our users have and we don’t

Page 46:

Need 1000 node Condor Pool
Wait 15 minutes

Page 47:

Dynamic Resources => Pool can be sized to the jobs

Page 48:

1 core x 1000 hrs = 1000 cores x 1 hr = ~$200
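
The arithmetic on this slide can be sketched in a few lines of Python. The $0.20 per core-hour rate is not quoted in the talk; it is inferred from the slide's "~$200 for 1000 core-hours" figure, and the function name `run_cost` is made up for illustration. The sketch also assumes a flat rate with no rounding up to whole billing hours.

```python
# Assumption: flat rate inferred from the slide (~$200 / 1000 core-hours).
ASSUMED_RATE_PER_CORE_HOUR = 0.20


def run_cost(cores, hours, rate=ASSUMED_RATE_PER_CORE_HOUR):
    """Cost of renting `cores` cores for `hours` hours at a flat per-core-hour rate."""
    return cores * hours * rate


# Same 1000 core-hours of work, shaped two different ways.
serial_cost = run_cost(cores=1, hours=1000)    # one core for ~6 weeks
parallel_cost = run_cost(cores=1000, hours=1)  # a thousand cores for one hour

print(f"serial:   ${serial_cost:.2f}")
print(f"parallel: ${parallel_cost:.2f}")
```

Under these assumptions the bill depends only on total core-hours, so the wide, short run costs the same as the narrow, long one and finishes far sooner.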

Page 49:

Sounds good, but how do we do this for a Workflow like BLAST?

Page 50:

From e-science 2008:
For 64x the processors

Hadoop Running Blast: 57x
mpiBLAST: 52.4x

Page 51:

High-CPU Amazon EC2 nodes have best price/performance

Page 52:

Scalability: 2x CPUs = 1.9825x
64 CPUs = 60.7x Speed-up
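
The two figures on this slide are consistent with each other if the 1.9825x gain is compounded over each doubling from 1 to 64 CPUs. That reading is an assumption, not something the slide states. A minimal check:

```python
import math

PER_DOUBLING_SPEEDUP = 1.9825  # from the slide: 2x CPUs gives 1.9825x
CPUS = 64

# Assumption: the per-doubling gain holds at every step from 1 to 64 CPUs.
doublings = int(math.log2(CPUS))             # 64 CPUs = 6 doublings
speedup = PER_DOUBLING_SPEEDUP ** doublings  # compounded speed-up
efficiency = speedup / CPUS                  # fraction of ideal linear scaling

print(f"{speedup:.1f}x speed-up, {efficiency:.1%} parallel efficiency")
```

Under that assumption the compounded value matches the slide's 60.7x, which works out to roughly 95% of ideal linear scaling.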

Page 53:

Why High Throughput leads to Efficient Computing

Page 54:

Another User:
Worked with Varian - Mass Spectrometers

Other High-Tech Lab Equipment

Page 55:

Problem: Coming up on a conference, needed to run a large simulation

Page 56:

Six Weeks
On an internal Condor pool

Page 57:

Deployed a Condor pool in CycleCloud

Page 58:

Same 6-week Job

Page 59:

Ran < 1 Day

Page 60:

War Story #3: Management

Page 61:

Condor Tutorial mentions
“Why use a personal Condor?”

i.e. Condor on few nodes...

Page 62:

Condor on 1 computer gets you policies, fault-tolerance, etc.

Page 63:

Similarly, management issues come up even on small pools

Page 64:

Collaborating with U. of W. Madison

Page 65:

Managing Configuration Files (our Config with Pipes CW2006)

Page 66:

Exploring ClassAds/LogFiles becomes problematic

Page 67:

Visualization, Reporting, etc.

Page 68:

Man-decades on development of tools to assist running Condor

Page 69:

Have demo against Madison pool
Come see me. We’d love more use cases

Page 70:

Questions? Thank you

For more information go to: http://www.cyclecomputing.com

We constantly see opportunities for talented Condor folks, so please feel free to contact us!

Jason Stowe
jstowe - cyclecomputing.com