Advanced Condor mechanisms CERN Feb 14 2011
Condor Project, Computer Sciences Department, University of Wisconsin-Madison
www.condorproject.org
A better title… “Condor Potpourri”
› Igor's feedback: “Could be useful to people, but not Monday”
› If not of interest, new topic in 1 minute
Central Manager Failover
› Condor Central Manager has two services
› condor_collector
  • Now a list of collectors is supported
› condor_negotiator (matchmaker)
  • If it fails, an election process runs and another takes over
  • Contributed technology from Technion
Submit node robustness: Job progress continues if connection is interrupted
› Condor supports reestablishment of the connection between the submitting and executing machines.
  • If network outage between execute and submit machine
  • If submit machine restarts
› To take advantage of this feature, put the following line into your job’s submit description file:
  job_lease_duration = <N seconds>
  For example:
  job_lease_duration = 1200
Submit node robustness: Job progress continues if submit machine fails
› Automatic Schedd Failover
› Condor can support a submit machine “hot spare”
  • If your submit machine A is down for longer than N minutes, a second machine B can take over
  • Requires shared filesystem (or just DRBD*?) between machines A and B
*Distributed Replicated Block Device: www.drbd.org
DRBD
Interactive Debugging
› Why is my job still running?
  • Is it stuck accessing a file?
  • Is it in an infinite loop?
› condor_ssh_to_job
  • Interactive debugging in UNIX
  • Use ps, top, gdb, strace, lsof, …
  • Forward ports, X, transfer files, etc.
condor_ssh_to_job Example
% condor_q
-- Submitter: perdita.cs.wisc.edu : <128.105.165.34:1027> :
 ID      OWNER      SUBMITTED     RUN_TIME   ST PRI SIZE CMD
 1.0     einstein   4/15 06:52    1+12:10:05 R  0   10.0 cosmos
1 jobs; 0 idle, 1 running, 0 held
% condor_ssh_to_job 1.0
Welcome to [email protected]!
Your condor job is running with pid(s) 15603.
$ gdb -p 15603 …
How it works
› ssh keys created for each invocation
› ssh
  • Uses OpenSSH ProxyCommand to use the connection created by ssh_to_job
› sshd
  • Runs as same user id as job
  • Receives connection in inetd mode
  • So nothing new listening on network
  • Works with CCB and shared_port
What?? Ssh to my worker nodes??
› Why would any sysadmin allow this?
› Because the process tree is managed
  • Cleanup at end of job
  • Cleanup at logout
› Can be disabled by nonbelievers
Concurrency Limits
› Limit job execution based on admin-defined consumable resources
  • E.g. licenses
› Can have many different limits
› Jobs say what resources they need
› Negotiator enforces limits pool-wide
Concurrency Example
› Negotiator config file
  MATLAB_LIMIT = 5
  NFS_LIMIT = 20
› Job submit file
  concurrency_limits = matlab,nfs:3
  • This requests 1 Matlab token and 3 NFS tokens
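The token accounting in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not the negotiator's actual code: the function names `parse_concurrency_limits` and `can_start` are hypothetical, and treating a resource with no configured limit as unlimited is an assumption made for the sketch.

```python
def parse_concurrency_limits(spec):
    """Parse a submit-file value such as 'matlab,nfs:3' into token counts.

    A name without ':N' requests one token.
    """
    requests = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, count = item.partition(":")
        requests[name.strip().lower()] = int(count) if count else 1
    return requests


def can_start(requests, limits, in_use):
    """Return True if every requested token still fits under its pool-wide limit.

    Assumption for this sketch: a resource missing from `limits` is unlimited.
    """
    for name, count in requests.items():
        limit = limits.get(name)
        if limit is not None and in_use.get(name, 0) + count > limit:
            return False
    return True


# Limits mirroring MATLAB_LIMIT = 5 and NFS_LIMIT = 20 from the slide.
limits = {"matlab": 5, "nfs": 20}
requests = parse_concurrency_limits("matlab,nfs:3")
print(requests)
print(can_start(requests, limits, {"matlab": 4, "nfs": 17}))
```

With 4 Matlab and 17 NFS tokens already in use, the job still fits; with all 5 Matlab tokens taken it would have to wait.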
Green Computing
› The startd has the ability to place a machine into a low power state (Standby, Hibernate, Soft-Off, etc.)
  • HIBERNATE, HIBERNATE_CHECK_INTERVAL
  • If all slots return non-zero, then the machine can be powered down via the condor_power hook
  • A final acked ClassAd is sent to the collector that contains wake-up information
› Machine ads in “Offline State”
  • Stored persistently to disk
  • Ad updated with “demand” information: if this machine was around, would it be matched?
Now what?
condor_rooster
› Periodically wake up based on ClassAd expression (Rooster_UnHibernate)
› Throttling controls
› Hook callouts make for interesting possibilities…
Dynamic Slot Partitioning
› Divide slots into chunks sized for matched jobs
› Readvertise remaining resources
› Partitionable resources are cpus, memory, and disk
› See Matt Farrellee’s talk
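The carve-and-readvertise idea can be sketched as follows. This is a toy model under stated assumptions, not Condor's implementation: the `Resources` and `PartitionableSlot` names, the field names, and the units are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """The three partitionable resources named on the slide (units are assumed)."""
    cpus: int
    memory_mb: int
    disk_kb: int


class PartitionableSlot:
    """Toy model of a slot that is divided into chunks sized for matched jobs."""

    def __init__(self, total):
        self.remaining = total

    def carve(self, request):
        """Carve out a chunk matching the request, or return None if it does not fit.

        On success, `self.remaining` is what would be readvertised.
        """
        r = self.remaining
        if (request.cpus > r.cpus
                or request.memory_mb > r.memory_mb
                or request.disk_kb > r.disk_kb):
            return None
        self.remaining = Resources(
            r.cpus - request.cpus,
            r.memory_mb - request.memory_mb,
            r.disk_kb - request.disk_kb,
        )
        return request


slot = PartitionableSlot(Resources(cpus=8, memory_mb=16384, disk_kb=100000))
chunk = slot.carve(Resources(cpus=2, memory_mb=4096, disk_kb=10000))
print(chunk)
print(slot.remaining)
```

A request larger than what remains is simply refused in this sketch; how and when a real pool reclaims carved chunks is outside its scope.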
Dynamic Partitioning Caveats
› Cannot preempt original slot or group of sub-slots
  • Potential starvation of jobs with large resource requirements
› Partitioning happens once per slot each negotiation cycle
  • Scheduling of large slots may be slow
High Throughput Parallel Computing
› Parallel jobs that run on a single machine
  • Today 8-16 cores, tomorrow 32+ cores
› Use whatever parallel software you want
  • It ships with the job
  • MPI, OpenMP, your own scripts
  • Optimize for on-board memory access
Configuring Condor for HTPC
› Two strategies:
  • Suspend/drain jobs to open HTPC slots
  • Hold empty cores until HTPC slot is open
› We have a recipe for the former on the Condor Wiki
  • http://condor-wiki.cs.wisc.edu
› User accounting enabled by Condor’s notion of “Slot Weights”
CPU Affinity
Four-core machine running four jobs w/o affinity
[Figure: jobs j1, j2, j3, j4 above core1, core2, core3, core4; j3a, j3b, j3c, j3d spread across the cores]
CPU Affinity to the rescue
SLOT1_CPU_AFFINITY = 0
SLOT2_CPU_AFFINITY = 1
SLOT3_CPU_AFFINITY = 2
SLOT4_CPU_AFFINITY = 3
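What such a per-slot setting amounts to at the operating-system level can be sketched in Python on Linux. This is an illustration only, not how Condor itself applies affinity: `parse_affinity` and `pin_current_process` are hypothetical helper names, and the sketch relies on `os.sched_setaffinity`, which is available on Linux only. Child processes inherit the mask, which is why a job's helpers stay on the slot's core.

```python
import os
import re

# Matches lines of the form shown on the slide, e.g. "SLOT3_CPU_AFFINITY = 2".
_PATTERN = re.compile(r"^SLOT(\d+)_CPU_AFFINITY\s*=\s*([\d,\s]+)$")


def parse_affinity(config_text):
    """Map slot number -> set of core ids from SLOTn_CPU_AFFINITY lines."""
    mapping = {}
    for line in config_text.splitlines():
        match = _PATTERN.match(line.strip())
        if match:
            cores = {int(c) for c in match.group(2).split(",") if c.strip()}
            mapping[int(match.group(1))] = cores
    return mapping


def pin_current_process(cores):
    """Pin the calling process to `cores`.

    Returns the resulting mask, or None when the platform lacks
    sched_setaffinity or none of the cores is usable by this process.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    usable = set(cores) & os.sched_getaffinity(0)
    if not usable:
        return None
    os.sched_setaffinity(0, usable)
    return os.sched_getaffinity(0)


config = """
SLOT1_CPU_AFFINITY = 0
SLOT2_CPU_AFFINITY = 1
SLOT3_CPU_AFFINITY = 2
SLOT4_CPU_AFFINITY = 3
"""
print(parse_affinity(config))
```

A wrapper for the job in slot 3 could then call `pin_current_process(parse_affinity(config)[3])` before starting the job's processes.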
Four-core machine running four jobs w/ affinity
[Figure: jobs j1, j2, j3, j4 above core1, core2, core3, core4; j3a, j3b, j3c, j3d stacked on a single core]
Condor + Hadoop FS (HDFS)
› Condor+HDFS = 2 + 2 = 5 !!!
› A synergy exists (next slide)
  • Hadoop as distributed storage system
  • Condor as cluster management system
› Large number of distributed disks in a compute cluster
› Managing disk as a resource
condor_hdfs daemon
› Main integration point of HDFS within Condor
› Configures HDFS cluster based on existing condor_config files
› Runs under condor_master and can be controlled by existing Condor utilities
› Publishes interesting parameters to the Collector, e.g. IP address, node type, disk activity
› Currently deployed at UW-Madison
Condor + HDFS : Next Steps?
› Integrate with File Transfer Mechanism
› FileNode Failover
› Management of HDFS
› What about HDFS in a GlideIn environment??
› Online transparent access to HDFS??
Remote I/O Socket
› Job can request that the condor_starter process on the execute machine create a Remote I/O Socket
› Used for online access of file on submit machine, without Standard Universe
  • Use in Vanilla, Java, …
› Libraries provided for Java and for C, e.g.:
  • Java: FileInputStream -> ChirpInputStream
  • C: open() -> chirp_open()
› Or use Parrot!
[Figure: Remote I/O architecture across the Submission Site and the Execution Site. Components: shadow, I/O Server, Home File System, starter, I/O Proxy, Fork, Job, I/O Library. Links: Secure Remote I/O, Local I/O (Chirp), Local System Calls]
DMTCP
› Written at Northeastern U. and MIT
› User-level process checkpoint/restart library
› Fewer restrictions than Condor’s Standard Universe
  • Handles threads and multiple processes
  • No re-link of executable
› DMTCP and Condor Vanilla Universe integration exists via a job wrapper script
Questions?
Thank You!