CHEP 2000


description

CHEP 2000: Smart Resource Management Software in High Energy Physics. Wolfgang Gentzsch and Lothar Lippert, Gridware GmbH & Inc. Padua, 9 February 2000. Resource Management with CODINE / GRD: technical requirements and features, and what we offer to help HEP computing.

Transcript of CHEP 2000

  • CHEP 2000: Smart Resource Management Software in High Energy Physics

    Wolfgang Gentzsch and Lothar Lippert, Gridware GmbH & Inc.

    Padua, 9 February 2000

  • CHEP 2000: Resource Management with CODINE / GRD

    Technical Requirements and Features: what do we offer to help HEP computing

    Gridware - The Company: technology leader in resource management

    A special offer to the HEP community: our answer to falling hardware prices

  • Technical Requirements and Features: Array Jobs; Advanced Queue Concept; Policy Management; Separation of Components; Solutions for mixing interactive and batch; Simplified system administration; AFS Support; CORBA Interface; All classic Features; Availability

  • Array Jobs: one single submit command for thousands of similar jobs (the job script starts with #!/bin/sh...)

    Example: qsub -t 1-1000:1 jobscript.sh

    This creates 1000 instances of a single job. The whole array, or part of it, can be manipulated (deleted, suspended, ...) with one command. The number of instances is unlimited.
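To make the `-t 1-1000:1` notation concrete, here is a minimal Python sketch of how a range written as first-last:step expands into task indices. The function name `expand_task_range` is hypothetical and this is not CODINE's own parser; it only illustrates the arithmetic behind "1000 instances".

```python
def expand_task_range(spec: str) -> list:
    """Expand a 'first-last:step' range (last and step optional) into task indices."""
    bounds, _, step_text = spec.partition(":")
    first_text, _, last_text = bounds.partition("-")
    first = int(first_text)
    last = int(last_text) if last_text else first
    step = int(step_text) if step_text else 1
    if first < 1 or last < first or step < 1:
        raise ValueError(f"invalid task range: {spec!r}")
    # Both ends of the range are inclusive.
    return list(range(first, last + 1, step))


tasks = expand_task_range("1-1000:1")
print(len(tasks), tasks[0], tasks[-1])  # 1000 1 1000
```

Each index would correspond to one instance of the job, which is why a single command can address the whole array or a sub-range of it.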

  • Advanced Queue Concept

    Emergency Room Concept: The whole cluster can be addressed. Soft requests are supported. No queue sits empty while others are more than full. Each host can be treated with different policies. Users just request resources. Result: higher efficiency.

    Example: qsub -l mem_free=10M jobscript.sh

    Grocery Store Concept: The cluster is split. Queues may run empty. Users have to decide on a queue. A job has to stay in line even if other resources are unused.

    Example: qsub -q 10MQ jobscript.sh

    [Diagram labels: Job, Cluster, Dispatch; Job, Q1, Q2]
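The difference between the two concepts can be sketched in a few lines of Python. This is an illustrative toy model, not CODINE's scheduler: the `Host` class, both dispatch functions, the host names, and the second queue name `100MQ` are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    queue: str          # fixed queue the host belongs to (grocery-store model)
    mem_free_mb: int
    busy: bool = False


def dispatch_by_resource(hosts: List[Host], mem_needed_mb: int) -> Optional[Host]:
    """Emergency-room style: any idle host in the cluster with enough free memory."""
    for host in hosts:
        if not host.busy and host.mem_free_mb >= mem_needed_mb:
            return host
    return None


def dispatch_by_queue(hosts: List[Host], queue_name: str) -> Optional[Host]:
    """Grocery-store style: only idle hosts of the queue the user picked."""
    for host in hosts:
        if not host.busy and host.queue == queue_name:
            return host
    return None


hosts = [
    Host("node1", queue="10MQ", mem_free_mb=16, busy=True),
    Host("node2", queue="100MQ", mem_free_mb=128),
]
print(dispatch_by_resource(hosts, 10))   # node2 qualifies, so the job starts
print(dispatch_by_queue(hosts, "10MQ"))  # None: the job waits although node2 is idle
```

With the resource request the job runs on whichever host fits; with the fixed queue it stays in line behind the busy host.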

  • Policy Management

    Fairshare.

    Override System: temporarily boosts a project/job/group/department.

    [Chart: share utilization over time, with a "raise group" override. Shares: Group1 20%, Group2 30%, Group3 50%.]
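As a rough illustration of how a fairshare policy with a temporary override might pick the next group to serve, consider the Python sketch below. It is not the CODINE / GRD algorithm: the function `next_group`, the multiplicative `boost` factor, and the usage figures are all assumptions; only the 20% / 30% / 50% shares come from the slide.

```python
def next_group(target_share, usage, boost=None):
    """Pick the group whose actual usage lags its (possibly boosted) target the most."""
    boost = boost or {}
    total_usage = sum(usage.values()) or 1.0
    # An override multiplies a group's weight; weights are then renormalised.
    weights = {g: s * boost.get(g, 1.0) for g, s in target_share.items()}
    total_weight = sum(weights.values())

    def deficit(group):
        target = weights[group] / total_weight
        actual = usage.get(group, 0.0) / total_usage
        return target - actual

    return max(weights, key=deficit)


shares = {"Group1": 0.20, "Group2": 0.30, "Group3": 0.50}
usage = {"Group1": 15.0, "Group2": 50.0, "Group3": 35.0}  # CPU hours, made up

print(next_group(shares, usage))                          # Group3
print(next_group(shares, usage, boost={"Group1": 3.0}))   # Group1
```

Without an override, Group3 is furthest below its 50% share; raising Group1 temporarily moves it to the front without changing the configured shares.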

  • Separation of Components: the master and the scheduler are separated. Benefits: scalability; high performance; good response time; faster job placement.

  • Simplified system administration (configuration changes without any pain): No daemon restarts necessary. Machines can be added on the fly. The entire cluster can be installed from one workstation. No submit daemons or configuration needed on clients. Optimized architecture provides reliability.

  • What else? CORBA Interface. AFS Support. All classic features: accounting, monitoring, suspension, sensors ... Interactive vs. Batch: time windows, automatic suspend, migration, ... Availability: all leading Unix platforms.

  • The company. GENIAS: based in Germany; European Union funded projects; R&D company. Chord: located in California; leader in sales of RMS. Technology leader in Resource Management. Goal: make CODINE the world standard in Resource Management.

  • Our experience. EU funded research projects: REMUS, UNICORE, ... Research & Development: DESY Zeuthen (long relationship), CASPUR (recently switched to CODINE), MPI (Max Planck Institutes), ... Industry: BMW, SAAB, SIEMENS, ...

  • Contact Us: http://www.gridware.de

    [email protected]

    +49 (0) 9401 92 00 0

    [email protected]

    Range: 2 - 4 CPUs. The software has to support ranges; CODINE supports ranges.

    Restart: the job grows or keeps the same number of CPUs. Make sure that the job does not fail:

    at start-up, and with monitoring while running.

    Get shorter response times:

    waiting for the scheduler, then a faster job run; overall: more jobs in the same time period.
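The idea of a CPU range such as 2 - 4 can be sketched as follows. This is a hedged illustration in Python, not CODINE's implementation: `grant_slots` is a hypothetical helper that grants as many CPUs as are free within the requested range and otherwise keeps the job waiting.

```python
def grant_slots(min_cpus, max_cpus, free_cpus):
    """Return how many CPUs to grant for a 'min-max' request, or None to keep waiting."""
    if min_cpus < 1 or max_cpus < min_cpus:
        raise ValueError("invalid CPU range")
    if free_cpus < min_cpus:
        return None  # not enough CPUs yet; the job stays pending
    return min(max_cpus, free_cpus)


print(grant_slots(2, 4, 3))  # 3
print(grant_slots(2, 4, 8))  # 4
print(grant_slots(2, 4, 1))  # None
```

A job that accepts a range can start as soon as the minimum is available instead of waiting for the maximum, which is one way to read the "shorter response times" point above.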