SAM Resource Management

September 4, 2001 Lee Lueking, FNAL

Lee Lueking

CHEP 2001, September 3-8, 2001

Beijing, China


Intro to SAM

• SAM is Sequential Access to data via Meta-data

• Project started in 1997 to handle D0’s needs for the Run II data system.

• Current SAM team includes:

– Lauri Loebel-Carpenter, Lee Lueking*, Carmenita Moore, Igor Terekhov, Julie Trumbo, Sinisa Veseli, Matthew Vranicar, Stephen P. White, Victoria White*. (*project leaders)

• http://d0db.fnal.gov/sam


Overview

• Goals of Resource Management

• Users, Groups and Access modes

• Resources and Resource Management Strategies

• Implementation

– System Configuration

– Rules and Policies

– Disk Cache Management

– Fair Share scheduling

– Resource Co-allocation

• Plans and Conclusion


Goals of Resource Management

• Implement experiment policies on prioritization and fair sharing in resource usage, by user categories (access modes, research groups, etc.)

• Maximize throughput in terms of real work done (i.e. user jobs and not system internal jobs such as data transfers)


Groups

• Users whose datasets, processing styles and goals are largely shared.

• Defined by:

– physics topics, like Higgs, Top, W/Z, B, QCD, and New Phenomena

– detector elements like calorimeter, silicon tracking, muon, and so on

– particle identification like jets, electron, muon, and tau.

• Users must be registered, and each individual can be included in many groups.


Access Modes

• Storage

– Data acquisition storage

– Monte Carlo data storage

– General User data storage

• Delivery

– Frequently accessed data

– Cooperative access and processing

– Data file delivery on demand

– Random access event selection


Resources

• Tape mounts

• Tape volume access

• Tape drive usage

• Network throughput

• Disk cache

• Processing CPU

• Memory cache


Management Strategies

• Divide the problem into a three-tier hierarchy: Local (station), Site, Global

• Hardware Configuration: Mass Storage System (ATL) access, Network, Disk assignments.

• Establish Rules: Group allocations, Access mode priorities, Data routing paths, Type of processing, etc.

• Algorithms to combine rules


The Hierarchy of Resource Managers

[Diagram: three levels of resource managers (RM)]

• Global RM: sites connected by WAN

• Site RM: stations and MSS’s connected by LANs

• Station – Local RM: batch queues and disks

• Other diagram labels: Experiment Policies, Fair Share Allocations, Cost Metrics


Implementation


Overview of SAM

[Diagram: SAM servers. Arrows indicate control and data flow.]

• Components: Database Server(s) (to Central DB), Name Server, Site or Global Resource Manager(s), Log server, Station 1 Servers, Station 2 Servers, Station 3 Servers, Station n Servers, Mass Storage System(s)

• Scope labels: Shared Globally, Local, Shared Locally


The SAM Station

• Responsibilities

– Cache Management

– Project (Job) Management

– Movement of data files to/from MSS or other Stations

• Consists of a set of inter-communicating servers:

– Station Master Server,

– File Storage Server,

– File Stager(s),

– Project (Job) Manager(s)


Components of a SAM Station

[Diagram: Station & Cache Manager, File Storage Server, File Stager(s), Project Managers, Producers/Consumers, eworkers, File Storage Clients, Cache Disk, Temp Disk, and MSS or Other Station (shown twice). Arrows distinguish data flow from control.]


Station Configuration

• Disks assigned to the cache

• Batch system used

• Batch queues available

• Batch queue depth

• Processing capacity CPU and physical memory

• Mass Storage Systems available

• Inter-station transfer mechanism: BBFTP, rcp

• Disk accessibility for distributed cluster

• Network connection, bandwidth, subnet for each machine

• Security issues, access to Kerberos tickets, etc.

• Waits, timeouts and retries on failure conditions


Rules and Policies

• Disk cache allocated to each group

• Disk cache refreshment algorithm for each group: LRU, FIFO, etc.

• Minimum amount of data to deliver at a time from each tape for a project

• Order in which files are brought into the cache

• Through which station files will be routed when retrieving from a particular Mass Storage System

• Which data access activities have the highest priority

• Which data storing activities have the highest priority

• To which MSS’s are files stored, and to which tapes

• Sharing of the resources of a station among groups

• Which users belong to which groups

• How many projects per group are allowed

• What processing activities are allowed on each station? *

• To which stations should data access and processing activities be sent? *

• How should the resources of a local cluster of stations be shared among groups? *

* Currently done by administrators
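The "minimum amount of data to deliver at a time from each tape" rule can be illustrated with a small sketch. It is not SAM code: the function name `plan_tape_deliveries`, the request tuple layout, and the threshold parameter `min_unit_kb` are all assumptions made for illustration. The idea shown is only that requests are grouped by tape and a tape is mounted once enough data is pending on it.

```python
from collections import defaultdict


def plan_tape_deliveries(requests, min_unit_kb):
    """Group requested files by tape and pick the tapes worth mounting.

    `requests` is a list of (file_name, tape_label, size_kb) tuples
    (a hypothetical layout). Returns (ready, deferred): `ready` maps each
    tape whose pending data reaches `min_unit_kb` to its file names;
    `deferred` holds the tapes that are still below the threshold.
    """
    by_tape = defaultdict(list)
    for file_name, tape, size_kb in requests:
        by_tape[tape].append((file_name, size_kb))

    ready, deferred = {}, {}
    for tape, files in by_tape.items():
        pending_kb = sum(size for _, size in files)
        names = [name for name, _ in files]
        if pending_kb >= min_unit_kb:
            ready[tape] = names      # enough data to justify a mount
        else:
            deferred[tape] = names   # wait for more requests on this tape
    return ready, deferred
```

A real policy would also need a timeout so that deferred requests are eventually served; that is left out here.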


Station Management

• Caches

– Allocations established for groups on each station.

– Resources are allocated by group

    • Total Size

    • Lock (pin) Size

    • Refresh algorithm: LRU, FIFO, …

– No rigid assignment to particular physical disks.

• Projects

– Number of concurrent projects for each group, on each station.

• Administration is by authorized users only

– Station admins

– Group admins
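A minimal sketch of how a per-group cache allocation with a total size, a lock (pin) size, and an LRU refresh algorithm might behave. The class `GroupCache` and its methods are hypothetical names chosen for this example and are not the station's actual interface; sizes are assumed to be in KB.

```python
from collections import OrderedDict


class GroupCache:
    """Hypothetical per-group cache allocation with LRU replacement."""

    def __init__(self, max_kb, max_lock_kb):
        self.max_kb = max_kb            # total size quota for the group
        self.max_lock_kb = max_lock_kb  # how much of it may be pinned
        self.files = OrderedDict()      # name -> size, least recent first
        self.pinned = set()

    def used_kb(self):
        return sum(self.files.values())

    def locked_kb(self):
        return sum(self.files[name] for name in self.pinned)

    def touch(self, name):
        """Mark a cached file as most recently used."""
        self.files.move_to_end(name)

    def pin(self, name):
        """Lock a cached file against eviction; False if over the lock quota."""
        if name in self.pinned:
            return True
        if self.locked_kb() + self.files[name] > self.max_lock_kb:
            return False
        self.pinned.add(name)
        return True

    def add_file(self, name, size_kb):
        """Cache a file, evicting unpinned files in LRU order as needed.

        Returns the list of evicted names, or None if the file cannot fit.
        """
        if name in self.files:
            self.touch(name)
            return []
        # Pinned files cannot be evicted, so check feasibility first.
        if size_kb + self.locked_kb() > self.max_kb:
            return None
        evicted = []
        for victim in list(self.files):  # least recently used first
            if self.used_kb() + size_kb <= self.max_kb:
                break
            if victim in self.pinned:
                continue
            del self.files[victim]
            evicted.append(victim)
        self.files[name] = size_kb
        return evicted
```

Because the quota is tracked per group rather than per disk, the sketch says nothing about which physical disk holds a file, in line with the "no rigid assignment" point above.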


Station Administration: Dump (1)

lueking@d0mino:~ % sam dump station –groups

*** BEGIN DUMP STATION central-analysis, id=21 running at d0mino 5 days 22 hours 24 minutes 20 seconds, admins: lueking

Known batch systems: lsf

Default batch system: lsf

No Source location is preferred

There are 1 authorized transfer groups

Full delivery unit is enforced; external deliveries are unconstrained


Station Administration: Dump (2)

AUTHORIZED GROUPS:

group algo: admins: cope lueking melanson terekhov veseli white , swap policy: LRU, fair share: 0, quotas (cur/max): projects = 5/50, disk: 72838247KB/100000000KB, locks:0B/30000000KB

group cal: admins: lueking terekhov veseli white , swap policy: LRU, fair share: 0, quotas (cur/max): projects = 1/10, disk: 11856085KB/78125MB, locks:0B/78125MB

group demo: admins: lueking terekhov veseli white , swap policy: LRU, fair share: 0.608163, quotas (cur/max): projects = 2/50, disk: 4867877KB/5000000KB, locks:0B/0KB

group dzero: admins: lueking melanson terekhov veseli white , swap policy: LRU, fair share: 0.142857, quotas (cur/max): projects = 10/100, disk: 499860527KB/500000000KB, locks:0B/100000000KB

group emid: admins: lueking terekhov veseli white , swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/10, disk: 6396015KB/10000000KB, locks:0B/10000000KB

group test: admins: lueking terekhov veseli white , swap policy: LRU, fair share: 0.11512, quotas (cur/max): projects = 1/20, disk: 21381359KB/26000000KB, locks:237179KB/20000000KB

group thumbnail: admins: lueking melanson schellma , swap policy: LRU, fair share: 0.13386, quotas (cur/max): projects = 0/5, disk: 20687259KB/50000000KB, locks:0B/0KB

*** END OF STATION DUMP ***
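The group lines in the dump are regular enough to parse. The sketch below is an illustration based only on the lines shown above, not a documented output format; the function name `parse_group_line` is hypothetical, and the unit conversion assumes binary multiples (1 MB = 1024 KB), which the dump does not state.

```python
import re

# Assumed unit sizes in KB (binary multiples); the dump does not state them.
_KB_PER_UNIT = {"B": 1 / 1024, "KB": 1, "MB": 1024}

# Pattern derived from the sample group lines only.
_GROUP_LINE = re.compile(
    r"group (?P<name>\S+): admins: (?P<admins>.*?) ?, "
    r"swap policy: (?P<policy>\w+), fair share: (?P<share>[\d.]+), "
    r"quotas \(cur/max\): projects = (?P<proj_cur>\d+)/(?P<proj_max>\d+), "
    r"disk: (?P<disk_cur>\d+)(?P<disk_cur_u>[KM]?B)"
    r"/(?P<disk_max>\d+)(?P<disk_max_u>[KM]?B)"
)


def parse_group_line(line):
    """Parse one 'group ...' line of the dump; return None if it does not match."""
    m = _GROUP_LINE.match(line.strip())
    if m is None:
        return None
    disk_cur = int(m["disk_cur"]) * _KB_PER_UNIT[m["disk_cur_u"]]
    disk_max = int(m["disk_max"]) * _KB_PER_UNIT[m["disk_max_u"]]
    return {
        "name": m["name"],
        "admins": m["admins"].split(),
        "swap_policy": m["policy"],
        "fair_share": float(m["share"]),
        "projects": (int(m["proj_cur"]), int(m["proj_max"])),
        "disk_kb": (disk_cur, disk_max),
        # Fraction of the group's disk quota currently in use.
        "disk_fraction": disk_cur / disk_max if disk_max else None,
    }
```

Applied to the dump above, a parser like this makes it easy to spot groups that are close to their disk quota, such as dzero.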


Adding Data to the System

• Metadata descriptions for:

– Detector data

– Monte Carlo data

– Processing details

• Mapping to storage locations (which we call auto-destinations)

• Station forwarding specification


Forwarding + Caching = Global Replication

[Diagram: sites, stations, Mass Storage Systems, a user (producer), and replicas linked by WAN data flow. Labels: NIKHEF (Amsterdam), 155 Mbps, Sara, Fermilab, D0robot.]


Routing + Caching = Global Replication

[Diagram: sites, stations, Mass Storage Systems, a user (producer), and replicas linked by WAN data flow.]
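One way to read "Routing + Caching = Global Replication" is that every station on a file's route keeps a copy, so later requests can start from the nearest existing replica. The sketch below illustrates that reading only; the function `deliver`, the route representation, and the station names in the usage note are assumptions, not SAM's routing logic.

```python
def deliver(file_name, route, caches):
    """Deliver a file along `route` (source first, requester last).

    `caches` maps each station name to the set of files it holds. The first
    station on the route is assumed to be able to supply the file. The
    transfer starts from the station closest to the requester that already
    holds the file, and every station it passes through keeps a replica.
    Returns the list of hops actually used.
    """
    start = 0
    for i, station in enumerate(route):
        if file_name in caches[station]:
            start = i  # a later station on the route already has a copy
    hops = route[start:]
    for station in hops:
        caches[station].add(file_name)
    return hops
```

With hypothetical stations `source`, `relay`, and `remote-1`, a first request travels the whole route; a second request from `remote-2` routed through `relay` then needs only the hop from `relay`.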


Resource Management Approaches

• Fair Sharing (policies)

– Allocation of resources and scheduling of jobs

– The goal is to ensure that, in a busy environment, each abstract user gets a fixed share of “resources” or gets a fixed share of “work” done

• Co-allocation and reservation (optimization)


Fair Share and Computational Economy

• Jobs, when executed, incur costs (through resource utilization) and realize benefits (through getting work done)

• Maintain a tuple (vector) of cumulative costs/benefits for each abstract user and compare it to that user's allocated fair share to raise or lower the user's priority

• Incorporate all known resource types and benefit metrics, so the scheme stays fully flexible
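The cost-vector idea can be sketched as follows. This is an illustration under stated assumptions, not the SAM scheduler: the resource names, the weights that collapse a cost vector into one number, and the rule "priority = allocated share minus consumed fraction" are all choices made for this example.

```python
def usage_fractions(costs, weights):
    """Collapse each user's cost vector into a fraction of total weighted cost.

    `costs` maps user -> {resource: cumulative amount};
    `weights` maps resource -> relative cost per unit (assumed values).
    """
    totals = {
        user: sum(weights[res] * amount for res, amount in vector.items())
        for user, vector in costs.items()
    }
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {user: 0.0 for user in costs}
    return {user: total / grand_total for user, total in totals.items()}


def priorities(costs, weights, shares):
    """Priority = allocated fair share minus consumed fraction (higher runs first)."""
    used = usage_fractions(costs, weights)
    return {user: shares[user] - used[user] for user in costs}


def next_to_run(costs, weights, shares):
    """Pick the abstract user furthest below its fair share."""
    prio = priorities(costs, weights, shares)
    return max(prio, key=prio.get)
```

For example, with weights of 5 per tape mount and 1 per CPU hour, a group that has used 10 mounts and 100 CPU hours against a 0.6 share ranks below a group that has used 2 mounts and 20 CPU hours against a 0.4 share, so the second group's next job is scheduled first. New resource types or benefit metrics enter simply as extra keys in the vectors and weights.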


Job Control: Station Integration with the Abstract Batch System

[Diagram: message flow among Client, Local RM (Station Master), Job Manager (Project Master), Batch System, Process Manager (SAM wrapper script), and User Task. Message labels: Sam submit, submit, dispatch, invoke, Sam condition satisfied, resubmit, setJobCount/stop, invoke, jobEnd.]

1. Fair Share Job Scheduling

2. Resource Co-allocation


Future Plans

• Tape mounts were a critical resource in the past, but the inter-station movement of data is perceived to be a future constraint as more stations are deployed with large disk caches.

• In addition to moving the data to computing resources, the system will evolve to move the processing to the data.

• Job control language that will specify each task at a level that will allow the system to decide when and where it can optimally be processed.

• Incorporate standard grid components as availability and need dictate: GridFTP, GSI, Condor, DAGMan, etc.


Conclusion

• The SAM system used for D0 data management and access represents a large step toward a global data grid.

• Resources are managed at station, site and global levels.

• The system is governed by station configuration and rules/policies.

• Fair share resource allocation and scheduling control the amount of work done by each group, access mode, etc.

• Co-allocation coordinates data and processing to utilize the overall system most effectively.
