CERN SRM Development

Benjamin Coutourier, Shaun de Witt

CHEP06 - Mumbai

Background
- Original version based on SRM 1.1 specification
  - Implemented by CERN
- Latest version based on SRM 2.1.1 specification
  - Collaborative effort
    - CERN (CH)
    - RAL (UK)
  - Based on modified WSDL (http://sdm.lbl.gov/srm-wg/srm.v2.1.1.modified.wsdl)

Tools
- Selected gsoap-2.7.2
- cgsi-soap plugin
- Oracle (10.2.1)
- umbrello (http://uml.sourceforge.net)
- g++ (3.2.3)
- valgrind

Design Objectives
- Low latency
  - Short requests handled synchronously
  - Longer requests (involving CASTOR stager) mostly handled asynchronously
  - Multi-threading architecture
- Robustness
  - Asynchronous requests stored in database
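The split between synchronous and asynchronous handling can be sketched roughly as follows. This is a minimal illustration, not the CASTOR SRM code: the set of "short" request types, the table layout, the status strings and the use of SQLite (the real server used Oracle) are all assumptions made for the example.

```python
import sqlite3
import uuid

# Assumption: which request types count as "short" is illustrative only.
SHORT_REQUESTS = {"srmLs", "srmMkdir", "srmRm"}


def open_store():
    """Create an in-memory request table (a stand-in for the SRM database)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE requests ("
        "token TEXT PRIMARY KEY, kind TEXT, path TEXT, status TEXT)"
    )
    return db


def handle(db, kind, path):
    """Answer short requests at once; persist long ones and return a token."""
    if kind in SHORT_REQUESTS:
        # Short request: handled synchronously, nothing is stored.
        return {"status": "SUCCESS", "token": None}
    # Long request: stored in the database so it survives a restart,
    # then picked up later by a separate worker.
    token = str(uuid.uuid4())
    db.execute(
        "INSERT INTO requests VALUES (?, ?, ?, ?)",
        (token, kind, path, "QUEUED"),
    )
    db.commit()
    return {"status": "QUEUED", "token": token}


def pending(db):
    """Return the stored requests that still wait for processing."""
    return db.execute(
        "SELECT token, kind, path FROM requests WHERE status = 'QUEUED'"
    ).fetchall()
```

Storing the request before replying is what gives the robustness mentioned above: the client holds a token, and the work is not lost if the process handling it stops.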

Design Objectives
- Interoperability
  - Actually a common theme with all SRMs
  - Using common WSDL
  - Tested CASTOR SRM with DCACHE clients and DCACHE SRM with CASTOR clients
- Robustness
  - Load testing: submitting many requests near simultaneously, using Tier1 machines

Design
[Architecture diagram. Components shown: Clients; SRM (SRM Server, SRM Daemon, Database); CASTOR (CASTOR Stager, CASTOR Nameserver)]

Design
- Significant reuse of CASTOR code
  - dlf
  - threadpools
  - database services
  - IObject model

Server Design
- Thread pool
  - Default 10 threads, but can be overridden
  - Currently no maximum, but it should probably exist
- Soap backlog
  - Default 40 messages, but can be overridden
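As a rough illustration of overridable defaults, the sketch below loads the two server settings from a dictionary of overrides. The names `ServerConfig`, `load_config` and the keys `threads` and `backlog` are hypothetical. The optional `max_threads` cap is not part of the server described here; it only shows what the missing maximum might look like.

```python
from dataclasses import dataclass

DEFAULT_THREADS = 10   # default thread pool size from the slide
DEFAULT_BACKLOG = 40   # default SOAP backlog (messages) from the slide


@dataclass(frozen=True)
class ServerConfig:
    threads: int = DEFAULT_THREADS
    backlog: int = DEFAULT_BACKLOG


def load_config(overrides, max_threads=None):
    """Build a config from overrides, falling back to the defaults.

    max_threads is a hypothetical upper bound; None means "no maximum".
    """
    threads = int(overrides.get("threads", DEFAULT_THREADS))
    backlog = int(overrides.get("backlog", DEFAULT_BACKLOG))
    if threads < 1 or backlog < 1:
        raise ValueError("threads and backlog must be positive")
    if max_threads is not None:
        threads = min(threads, max_threads)
    return ServerConfig(threads=threads, backlog=backlog)
```

With no overrides, `load_config({})` yields the defaults; with `max_threads=32`, a request for 64 threads is clamped to 32.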

Daemon Design
- Four dedicated threads
  - Pool of threads for PUT requests
  - Pool of threads for GET requests
  - Single thread for COPY requests
  - Single thread for SRM garbage collection
- Selection from database triggered by database entry (TBC)
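The worker layout above could be sketched with standard thread pools, as below. The pool sizes for PUT and GET are assumptions (the slides do not give them), and the functions `make_workers`, `dispatch` and `shutdown` are hypothetical names used only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def make_workers(put_threads=4, get_threads=4):
    """Create one executor per request kind.

    PUT and GET get a pool each; COPY and garbage collection get a
    single thread each. Pool sizes are illustrative assumptions.
    """
    return {
        "PUT": ThreadPoolExecutor(max_workers=put_threads),
        "GET": ThreadPoolExecutor(max_workers=get_threads),
        "COPY": ThreadPoolExecutor(max_workers=1),
        "GC": ThreadPoolExecutor(max_workers=1),
    }


def dispatch(workers, kind, job, *args):
    """Submit a job to the executor dedicated to this request kind."""
    return workers[kind].submit(job, *args)


def shutdown(workers):
    """Wait for outstanding jobs and stop all executors."""
    for pool in workers.values():
        pool.shutdown(wait=True)
```

Keeping COPY and garbage collection on single threads serialises that work, while PUT and GET requests can proceed in parallel within their own pools.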

Data Flow Summary
- Directory Functions
  - client - server - nameserver
- PrepareToXXX, Copy, putDone
  - client - server - daemon - stager
- Other Data Transfer
  - client - server
- Space Management
  - client - server
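The data flow summary amounts to a small routing rule, sketched below. The grouping of method names is an assumption: the set of directory functions follows the usual SRM v2 naming (srmLs, srmMkdir, srmRmdir, srmRm, srmMv) and is not taken from the slides.

```python
# Assumption: SRM v2 directory functions, named as in the SRM interface.
DIRECTORY_FUNCTIONS = {"srmLs", "srmMkdir", "srmRmdir", "srmRm", "srmMv"}


def components_for(method):
    """Return the chain of components a request passes through."""
    if method in DIRECTORY_FUNCTIONS:
        return ["client", "server", "nameserver"]
    if method.startswith("srmPrepareTo") or method in {"srmCopy", "srmPutDone"}:
        return ["client", "server", "daemon", "stager"]
    # Other data transfer and space management calls stop at the server.
    return ["client", "server"]
```

Only the PrepareToXXX, Copy and putDone calls reach the stager, which matches the design goal of handling the longer, stager-bound requests asynchronously through the daemon.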

Development Issues
- gsoap
  - Steep learning curve
  - Default namespace issues
    - Sometimes generated ns1__, sometimes ns2__
    - We explicitly use srm__
  - API changes between minor releases using the same WSDL
    - Meaning the generated APIs

Development Issues
- Umbrello
  - Not as robust or well documented as similar commercial tools
  - Spent several days recovering from undocumented problems
- ORACLE
  - Need matching versions of client and server libraries (not v9 clients and v10 servers, anyway)

Interoperability Issues
- SRM specs do not state when/where to use status codes
- For a request like srmRm with multiple files
  - If any file succeeds, we return SUCCESS
  - If all files fail, we return FAILURE
  - For each file that is successful, we return DONE
  - For each file that fails, we return FAILURE
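The status convention above can be written down as a small function. This is a sketch of the rule as stated, not the CASTOR SRM implementation; the function name `rm_statuses` and the plain strings used for status codes are assumptions made for illustration.

```python
def rm_statuses(results):
    """Derive request-level and file-level statuses for a multi-file srmRm.

    results maps each file path to True (removed) or False (failed).
    Returns (request_status, {path: file_status}).
    """
    if not results:
        raise ValueError("at least one file is required")
    # Per file: DONE on success, FAILURE otherwise.
    file_status = {
        path: ("DONE" if ok else "FAILURE") for path, ok in results.items()
    }
    # Per request: SUCCESS if any file succeeded, FAILURE if all failed.
    request_status = "SUCCESS" if any(results.values()) else "FAILURE"
    return request_status, file_status
```

Note that under this rule a request-level SUCCESS does not mean every file was removed, so a client still has to inspect the per-file statuses.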

Interoperability Issues
- Explanation in return status
  - CASTOR SRM returns empty string
  - DPM SRM returns NULL
- Type Promotion
  - CASTOR only supports Permanent file types
  - If client requests volatile or durable, SRM returns SUCCESS
  - Returns PERMANENT in file structure
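Both points can be illustrated from either side of the interface. The sketch below is hypothetical: `promote_file_type` mimics the server-side promotion described above, and `explanation_text` shows how a client might treat an empty string and a missing explanation alike. Neither name comes from the SRM code.

```python
FILE_TYPES = {"VOLATILE", "DURABLE", "PERMANENT"}


def promote_file_type(requested):
    """Accept any known file type but always grant PERMANENT.

    Returns (status, granted_type); the request still reports SUCCESS.
    """
    if requested.upper() not in FILE_TYPES:
        raise ValueError("unknown file type: " + requested)
    return "SUCCESS", "PERMANENT"


def explanation_text(explanation):
    """Normalise the explanation field: None (NULL) and '' both become ''."""
    return explanation or ""
```

A client that relies on volatile semantics therefore has to check the granted type in the reply rather than the status code alone.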

Status
- By end of January
  - All methods implemented except Permission functions
  - Full regression test suite available
- Still to do
  - Permission functions
  - VOMS integration
  - Complete memory leak checking
  - Thread tuning / signal handling / documentation

Status
- A few issues with the interface to CASTOR still need investigating
  - Some methods only log the first DLF call
  - Some APIs which accept multiple files only return a single result

CASTOR Specific
- Only permanent files supported
- Space reservation is notional
  - Handled entirely within SRM with no reference to CASTOR
  - CASTOR storage considered semi-infinite
- srmLs limits number of returns
  - Configurable limit
  - Set to 2048 currently
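The srmLs limit can be pictured as a simple truncation step, sketched below. The function name `limit_listing` and the returned "truncated" flag are assumptions for illustration; the slides do not say how, or whether, the server signals that results were cut off.

```python
DEFAULT_LS_LIMIT = 2048  # current setting quoted on the slide


def limit_listing(entries, limit=DEFAULT_LS_LIMIT):
    """Return at most `limit` entries plus a flag telling if any were dropped."""
    if limit < 0:
        raise ValueError("limit must not be negative")
    entries = list(entries)
    return entries[:limit], len(entries) > limit
```

For example, a directory with 3000 entries yields the first 2048 and a flag of `True`, while a directory with 10 entries is returned unchanged with `False`.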

CASTOR Specific
- Suspend/Resume not supported
- Dynamic space compacting not supported
- Pin lifetimes are advisory
  - Used in weighting CASTOR garbage collection policy
- Pins are applied once files are staged
  - putDone issued or file staged

CASTOR Specific
- Non-static TURL
  - Need to call status to get new TURL
- srmRmdir does not support recursion