
Computer Facilities, Production Grids, Networking

Session Summary

Kors Bos, NIKHEF, Amsterdam

CHEP 2007

Initial remarks

• 742 slides in 37 talks in 7 sessions

• Impossible to talk about them all

• Selection is my choice

• Apologies for not giving every talk enough exposure

• All very interesting, have a look at the slides!

I. Grids

• WLCG Service Deployment (Jamie Shiers)
• Security Incident Management (Markus Schulz)
• OSG (Ruth Pordes)
• GridPP (Jeremy Coles)

• NDGF (Michael Grønager)

Michael Grønager

What is the NDGF? (NDGF slide)

• The 7 biggest Nordic compute centers, dTier-1s, form the NDGF Tier-1

• Resources (Storage and Computing) are scattered

• Services can be centralized

• Advantages in redundancy
  – Especially for 24x7 data taking

• We have built a distributed Tier-1:
  – dCache for storage
  – ARC for computing

• Interoperable with:
  – ALICE
  – ATLAS
  – ARC monitoring and accounting
  – LCG monitoring and accounting

• It works!

II. Storage Systems

• PetaCache (Richard Mount)
• Castor2 (Giuseppe Lo Presti)
• gStore (Horst Göringer)
• UltraLight (Paul for Rick for Jan for Harvey)
• dCache vs xrootd (Alessandra Forti)

Price/Performance Evolution: My Experience (Richard Mount, SLAC)

[Chart: price/performance per $M on a log scale, 1983-2007, for farm CPU boxes (KSi2000 per $M, doubling in 1.1 or 1.9 years), RAID disk (GB per $M, doubling in 1.37 or 1.46 years), transatlantic WAN (kB/s per $M/yr, doubling in 8.4, 0.6 or 0.9 years), and disk, flash and DRAM accesses/s and MB/s per $M.]
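A side note on the numbers above (standard arithmetic, not from the talk): if price/performance improves by a factor f per year, the doubling time is

T_2 = \frac{\ln 2}{\ln f}

so, for example, f = 1.6 gives T_2 = 0.693/0.470, i.e. about 1.5 years, in the range of the disk doubling times quoted in the chart.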

[Diagram: xrootd data path. The client application's request passes through the client's OS, TCP stack and NIC, across network switches, to the xrootd data server's NIC, TCP stack, OS and file system, and finally to disk or memory.]

DRAM-based Prototype: ATLAS AOD Analysis

[Histogram: count vs. disk/memory time ratio (0 to 10) for ATLAS AOD analysis on the DRAM-based prototype.]

• Significant, but not revolutionary, benefits for high-load sequential data analysis – as expected.

• Revolutionary benefits expected for pointer-based data analysis – but not yet tested.

• TAG database driven analysis?

III. Tools and Services

• CMS Challenges (Ian Fisk)

• Quattor (Michel Jouvin)

• GGUS (Torsten)

• SAM (Antonio Retico)

Antonio Retico

SAM (simplified) architecture

[Diagram: SAM's submission framework, an Oracle DB holding site information and test results, web-service interfaces, SAM displays, grid operation tools and other tools.]

All four LHC experiments are running (or planning to run) custom tests using the production instance of SAM.

Goal: sanity checks against selected grid and application services.

CMS, ALICE and LHCb are running custom tests in production using two different submission approaches.

ATLAS is running standard tests in production using the ATLAS proxy, and is preparing to submit custom tests.

The production SAM platform is supporting the four VOs; only minor changes were needed to support ALICE.
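To make "sanity checks against selected grid and application services" concrete, here is a minimal sketch of what such a test could look like. It is illustrative only, not SAM code: the test name, endpoint URL and status values are assumptions.

# Hypothetical sketch of a SAM-style sanity check (illustrative, not SAM code).
# It probes a service endpoint and reports a coarse status that a submission
# framework could publish into the central results database.
import time
import urllib.request

def check_service(name: str, url: str, timeout: float = 10.0) -> dict:
    """Probe one service endpoint and return a small test record."""
    start = time.time()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = "OK" if resp.status == 200 else "WARNING"
    except Exception:
        # Timeouts, connection errors and HTTP errors all count as ERROR here.
        status = "ERROR"
    return {
        "test": name,
        "endpoint": url,
        "status": status,
        "elapsed_s": round(time.time() - start, 2),
    }

if __name__ == "__main__":
    # The endpoint is a placeholder; a real check would target an SRM, CE, FTS, ...
    print(check_service("ping-example", "https://example.org/"))

A framework would run many such checks per site and VO and publish the resulting records centrally, which is what the displays and operations tools above then read.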

IV. Production Experiences

• 3D in production on LCG (Dirk Duellmann)
• ATLAS production on EGEE (Xavier Espinal)
• CMS MC production (Jose Calama)
• ATLAS production on OSG (Yuri Smirnov)
• CMS production system (Dave Evans)
• ATLAS T0 software suite (Luc Goosens)

[Map of sites: PIC, IN2P3, FZK, RAL, CNAF, CERN, SARA, ASGC and TRIUMF.]


The ATLAS Production System

Xavier Espinal

• The production system could cope with the production ramp-up challenge (Nov 06-Mar 07)
  – 60M events produced in the first quarter of 2007 (concurrent with the disk shortage)

• Job and WCT efficiencies have been improving almost continuously
  – ~90% efficiency in WCT and ~60% job efficiency

• Monitoring pages are the key to successful job monitoring

• The next phase is to start automation at a higher level.

• The operations group and shift system have proven to be extremely necessary.

ATLAS MC Production on OSG with PanDA

Yuri Smirnov

• OSG production has achieved a robust operations model

• PanDA works well

• Overall, progressing well towards full readiness for LHC data

Monte Carlo production in the WLCG Computing Grid

José M. Hernández

[Diagram: CMS data flow. The online system feeds the Tier-0, which runs first-pass reconstruction and writes RAW, RECO and AOD to tape in O(10) streams and O(50) primary datasets; Tier-1s hold RAW, RECO and AOD on tape and run scheduled data processing (skimming and reprocessing); RECO and AOD flow to the Tier-2s, which do analysis and MC simulation.]

Reached a scale of more than 20K jobs and 65 Mevt/month, with an average job efficiency of about 75% and resource occupation of ~50%.

Production is still manpower-intensive. New components are being integrated to further improve automation, scale and efficient use of available resources while reducing the manpower required to run the system.

3D Service Architecture

Production Experience with Distributed Deployment of Databases for the LHC Computing Grid (Dirk Duellmann)

The LCG 3D project has set up a world-wide distributed database infrastructure for the LHC.

Close collaboration between LHC experiments and LCG sites

With some 100 DB nodes at CERN plus several tens of nodes at Tier-1 sites, this is one of the largest distributed database deployments world-wide.

V. Networking (Matt Crawford's session)

• Use of alternative path WAN circuits

• Lambda Station

• IPv6

• perfSonar

Making the E2E circuit routing work

• Define high impact traffic flows:
  – Minimal-size source/dest. netblock pairs

• US-CMS Tier-1 / CERN T-0 address pairs follow the LHCOPN circuit path (purple)

• Other FNAL-CERN traffic stays on the routed path (blue)

• Establish E2E circuits on the alternate-path border router
  – BGP peer across VLAN-based circuits, advertising only the source netblock

• Policy route internally on source/dest pairs (a sketch of this flow classification follows below)

• Inbound routing depends on the policies of the remote end
  – Prefer comparable PBR for symmetry
  – But implement inbound PBR locally

• End-to-end circuits have proven to be useful at FNAL
  – Especially for LHC/CMS high impact data movement
  – In some cases, useful for other experiments and projects as well
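As a rough illustration of the flow-classification step behind the policy routing above: the router effectively matches source/destination netblock pairs and sends matching traffic onto the circuit. The prefixes and the decision function below are assumptions for the example, not FNAL's actual configuration.

# Illustrative sketch of the source/destination netblock-pair match behind
# policy-based routing. The prefixes are documentation/example ranges,
# NOT the real FNAL or CERN netblocks.
from ipaddress import ip_address, ip_network

# (source netblock, destination netblock) pairs whose traffic should take the
# end-to-end circuit; everything else stays on the default routed path.
CIRCUIT_PAIRS = [
    (ip_network("192.0.2.0/25"), ip_network("198.51.100.0/25")),  # T1 <-> T0 (example)
]

def next_hop(src: str, dst: str) -> str:
    """Return which path a packet with this source/destination should take."""
    s, d = ip_address(src), ip_address(dst)
    for src_net, dst_net in CIRCUIT_PAIRS:
        if s in src_net and d in dst_net:
            return "LHCOPN circuit (alternate-path border router)"
    return "default routed path"

if __name__ == "__main__":
    print(next_hop("192.0.2.10", "198.51.100.20"))   # matches the pair -> circuit
    print(next_hop("192.0.2.200", "198.51.100.20"))  # outside the /25 -> routed path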

IPv6 Promises

• More addresses

• Better security

• Manageable routing tables

• Better QoS

• True mobility

But When?

• IANA’s last block of addresses is estimated to go 19 Mar 2010.

• Regional registries’ last blocks: 10 Oct 2010 – “10/10/10.”

Will IPv4 end then? Of course not.

VI. Operations and Storage

• EELA (Lucas Nellen)

• ATLAS DDM (Alexei Klimentov)

• CMS Comp.Operations (Daniele Bonacorsi)

• Storage @CNAF (Luca dell’Agnello)

• Castor2 Operations (Miguel dos Santos)

• SRMv2.2 in dCache (Timor Perelmutov)

VII. Other things

• SunGrid for STAR (Jérôme Lauret)

• gLExec (Igor Sfiligoi)

• gLite WMS (Simone Campana)

• CE and SE in Xen (Sergey Chechelnitskiy)

• dCache @BNL (Zhenping Liu)


Pilot Security with gLExec - by I. Sfiligoi


Pilots and the Grid (3)

[Diagram: a VO pilot factory feeds a VO queue, from which pilots are sent to the local queues of grid sites 1..n.]

• Gatekeepers know only about the pilots; pilots decide what jobs to run
• Pilots and all users run sharing the same local identity
• Users can steal or corrupt each other's data if running on the same node
• Any user can impersonate the pilot job

[Same diagram, with gLExec on the worker nodes:]

• Sites decide who to let run
• Pilots and users run in different sandboxes
• Like having a gatekeeper on every WN
• VO still in charge of job scheduling

• Pilots are becoming a reality
  – But they are introducing new problems

• Yesterday's Grid security mechanisms are not enough anymore
  – Site admins want fine-grained control
  – Pilot owners want OS-level protection

• gLExec can help solve the technical problems (a sketch of where it sits in the pilot workflow follows below)
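To show where gLExec sits in the pilot workflow, here is a hedged sketch, not actual pilot-factory code. It assumes glexec is on the worker node's PATH and that, as in typical deployments, the user's delegated proxy location is passed via the GLEXEC_CLIENT_CERT environment variable; the proxy path and payload command are placeholders.

# Hedged sketch of how a pilot might hand a user payload to gLExec so that it
# runs under the user's identity instead of the pilot's. Paths, the payload
# command and the environment-variable convention are assumptions for this
# example; consult the gLExec documentation for the exact interface.
import os
import subprocess

def run_payload_via_glexec(user_proxy: str, payload_cmd: list[str]) -> int:
    """Run one user job through the glexec identity-switching wrapper."""
    env = os.environ.copy()
    # Typical deployments pass the user's delegated proxy like this (assumed):
    env["GLEXEC_CLIENT_CERT"] = user_proxy
    # glexec re-maps the identity, then executes the payload command.
    result = subprocess.run(["glexec"] + payload_cmd, env=env)
    return result.returncode

if __name__ == "__main__":
    rc = run_payload_via_glexec(
        "/tmp/x509up_u_payload",              # placeholder proxy path
        ["/bin/sh", "-c", "id && hostname"],  # placeholder payload
    )
    print("payload exit code:", rc)

The point of the wrapper is exactly the one made above: the site, not the pilot, decides which mapped identity the payload gets, so user jobs end up in different sandboxes.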


Simone Campana

gLite WMS usage by CMS and ATLAS:

Performance 2007: CMS - 50K jobs/day; ATLAS - 20K production jobs/day + analysis load
Performance 2008: CMS - 200K jobs/day (120K to EGEE, 80K to OSG), using <10 WMS entry points; ATLAS - 100K jobs/day through the WMS, using <10 WMS entry points
Stability: <1 restart of WMS or LB every month under load

gLite WMS

• 115,000 jobs submitted in 7 days
  – ~16,000 jobs/day, well exceeding the acceptance criteria

• All jobs were processed normally but for 320

• The WMS dispatched jobs to computing elements with no noticeable delay

• Acceptance tests were passed

• CMS supports submission of analysis jobs via WMS

– Using two WMS instances at CERN with the latest certified release

– For CSA07 the goal is to submit at least 50000 jobs/day via WMS

– The Job Robot (a load generator simulating analysis jobs) is successfully submitting more than 20000 jobs/day to two WMS
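For readers unfamiliar with WMS submission, here is a minimal sketch of what a single submission could look like. The JDL content, file names and the glite-wms-job-submit invocation are shown only schematically, as assumptions for the example; check the gLite documentation for the exact options in your release.

# Minimal, illustrative sketch of submitting one job to the gLite WMS.
# The JDL attributes (Executable, Arguments, StdOutput, StdError, OutputSandbox)
# are standard JDL; the file names and command-line details are assumptions.
import subprocess
from pathlib import Path

JDL = """\
Executable    = "/bin/hostname";
Arguments     = "-f";
StdOutput     = "job.out";
StdError      = "job.err";
OutputSandbox = {"job.out", "job.err"};
"""

def submit(jdl_text: str, jdl_path: str = "hello.jdl") -> None:
    """Write a JDL file and submit it to the WMS."""
    Path(jdl_path).write_text(jdl_text)
    # '-a' delegates the user proxy automatically; a load generator such as the
    # Job Robot would loop over calls like this one at a configured rate.
    subprocess.run(["glite-wms-job-submit", "-a", jdl_path], check=True)

if __name__ == "__main__":
    submit(JDL)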

ATLAS M4 Cosmics Run

[Plots: throughput in MB/s from the T0 to all T1's, with the end of the run and the expected maximum rate marked; tracks in the muon chambers and in the TRT; analysis done simultaneously at European and US T1/T2 sites.]

In this last week of August, ATLAS has shown for the first time that it can master the whole data chain: from the measurement of a real cosmic-ray muon in the detector to an almost real-time analysis at sites in Europe and the US, with all the steps in between.

Final Remarks

• (too?) many WLCG papers

• Other grids: SunGrid, EELA

• 3 ATLAS, 3 CMS presentations

• No Alice or LHCb, 1 STAR, no Tevatron

• Much attention for services

• Storage is HOT !

Thank You !