INFN-T1 site report

Andrea Chierici, on behalf of INFN-T1 staff

HEPiX Fall 2013


Outline

• Network
• Farming
• Storage
• Common services


Network


WAN Connectivity

[Diagram: T1 resources connect through a Cisco Nexus and a Cisco 7600 to the LHC OPN and LHC ONE (peer sites: RAL, PIC, TRIUMF, BNL, FNAL, TW-ASGC, NDGF, IN2P3, SARA) and to general IP via GARR Bo1.]

• 20 Gb physical link (2x10 Gb) shared for LHCOPN and LHCONE
• 10 Gb/s for general IP connectivity
• 10 Gb/s CNAF-FNAL link dedicated to CDF (data preservation)
• Planned for Q3-Q4 2013: general IP to 20 Gb/s, LHCOPN/ONE to 40 Gb/s


Farming and Storage current connection model

[Diagram: the Cisco 7600, BD8810 and Nexus 7018 core devices provide the uplinks towards the Internet and the LHCOPN; disk servers connect at 10 Gb/s; farming switches (20 worker nodes each) uplink at 2x10 Gb/s, expandable to 4x10 Gb/s; older 2009-2010 resources are connected at 4x1 Gb/s.]

• Core switches and routers are fully redundant (power, CPU, fabrics)
• Every switch is connected with load sharing on different port modules
• Core switches and routers have a strict maintenance SLA (next solar day)


Farming


Computing resources

• 195K HS06
  – 17K job slots
• 2013 tender installed in summer
  – AMD CPUs, 16 job slots
• Whole farm upgraded to SL6
  – Per-VO and per-node approach
  – Some CEs upgraded and serving only some VOs
• Older Nehalem nodes got a significant boost switching to SL6 (and activating hyper-threading too)


New CPU tender

• 2014 tender delayed until the beginning of 2014
  – Will probably also cover 2015 needs
• Taking into account TCO (energy consumption), not only the sales price
• 10 Gbit WN connectivity
  – 5 MB/s per job (minimum) required
  – A 1 Gbit link is not enough to handle the traffic generated by modern multi-core CPUs (see the back-of-the-envelope estimate below)
• Network bonding is hard to configure
• Blade servers are attractive
  – Cheaper 10 Gbit network infrastructure
  – Cooling optimization
  – OPEX reduction
  – BUT: higher street price
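A quick sanity check of the 10 Gbit requirement, using only the 5 MB/s-per-job figure quoted above; the per-node slot counts are illustrative assumptions, not tender figures:

```python
# Back-of-the-envelope estimate of per-WN bandwidth at 5 MB/s per job.
# Only the 5 MB/s figure comes from the slide; slot counts are assumptions.
PER_JOB_MBIT = 5 * 8  # 5 MB/s per job, expressed in Mb/s

for slots in (16, 32, 64):  # plausible job slots on a modern multi-core WN
    need = slots * PER_JOB_MBIT
    verdict = "exceeds" if need > 1000 else "fits in"
    print(f"{slots:2d} slots -> {need:4d} Mb/s minimum ({verdict} a 1 Gb/s link)")
```

Already around 32 slots per node the minimum requirement passes 1 Gb/s, which is why the tender targets 10 Gbit WN connectivity rather than bonded 1 Gbit links.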


Monitoring & Accounting (1)

• Rewrote our local resource accounting and monitoring portal
• The old system was completely home-made
  – Monitoring and accounting were separate things
  – Adding/removing queues on LSF meant editing lines in the monitoring system code
  – Hard to maintain: >4000 lines of Perl code


Monitoring & Accounting (2)

• New system: monitoring and accounting share the same database
• Scalable and based on open-source software (plus a few Python lines)
• Graphite (http://graphite.readthedocs.org), see the sketch after this list
  – Time-series oriented database
  – Django webapp to plot on-demand graphs
  – lsfmonacct module released on GitHub
• Automatic queue management
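As an illustration of how a sample ends up in Graphite, here is a minimal sketch (not the actual lsfmonacct code) that pushes one value through carbon's plaintext protocol; the host name and metric path are placeholders:

```python
import socket
import time

CARBON_HOST = "graphite.example.org"  # placeholder, not the real CNAF host
CARBON_PORT = 2003                    # default carbon plaintext listener port

def send_metric(path, value, timestamp=None):
    """Send one 'path value timestamp' line to carbon."""
    ts = int(timestamp if timestamp is not None else time.time())
    line = "%s %s %d\n" % (path, value, ts)
    with socket.create_connection((CARBON_HOST, CARBON_PORT), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))

# Hypothetical metric: running jobs in one LSF queue, sampled once.
send_metric("farm.lsf.queues.atlas.running_jobs", 1234)
```

Graphite stores each line as a point in its time-series database, and the Django webapp plots graphs on demand from the same data used for accounting.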


Monitoring & Accounting (3)


Monitoring & Accounting (4)


Issues

• Grid accounting problems starting from April 2013
  – Subtle bugs affecting the log parsing stage on the CEs (DGAS urcollector), causing it to skip data
• WNoDeS issue when upgrading to SL6
  – Code maturity problems: addressed quickly
  – Now ready for production
• BaBar and CDF will be using it rather soon
• Potentially the whole farm can be used with WNoDeS


New activities

• Investigation of Grid Engine as an alternative batch system is ongoing
• Testing Zabbix as a platform for monitoring computing resources
  – Possible alternative to Nagios + Lemon
• Dynamic update of WNs, mainly to deal with kernel/CVMFS/GPFS upgrades
• Evaluating APEL as an alternative to DGAS for the grid accounting system


Storage


Storage Resources

• Disk space: 15.3 PB-N (net) on-line
  – 7 EMC2 CX3-80 + 1 EMC2 CX4-960 (~2 PB) + 100 servers (2x1 Gb/s connections)
  – 7 DDN S2A 9950 + 1 DDN SFA 10K + 1 DDN SFA 12K (~11.3 PB) + ~80 servers (10 Gb/s)
  – Installation of the latest system (DDN SFA 12K, 1.9 PB-N) was completed this summer
• ~1.8 PB-N expansion foreseen before the Christmas break
  – Aggregate bandwidth: 70 GB/s
• Tape library SL8500: ~16 PB on-line with 20 T10KB drives and 13 T10KC drives (3 additional drives were added during summer 2013)
  – 8800 x 1 TB tapes, ~100 MB/s of bandwidth per drive
  – 1200 x 5 TB tapes, ~200 MB/s of bandwidth per drive
  – Drives interconnected to library and servers via a dedicated SAN (TAN); 13 Tivoli Storage Manager HSM nodes access the shared drives
  – 1 Tivoli Storage Manager (TSM) server common to all GEMSS instances
• A tender for an additional 470 x 5 TB tapes is under way
• All storage systems and disk servers are on SAN (4 Gb/s or 8 Gb/s)


Storage Configuration

• All disk space is partitioned into ~10 GPFS clusters served by ~170 servers
  – One cluster per main (LHC) experiment
  – GPFS deployed on the SAN implements a full HA system
  – System scalable to tens of PB and able to serve thousands of concurrent processes with an aggregate bandwidth of tens of GB/s
• GPFS coupled with TSM offers a complete HSM solution: GEMSS
• Access to storage granted through standard interfaces (POSIX, SRM, Xrootd and soon WebDAV); see the sketch after this list
  – File systems directly mounted on the WNs
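A small illustration of the POSIX access path: because the GPFS file systems are mounted directly on the WNs, a job can read data with ordinary file I/O. The mount point and file pattern below are placeholders, not the real CNAF namespace:

```python
from pathlib import Path

# Hypothetical GPFS mount point as seen from a worker node.
DATA_DIR = Path("/storage/gpfs_example/atlas/data")

# List a few files and read one of them with plain POSIX I/O.
files = sorted(DATA_DIR.glob("*.dat"))
for f in files[:5]:
    print(f"{f.name}: {f.stat().st_size / 1e9:.2f} GB")

if files:
    with open(files[0], "rb") as fh:  # ordinary open(): no special client needed
        header = fh.read(64)
    print(f"first bytes of {files[0].name}: {header[:16].hex()}")
```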


Storage research activities

• Studies on more flexible and user-friendly methods for accessing storage over the WAN
  – Storage federation implementation
  – Cloud-like approach
• We developed an integration between the GEMSS storage system and Xrootd to match the requirements of CMS and ALICE, using ad-hoc Xrootd modifications
  – The CMS modification was validated by the official Xrootd integration build
  – This integration is currently in production
• Another approach to storage federations, based on HTTP/WebDAV (ATLAS use case), is under investigation


LTDP

• Long Term Data Preservation (LTDP) for the CDF experiment
  – The FNAL-CNAF data copy mechanism is completed
• Copy of the data will follow this timetable:
  – end 2013 - early 2014 → all data and MC user-level n-tuples (2.1 PB)
  – mid 2014 → all raw data (1.9 PB) + databases
• Bandwidth of 10 Gb/s reserved on the transatlantic link CNAF ↔ FNAL
• The "code preservation" issue still has to be addressed


Common services


Installation and configuration tools

• Quattor is currently the tool used at INFN-T1
• Investigation done on an alternative installation and management tool (study carried out by the storage group)
• Integration between two tools:
  – Cobbler, for the installation phase
  – Puppet, for server provisioning and management operations
• Results of the investigation show Cobbler + Puppet to be a viable and valid alternative
  – Currently used within the CNAF OpenLAB


Grid Middleware status

• EMI-3 update status
  – Argus, BDII, CREAM CE, UI, WN, StoRM
  – Some UIs still at SL5 (will be upgraded soon)

• EMI-1 phasing-out (only FTS remains)

• VOBOX updated to WLCG release
