Evolution of the Atlas data and computing model for a Tier-2 in the EGI infrastructure
description
Transcript of Evolution of the Atlas data and computing model for a Tier-2 in the EGI infrastructure
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Evolution of the Atlas data and computing model for a Tier-2 in
the EGI infrastructureÁlvaro Fernández Casaní
on behalf of the IFIC Atlas computing group
II PCI2010 WorkshopValencia, 10th-12th January 2012
1
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Outline• Introduction: IFIC and Atlas Tier2 SPAIN• Evolution of the Atlas computing model
– Flattening the model to a Mesh– Tier 2 duties– Data distribution: Tier2s policy
• Data shares ( site classification) • Availability and Connectivity• Networking
– Dynamic data distribution and caching– Remote data access
• Real Users example 2
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
ATLAS CENTERS
3
http://dashb-earth.cern.ch/dashboard/dashb-earth-atlas.kmz
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012 4
Introduction
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012 5
SPAIN CONTRIBUTION TO ATLAS
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Spanish ATLAS Tier2
6
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
IFIC Computing Infrastructure Resources
7
EGI
CSIC resources (not experiment resources):25% IFIC users25% CSIC25% European Grid25% Iberian Grid
To migrate Scientific application to the GRID
ATLAS
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Last year summary
8
• 3,579,606 Jobs• 6,038,754 CPU
consumption hours• 13,776,655 KSi2K
(CPU time normalised)
• Supporting 22 Virtual Organizations (VOs)
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Prevision 2012For IFIC
9
• For 2012:– We already fulfil the CPU requirements– Increasing Disk:
• 230 TB => 4 SuperMicro x 57.6 (2 TB disks)
YEAR 2010 2011 2012 2013CPU(HS06) 6000 6950 6650 7223
DISK(TB) 500 940 1175 1325
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
EVOLUTION OF THE ATLAS COMPUTING MODEL
10
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Previous ATLAS cloud model
• Hierarchical Model based on the Monarc network topology
• CLOUDS OF Tier-1 with its geographical related Tier-2
• Possible communications:• T0-T1• T1-T1• Intra-cloud T1-T2
• Forbidden communications:• Inter-cloud T1-T2• Inter-cloud T2-T2
Simone Campana Software & Computing Workshop (April’11)https://indico.cern.ch/conferenceDisplay.py?confId=119169
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Shortcomings of Cloud Boundaries• Consolidation of User Analysis outputs problematic
– Analysis runs over many cloud– Consolidation needs to “hop” the data through the T1s
• MonteCarlo production must confine one task to one cloud– To facilitate the output aggregation at T1
• Replication of datasets (PD2P) more inflexible– Need to replicate from T1 to T2s of the same cloud– Or “hop” through a T1
• T2s can not really be used as storage of “primary” data– Issues in creating secondary copies at other T1s
• Tier-2 limited usability due to T1 downtimes– Dependency on LFC data catalog
More info: Simone Campana Software & Computing Workshop (April’11)https://indico.cern.ch/conferenceDisplay.py?confId=119169
DATA FLOWS
TOO STRICT
OPERATIONAL PROBLEMS
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Solving issues
13
LFC Spanish Cloud planned
this week !
• Make Tier-2 activities more independent• Reduce service dependencies:
– Move Catalog (LFC) to CERN:• Backup at US . Another at UK • Remove from tier-1• Analysis jobs will be able to run during T1 downtime• Production jobs can keep running during T1 downtime
• Benefit from technology improvements:– The network model today does not really resemble the Monarc
model• Many T2s are connected very well with many T1s• Many T2s are not that well connected with their T1
– So it makes sense to break cloud boundaries.
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Flattening the model
• to break cloud boundaries.– Let DDM freely transfer from every site to every site– Inter-cloud direct transfers– Multi-cloud production
• Not quite there– Some links simply have limited bandwidth– In those cases, several hops will anyway be needed
• Defining T2Ds is an attempt to break cloud boundaries “for the cases where it makes sense”
Network is a key component to optimize the use of the storage and CPU
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Categorization of Sites: T2D
• T2Ds are the Tier-2 sites that can directly transfer any size of files (especially large files) not only from/to the T1 of the cloud (to which they are associated to) but also from/to the other T1s of the other clouds.
• T2Ds are candidates for – multi-cloud production sites – primary replica repository sites (T2PRR)
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
ATLAS Tier2 Share Revision• In 2010, many Tier2s got full; T2 data distribution share revision• The data distribution to Tier-2 (and Tier-3g) sites should take into account • the network connectivity of the sites (thus, T2Ds should have more data) • the availability for analysis. • Introduced at Software & Computing Workshop 20 July 2011
– Up to ADC Ops to define the distribution policy among T2s: Preference to • Reliable sites for analysis (fraction of time with analysis queue online) • Well connected sites (T2Ds) to transfer datasets quickly
– Started from summer 2011 • Based on T2D list defined on 1st July 2011
– Simplified with 4 groups treated equally : • Alpha (60% share-17 sites) : T2Ds with > 90 % reliability (60%) • Bravo (30 % share-21 sites) : non-T2Ds with > 90% reliability • Charlie (10 % share-12 sites) : Any T2 with 80%<reliability<90% • Delta (0% share-13 sites): Any T2 with reliability < 80 %
– one single list both for pre-placement and dynamic data
16
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
ATLAS Tier2
•
17
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Spanish Cloud Data shares Last Month
http://atladcops.cern.ch:8000/drmon/crmon_TiersInfo.html
18
IFIC
IFIC is alpha T2D site -> also candidates for more
datasets in PD2P ( see later)
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Availability
19
http://dashb-atlas-ssb.cern.ch/dashboard/request.py/siteviewhistorywithstatistics?columnid=564&view=shifter%20view
LAST MONTH
Hammercloud: for ATLAS• Distributed Analysis testing system• For avoiding jobs go to problematic sites• Can exclude sites if test jobs are not passed
EGI has own different availability tools:• Based on ops VO• Can have conflicting results:
Last month IFIC was ATLAS ALPHA SITE ( >90 % availability)But had just 64% ops egi availability due to a config issue
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Connectivity• Inter-cloud direct transfers
20
Transfer monitoring
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Services
• User Interfaces:– UI00: (LVSDR)
• UI04: SL 5 64 bits• UI05: SL 4 64 bits• UI06: SL 5 64 bits
• Computing Elements (cream-ce) and Worker nodes :– WN (CE02): SL 5 64 bits Glite 3.2 with MPI
and shared home in Lustre– WN (CE03): SL 5 64 bits Glite 3.2 (puppet) – WN (CE05): SL 5 64 bits Glite 3.2 (puppet)
21
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Job Distribution at Tier-2• Tier 2 usage in data processing
– More job slots used at Tier2s than at Tier1• Large part of MC production and analysis done at Tier2s
• More ‘digi+reco’ jobs and “group production” at Tier-2s– less weight at Tier-1s
• Production shares to be implemented to limit “group production” jobs at Tier-1, and run at Tier-2s
• Analysis share reduced at Tier-1s
22
PIC TIER-1 – 2012
More Analisys jobs at IFIC TIER 2
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Datadisk in tier2
• Tier 2 usage disk space– T2_datadisk ≈ T1_datadisk in volume– T2_datadisk ≈ input for data processing (secondary
replicas)
23
http://bourricot.cern.ch/dq2/accounting/atlas_stats/0/
T2_datadisk
T1_datadisk
20PB
Total Capacity (Usable/Margin)
primary
February ‘12
secondary
Used (according SRM)
Used (according dq2)
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
For Spanish Sites
24
http://bourricot.cern.ch/dq2/accounting/t2_spacetoken_view/SPAINSITES/datadisk/30/
February ‘12
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
IFIC Storage resources
• Based in SUN:– X4500 + X4540– Lustre v1.8
– New Disk servers:• SuperMicro, SAS disks, capacity 2 TB
per disk and 10 Gbe connectivity
– Srmv2 (STORM)– 3 x gridftp servers
25
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Storage resources
26
Pledges
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Storage resources
27
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
IFIC Lustre Filesystems and Pools
• With the Lustre release 1.8.1, we have added pool capabilities to the installation.• Allows us to partition the HW inside a given Filesystem
– Better data management– Assign determined OSTs to a application/group of users– Can separate heterogeneous disks in the future
CVMFS FOR ATLAS
4 Filesystems with various pools:/lustre/ific.uv.es Read Only on WNs and UI. RW on GridFTP + SRM
/lustre/ific.uv.es/sw. Software: ReadWrite on WNs, UI (ATLAS USES CVMFS)
/lustre/ific.uv.es/grid/atlas/t3 Space for T3 users: ReadWrite on WNs and UI
xxx.ific.uv.es@tcp:/homefs on /rhome type lustre. Shared Home for users and mpi applications: ReadWrite on WNs and UI
• Different T2 / T3 ATLAS pools , and separated
from other Vos• Better Management
and Performance
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
CERN CVMFS • CVMFS: A caching, http based read-only
filesystem optimised for delivering experiment software to (virtual) machines.
29
http://atlas-install.roma1.infn.it/atlas_install/usage_plots.php
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
CVMFS at IFIC (+SQUID)
30
• Installed in all our WNs and UIs since September 2011• Easy installation (only 2 configuration files)• 20 GB/repository• There is not a dedicated partition
• Using the same SQUID as frontier (sq5.ific.uv.es)• the squid server is pointing to the public replicas
(CERN, BNL, RAL) • The performance until now is so good.• We are monitoring SQUID via CACTI (SNMP)
Sept’2011Sept’2011
• Reduced job setup time• Better performance for
analysis jobs• Reduced load on Lustre
MDS
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Storm SRM server• Access like a local file system, so it can create and control all the data available
in disk with a SRM interface.• Coordinate data transfers, real data streams are transferred with a
gridFTPserver in another physical machine.• Enforce authorization policies defined by the site and the VO.• Developed Authorization plugin to respect local file system with the
corresponding user mappings and ACL's
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
IFIC Network
Cisco 4500 – core centre infrastructure.
Cisco 6500 – scientific computing infrastructure
Data servers:• Sun with 1GB
connection. Channel bonding tests were made aggregation 2 channels
• SuperMicro with 10 Gbe WNs and GridFTP servers
with 1GB10 Gbe
• Data Network based on gigabyte ethernet. • 10GBe uplink to the backbone network
reach 1Gbit per data server
Recent Upgrade FTP servers to satisfy
requirements for alpha sites
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Data Distribution: dynamic data distribution and caching
33
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012 34
Atlas presentation – Michael Ernst – BNL – LHCONE 12 May 2011, Washington
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
PandaDynamicDataPlacement (PD2P)
• resources utilization policy evolved during First year of data taking:– “jobs go to data” “data and jobs move to the available CPU
resources”. • Also a dynamic data placement approach has been employed• Tier-1 algorithm
– Primary copy at tier-1 basend on Planned data placement– Secondary copy when data popular, with location based on
pledges• Tier-2 algoritm:
– Jobs submitted to Panda trigger PD2P– Replicates popular datasets to a Tier2 with highest weight
• More details:https://twiki.cern.ch/twiki/bin/viewauth/Atlas/PandaDynamicDataPlacement
35
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
A further step: caching
• PD2P makes data movement more dynamic and user-driven, but replication of dataset may suffer from other problems for the users:– Latency to finally access the data– Complexity of the tools
• Still works on the idea of datasets replication• Explore the possibility to access to file or subfile
dynamically without explicit replication– Xrootd allows federation of resources and to redirect the client
to the data source.– Shorten latencies by caching in the xrootd server– To work correctly, there has to be efficient event data I/O with
minimal transactions between application and storage36
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
PD2P in Es-Cloud last month
37
http://panda.cern.ch/server/pandamon/query?mode=pd2p&action=plots
• Es-Cloud Tier 2 sites getting more datasets
• Bit more that Tier-1 PicIFIC 49
ES 166
• EsCloud T2Ds = IFIC and IFAE
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
User Example
38
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Example; Grid & Physics Analysis
39
• Distributed Computing and Data Management Tools based on GRID technologies have been used by the IFIC physicists to obtain their results
• As an example, the Boosted Top candidate presented by M. Villaplana
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Distributed Analysis in ATLAS
• ATLAS has a specifically system for Production and Distributed Analysis (PANDA):– Including all ATLAS
requirements– Highly automated– Low manpower– Unifies the different grid
environments (EGI-Glite, OSG and EGI-ARC)
– Monitoring web pages• Reference:
http://panda.cern.ch
40
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Distributed Analysis in ATLAS
• For ATLAS users, GRID tools have been developed:– For Data management
• Don Quijote 2 (DQ2)– Data info: name, files, sites, number,…– Download and register files on GRID,..
• ATLAS Metadata Interface (AMI)– Data info: events number, availability– For simulation: generation parameter, …
• Data Transfer Request (DaTri)– Users make request a set of data (datasets) to create replicas in other
sites (under restrictions)– For Grid jobs
• PanDa Client– Tools from PanDa team for sending jobs in a easy way for user
• Ganga (Gaudi/Athena and Grid alliance)– A job management tool for local, batch system and the grid
41
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Tier2 and Tier3 examples from Spain
• At IFIC the Tier3 resources are being split into two parts:– Resources coupled to IFIC Tier2
• Grid environment• Use by IFIC-ATLAS users• Resources are idle, used by the
ATLAS community
– A computer farm to perform interactive analysis (proof)
• outside the grid framework
• Reference: – ATL-SOFT-PROC-2011-018
42
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Daily user activity in Distributed Analysis
• An example of Distributed Analysis in heavy exotic particles– Input files
– Work flow:
43
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Daily user activity in Distributed Analysis• 1) A python script is created where requirements are defined
– Application address,– Input, Output– A replica request to IFIC– Splitting
• 2) Script executed with Ganga/Panda– Grid job is sent
• 3) Job finished successfully, output files are copied in the IFIC Tier3– Easy access for the user
44
Just in two weeks, 6 users for this analysis sent:• 35728 jobs, • 64 sites,• 1032 jobs ran in T2-ES
(2.89%),• Input: 815 datasets• Output: 1270 datasets
Just in two weeks, 6 users for this analysis sent:• 35728 jobs, • 64 sites,• 1032 jobs ran in T2-ES
(2.89%),• Input: 815 datasets• Output: 1270 datasets
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
User experience
• PD2P is transparent to users, but eventually they learn that it is in action:– “a job was finally sent to a destination that did not
originally had the requested dataset” – “I check later PD2P has copied my original dataset” – “ I just realized because I used dq2 previously to check
where the dataset was”
• Another question is that user datasets are not replicated:– “We see a failure because it is not replicating our
homemade D3PDs”
45
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Summary
46
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
BACKUP SLIDES
47
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
A TIER 2 in ATLAS
48
Main activities
Main activities
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Spanish Tier2 Number(more info: Pepe’s slides)
• The ATLAS Spanish Tier2 (T2-ES) consists in a federation of 3 Spanish Institutions (see Jose’s talk):– IFAE-Barcelona (25%)– UAM- Madrid (25%)– IFIC-Valencia (50%, coordinator)
• The T2-ES represents 5% of the ATLAS resources (between 60-70 T2s):
• References: – J. Phys. Conf. Ser. 219 072046
49
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Summary
50
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Summary
51
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Migration of Scientific Applications to Grid at IFC
52
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Migration of Scientific Applications to Grid at IFC
53
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
LHCONE for ATLAS
54
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
NETWORK• T2 network
55
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
For Spanish Sites
56
IFAEIFAE
IFICIFIC
UAMUAM
TIER2s DATADISK-SPAIN SITESTIER2s DATADISK-SPAIN SITES
http://bourricot.cern.ch/dq2/media/T2s/T2s-DATADISK-SPAINSITES.html
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Availability• ES-Tier2 availability for ATLAS
57
http://hammercloud.cern.ch/atlas/autoexclusion/?cloud=20&site=
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Availability• Hammercloud:
– Distributed Analysis testing system– For avoiding jobs go to problematic sites– Can excluded sites if test jobs are not passed– Reference:
• https://twiki.cern.ch/twiki/bin/view/IT/HammerCloud• ATLAS grid tools are improving day to day
– For instance: automatic jobs for merging output files– https://twiki.cern.ch/twiki/bin/viewauth/ATLAS/AnalysisJobOutput
Merging• ATLAS users can ask to Distributed Analysis Support Team
(DAST, [email protected]):– Problems with her/his jobs– Useful for developers
• Improve the tools and services
58
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Distributed Analysis in ATLAS
59
References: http://twiki.cern.ch/twiki/bin/viewauth/Atlas/AtlasComputing
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Analysis Efficiency in SeptemberATLAS Tier0 + Tier1s
ANALY*_queues
60
II IFICÁlvaro Fernández Casaní ISGC2012, Taipei, 2th March 2012
Analysis Efficiency in SeptemberATLAS Tier2s (ANALY*_queues)
61