KOLKATA Tier-2 @ ALICE Grid
ALICE Grid & Kolkata Tier-2
Site Name: IN-DAE-VECC-01 & IN-DAE-VECC-02
VO: ALICE
City: KOLKATA, Country: INDIA
Vikas Singhal, VECC, Kolkata
Events at LHC
Luminosity: 10³⁴ cm⁻² s⁻¹
Bunch crossings at 40 MHz, i.e. every 25 ns
~20 overlapping events per crossing
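The rate and spacing quoted above are two ways of stating the same number; a quick Python check:

```python
# Check that a 40 MHz bunch-crossing rate corresponds to one
# crossing every 25 ns, as quoted on the slide.
rate_hz = 40e6                 # 40 MHz
spacing_s = 1.0 / rate_hz      # time between crossings
print(spacing_s * 1e9)         # -> 25.0 (nanoseconds)
```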
The Grid Computing Model
(Diagram) Tier 0: the centre at CERN, fed by the experiments (CMS, ATLAS, LHCb, ...). Tier 1: regional centres in Germany, USA, UK, France, Italy, Scandinavia and Japan, plus the CERN Tier 1. Tier 2: labs and universities (Lab a/b/c/m, Uni a/b/n/x/y, ...). Tier 3: physics-department clusters and desktops.
ALICE computing model
(Diagram) The online system and online farm deliver data to Tier 0 at the CERN computer centre at ~40 Gb/s. Tier 1 regional centres (France, Italy, Germany, APROC Taiwan) connect at 10 Gb/s. Tier 2 centres, including Kolkata Tier 2, connect at 1-10 Gb/s. Tier 3: institute physics data caches at 155/622 Mb/s. Tier 4: desktops at 100-1000 Mb/s.
RAW data delivered by the DAQ undergo calibration and reconstruction, which produce three kinds of objects for each event: 1. ESD objects, 2. AOD objects, 3. Tag objects. This first pass is done at the Tier-0 site.
Further reconstruction and calibration of RAW data will be done at Tier 1 and Tier 2.
DPD (Derived Physics Data) objects will be processed at Tier 3 and Tier 4.
The generation, reconstruction, storage and distribution of Monte Carlo simulated data will be the main task of Tier 1 and Tier 2.
ALICE Setup
Detectors: HMPID, Muon Arm, TRD, PHOS, PMD, ITS, TOF, TPC
Indian contribution to ALICE: PMD, Muon Arm
Size: 16 x 26 metres
Weight: 10,000 tons
Total weight: 10,000 t
Overall diameter: 16.00 m
Overall length: 25 m
Magnetic field: 0.4 T
The ALICE collaboration & detector
~1100 people from 80 institutes in 30 countries
(~1/2 the size of ATLAS or CMS, ~2x LHCb)
Data volumes
• RAW data: 2.5 PB/year
• Two distinct periods: p+p (~7.5 months) and Pb+Pb (~40 days)
• Reconstructed and simulated data:
• 1.5 PB: first-level RAW filtering (ESDs)
• 200 TB: second-level RAW filtering (AODs)
• 1 PB of simulated data
• User-generated data: ~500 TB
• Total: ~5 PB of data per year (without replicas)
• Replication: 2x RAW, 3x ESD/AODs, 2x user files
Taken from L. Betev's slides at the T1-T2 meeting in Karlsruhe, Jan 2012
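The yearly total can be cross-checked against the per-category numbers above. This is only a back-of-the-envelope script; the replication factor for simulated data is not given on the slide, so it is left at 1 here.

```python
# Yearly data volumes from the slide, in PB.
volumes = {"RAW": 2.5, "ESD": 1.5, "AOD": 0.2, "simulated": 1.0, "user": 0.5}
# Replication factors from the slide; 1 assumed for simulated data.
replication = {"RAW": 2, "ESD": 3, "AOD": 3, "simulated": 1, "user": 2}

total = sum(volumes.values())
print(round(total, 1))          # -> 5.7 PB/year before replication (~5 PB quoted)

with_replicas = sum(volumes[k] * replication[k] for k in volumes)
print(round(with_replicas, 1))  # -> 12.1 PB/year including replicas
```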
Processing
• RAW data reconstruction: ~10K CPU cores
• MC processing: ~15K CPU cores
• User analysis: ~7K CPU cores (450 distinct users)
• ~40 million jobs per year
• ~1.3 jobs completed every second
• ½ production, ½ user jobs
• 200 million files per year
Taken from L. Betev's slides at the T1-T2 meeting at KIT, Karlsruhe, Jan 2012
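The job rate follows directly from the yearly count; a one-line sanity check in Python:

```python
# "~40 million jobs per year" implies roughly 1.3 job
# completions per second, as the slide states.
jobs_per_year = 40e6
seconds_per_year = 365 * 24 * 3600
rate = jobs_per_year / seconds_per_year
print(round(rate, 2))           # -> 1.27 jobs per second
```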
KOLKATA TIER-2 @ ALICE
ALICE Sites on MonALISA
72 active computing sites across Europe, Asia, North America, South America and Africa
Why Tier 2?
1. Tier-2 is the lowest level accessible to the entire collaboration.
2. Each ALICE sub-detector has to be associated with at least a Tier-2, because of the large volume of calibration and simulated data.
3. PMD is one of the important sub-detectors of ALICE.
4. We are solely responsible for PMD, from conception to commissioning.
Grid Site as per WLCG & Experiment Requirements
• SE (Pure XrootD)
• WNs (more and more WNs)
• Disks (more and more disks)
• Central services: WMS, MyProxy, VOMS, ...
• CREAM-CE
• Site BDII
• LCG-UI
KOLKATA or a General Site
(Diagram)
• Servers: Site BDII, NFS server, PBS server, DNS server, UI server, HA server, monitoring server, installation/DHCP server, etc.
• Compute: blade 64-bit servers with blade enclosures; 32- or 64-bit servers; 1U & 2U servers; a few tower servers (HP, DELL, IBM, etc.)
• Storage: disk arrays (more and more arrays): a new SAN box, an old NAS, an older NAS, an even older DAS; DPM / Pure XrootD with an XrootD redirector and XrootD disk servers
• Central services: WMS, MyProxy; plus the VO-BOX and CREAM-CE
• Network: local and global network / fibre line from the network
• Tier-3 management server and cluster
• Infrastructure: cooling, UPS, fire alarm, access control, etc.
Frontend components of the Site & Installation
• LCG-CE
• SE
• CREAM-CE
• Site BDII
• LCG-UI
• VO-BOX
• Pure XrootD
Grid middleware meta-packages are installed through YUM and configured through YAIM.
The middleware changes from time to time, e.g. gLite to EMI (follow the manual).
During the Kolkata site installation and configuration we ran into RPM dependency problems with Java, security packages, etc.
The community and mailing lists helped a lot; for most problems we found the solution on a mailing list.
Thanks to APROC, Taiwan for helping at each stage.
Middleware installed on the IN-DAE-VECC-02 Site
1. Installed the SLC 5.8 (x86_64) operating system on x86_64 machines.
2. Upgraded the middleware packages below to EMI middleware:
• glite-VOBOX: grid01.tier2-kol.res.in
• CREAM-CE (64-bit): gridce02.tier2-kol.res.in
• glite-BDII
• Pure XROOTD redirector as Storage Element: dcache-server.tier2-kol.res.in
• glite-WN (64-bit) for 79 worker nodes (476 cores): wn045-wn123.internal.tier2-kol.res.in
Backend components of the Site
Router & Switch: two networks, one public and one private.
Domain Name Server: the DNS server is a critical component; we have two redundant name servers, naamak and suchak, for high availability.
Time Server: configured with the NTP protocol.
Installer: network installation and automated configuration with Quattor-like tools.
Storage Server: NFS-mounted common shared space.
PBS Server: CE & PBS batch scheduler on one server; configured the firewall (through iptables) and NATing on it.
Tier-3 Cluster: a separate cluster for local users with interactive and non-interactive nodes.
Monitoring Server: configured MRTG (network traffic monitoring) and a cluster monitoring tool.
Preventive maintenance is carried out once a year.
Kolkata TIER-2 centre logical diagram
(Diagram)
• Internet: 300 Mbps over the SINP 1 Gbps fibre backbone; public addresses 144.16.112.xx/27, with 192.168.x.x as standby.
• IN-DAE-VECC-02 site (64-bit machines): router, switches, gridce02, backup server (130 TB backup), installer, naamak, suchak, grid01 and dcache-server; computing nodes wn045-wn123 (Switch-1/Switch-2) on DELL and HP blade servers with multi-core 3.0 GHz Xeons; 4 XrootD disk servers consisting of 230 TB of IBM and HP SAN systems.
• GRID-PEER Tier-3 cluster (32- & 64-bit machines): grid, Grid-peer and gridse001; computing nodes wn001-wn025 (Switch-1/Switch-2); 25 nodes of Dell and Wipro blades with 25 TB, used as Tier-3.
ALICE Tier-2 Grid: started in 2002
Connected to CERN over 512 Kbps of Ethernet bandwidth.
Operating System
› Scientific Linux 3.05
Middleware
› AliEn (Alice Environment) with PBS as batch system
Hardware (CPU, Disk)
› 1x dual-Xeon, 4 GB compute node
› 2x dual-Xeon, 2 GB WNs
› 2x 80 GB disk space
Bandwidth
› 512 Kbps shared
(S. K. Pal & T. Samanta)
From 2 Cores to 700 Cores
Started with:
---- 2 desktop machines (2002)
---- 2 tower-like servers (2003)
---- 9 HP 1U servers (2004)
---- 17 Wipro single-core 1U servers (2006)
---- 40 HP dual-core blades (2008)
---- 8 HP quad-core blades (2009)
---- 32 Dell dual-processor dual-core blades (2011)
---- GPU server with a Tesla C2070 (448 cores) (2012)
Kolkata Tier-2 on MonALISA (plots from 2007, 2009, 2010 and 2011)
From 512 MB of Disk to 300 TB of Disk
Started with:
---- 512 MB in a desktop machine (2002)
---- 40 GB in tower-like servers as DAS (2003)
---- 400 GB in HP MSA 500 (2004)
---- 2 TB Wipro NAS (2006)
---- 108 TB HP EVA SAN (2008)
---- 25 TB iSCSI (2009)
---- 200 TB IBM DS 5100 (2011)
---- 2 TB hard disk in the GPU server (2012)
(Photos of the centre: 2006, 2008, 2010, 2012)
From 128 Kbps to 1 Gbps Network
Started with:
---- 128 Kbps shared link (2002)
---- 512 Kbps (2003)
---- 2 Mbps dedicated link (2004)
---- 4 Mbps from Bharti (2006)
---- 30 Mbps from Reliance (2008)
---- 100 Mbps from VSNL (ERNET) (2009)
---- 300 Mbps from NKN (2011)
---- Upgrading to 1 Gbps (2012)
Efficient Cooling: Concept and Implementation
Hot and cold air are separated.
For air separation, a cold-aisle containment is created.
The cold-aisle containment is a least-accessible area: cool only the hardware racks, not humans, walls, etc.
Human intervention in the cold-aisle containment is restricted.
All management and monitoring of the servers and storage is done from outside the cold-aisle containment.
All power and Ethernet cables also run outside the cold-aisle containment.
The temperature gradient between the cold and hot aisles is 5 °C.
Kolkata Tier-2 After renovation
Major Achievements
Consistently more than 400 ALICE jobs have been running since commissioning of the efficient cooling solution.
Achieved pledged resources
Kolkata Tier-2 provided a total of 6.0K HEP-SPEC06 of CPU and 230 TB of disk storage.
1M ALICE jobs completed during the last year
Performance: ~1M jobs successfully completed during the last one year.
(Plot: jobs completed vs. time)
Total Kolkata Tier-2 Resources
Computing resources: 476 cores in total
DELL blades: 32 * 8 = 256
HP quad-core blades: 8 * 8 = 64
HP dual-core blades: 39 * 4 = 156
Storage: 230 TB under one HP 2U management server
74 TB: HP EVA 6100 under 2 * 2U HP disk servers
156 TB: IBM DS 5100 under 2 * 1U IBM disk servers
Network: 300 Mbps; to be increased up to 1 Gbps during this year.
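The per-vendor core counts and per-array capacities add up to the quoted totals; a quick check:

```python
# Cross-check of the core and storage totals on this slide.
cores = {
    "DELL blades": 32 * 8,          # 256
    "HP quad-core blades": 8 * 8,   # 64
    "HP dual-core blades": 39 * 4,  # 156
}
assert sum(cores.values()) == 476       # matches "476 cores in total"

storage_tb = {"HP EVA 6100": 74, "IBM DS 5100": 156}
assert sum(storage_tb.values()) == 230  # matches "230 TB"
print(sum(cores.values()), sum(storage_tb.values()))   # -> 476 230
```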
After joining the NKN network, the link speed increased to 300 Mbps
Grid-Peer Tier-3 Cluster
1U sliding LCD monitor with a 16-port KVM.
Dell(TM) PowerEdge(TM) M1000e blade server chassis.
16 Dell(TM) PowerEdge(TM) M610 high-performance Intel blades.
Each blade has two Nehalem-based Intel quad-core Xeon E5530 2.4 GHz CPUs with 8 MB cache.
Each blade has 16 GB RAM.
Each blade has 2 * 146 GB disks mounted as RAID1.
Installed SLC 5.6 x86_64 OS (kernel version 2.6.18-164.6.1.el5).
Dell(TM) iSCSI EqualLogic storage: 16 * 2 TB SAS hard disks; 24.88 TB usable space after RAID5 and hot spare.
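As a rough sanity check on the EqualLogic figure, here is the usable-capacity arithmetic, assuming one hot spare and a single RAID5 group (one disk of parity), neither of which is stated explicitly on the slide:

```python
# Rough usable-capacity estimate for a 16 x 2 TB RAID5 box,
# assuming one hot spare and one disk's worth of parity.
# Real arrays report somewhat less (the slide quotes 24.88 TB)
# due to binary units and formatting overhead.
total_disks = 16
hot_spares = 1
parity_disks = 1                   # RAID5: one disk of parity per group
data_disks = total_disks - hot_spares - parity_disks
raw_usable_tb = data_disks * 2     # decimal TB
print(data_disks, raw_usable_tb)   # -> 14 28
```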
Grid-Peer Tier-3 Cluster (cont.)
25 nodes in total for VECC users and PMD collaborators:
12 32-bit nodes
13 64-bit computing nodes
The 32-bit nodes are on the oldest hardware, procured in 2004 (we will slowly deprecate them because of high noise, power draw and heat generation).
25 TB of total storage.
50+ active users across India; 30+ active users in VECC.
Quotas implemented.
ROOT, Geant3, AliRoot, AliEn, Fortran and other user-specific software installed per hardware type (32-bit and 64-bit).
Extensively used by the users; needs to be extended.
By-products of the WLCG Grid
• Intra-DAE Grid
• EU-India Grid
• Health Grid
• IGCA
• GARUDA Grid
Thank You
Supporting Slides
Main data types in ALICE
• ESD: run/event numbers, trigger word, primary vertex, arrays of tracks/vertices, detector info
• AOD standard: cleaned-up ESDs, reducing the size by a factor of 5; can be extended on user demand with extra information
• ESD and AOD inherit from the same base class (keeping the same event interface)
(Data-flow diagram) RAW data and Monte Carlo input, together with conditions, calibration and alignment data from the OCDB (updated by pass0-passN), feed the AliRoot reconstruction. Event Summary Data are produced in Pass 1 at T0 and in Pass 2 to Pass N at T1, and registered in the AliEn file catalogue (FC). ESD filtering produces the standard AODs, optionally extended with extra information, which feed the analyses.
(Job-flow diagram) A user submits a job to the ALICE central services, which keep it in the ALICE job catalogue / task queue (TQ). The Optimizer splits jobs by their input LFNs, e.g. Job 1 (lfn1-lfn4) into Job 1.1 (lfn1), Job 1.2 (lfn2) and Job 1.3 (lfn3, lfn4), using matchmaking against close SEs and the available software. The AliEn CE on the site VO-box sends a job agent to the site through the LCG WMS and CE. On the worker node the agent checks the environment (if not OK, it dies with grace), asks for and retrieves a workload from the task queue (software installed via packman), executes it, sends back the job result, updates the TQ, and registers the output (lfn, guid, {SEs}) in the ALICE file catalogue.
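The splitting shown above (e.g. Job 2 into Job 2.1 with lfn1, lfn3 and Job 2.2 with lfn2, lfn4) can be sketched as grouping a job's input LFNs by the storage element that holds them. The catalogue contents below are hypothetical and for illustration only; this is not AliEn code.

```python
from collections import defaultdict

# Hypothetical file catalogue: which storage element holds each LFN.
catalogue = {"lfn1": "SE_A", "lfn2": "SE_B", "lfn3": "SE_A", "lfn4": "SE_B"}

def split_job(job_id, lfns):
    """Group a job's input files by SE, one sub-job per SE,
    mimicking how an optimizer might split Job 2 into 2.1, 2.2, ..."""
    by_se = defaultdict(list)
    for lfn in lfns:
        by_se[catalogue[lfn]].append(lfn)
    return {f"{job_id}.{i}": files
            for i, (se, files) in enumerate(sorted(by_se.items()), start=1)}

print(split_job("Job 2", ["lfn1", "lfn2", "lfn3", "lfn4"]))
# -> {'Job 2.1': ['lfn1', 'lfn3'], 'Job 2.2': ['lfn2', 'lfn4']}
```

Each sub-job then only needs to run at a site close to the SE that holds its files, which is exactly what the matchmaking step exploits.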
Xrootd architecture
A client opening file X asks the redirector (head node): "who has file X?" The redirector queries the data servers (A, B, C) and replies "go to C". Redirectors cache file locations, so a second open of X is sent to C directly. The client sees all servers as one xrootd data service; all storage is on the WAN. A global redirector (not in the picture) enables inter-site storage collaboration.
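The redirector's behaviour (query the data servers on first open, then answer later opens from a location cache) can be sketched in a few lines of Python. The class and server layout are illustrative, not part of xrootd itself.

```python
# Minimal sketch of an xrootd-style redirector with a location cache.
# The server names and file placement are invented for illustration.

class Redirector:
    def __init__(self, data_servers):
        # data_servers: {server_name: set of files it holds}
        self.data_servers = data_servers
        self.cache = {}              # file -> server, filled on first lookup

    def open(self, filename):
        if filename in self.cache:   # 2nd open: answered from the cache
            return self.cache[filename]
        for server, files in self.data_servers.items():  # "who has file X?"
            if filename in files:
                self.cache[filename] = server
                return server        # "go to C"
        raise FileNotFoundError(filename)

redirector = Redirector({"A": set(), "B": set(), "C": {"X"}})
print(redirector.open("X"))   # first open: queries servers -> C
print(redirector.open("X"))   # second open: served from the cache -> C
```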
Grid security (in a nutshell!)
It is important to be able to identify and authorise users, and possibly to enable/disable certain actions.
X509 certificates are used: the Grid passport, delivered by a certification authority (IGCA for India).
To use the Grid, short-lived "proxies" are created: same information as the certificate, but only valid for the duration of the action.
A "group" and a "role" can be added to a proxy using the VOMS extensions, allowing the same person to wear different hats (e.g. normal user or production manager).
Your certificate is your passport: you should sign whenever you use it, and never give it away. There is less danger if a proxy is stolen, since it is short-lived.
The VOBOX
The VOBOX is a WLCG service developed in 2006 to provide the experiments with a service to:
a) run their own services;
b) in addition, provide file-system access to the experiment software area.
The concept of the VOBOX is not the same for the 4 LHC experiments:
a) ALICE requires the STANDARD WLCG VOBOX.
Storage strategy
(Diagram) The WN accesses the SE head node, where xrootd runs as manager (redirector); behind it, xrootd workers front the different storage back-ends: plain disk, DPM (with SRM), CASTOR (MSS, with SRM) and dCache (MSS, with SRM, through an xrootd emulation that works but has severe limits with multiple clients). DPM, CASTOR and dCache are LCG-developed SEs; xrootd is entering as a strategic solution. (The diagram also labels an old implementation and the current version, 2.1.8.)
What is MonALISA?
A Caltech project started in 2002: http://monalisa.caltech.edu/
A Java-based set of distributed, self-describing services.
Offers the infrastructure to collect any type of information and can process it in near real time.
The services can cooperate in performing the monitoring tasks.
Can act as a platform for running distributed user agents.
MonALISA software components and the connections between them
(Diagram)
• Data consumers: clients and high-level (HL) services
• Proxies: a multiplexing layer that helps firewalled endpoints connect
• Registration and discovery: JINI lookup services (secure & public)
• MonALISA services: a network of data-gathering services with agents
A fully distributed system with no single point of failure.
PROOF
Parallel ROOT Facility
Interactive parallel analysis on a local cluster
Parallel processing of (local) data
Fast feedback
Output handling with direct visualization
PROOF is part of ROOT
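The PROOF idea (a master splits the data, workers process chunks in parallel, and the partial results are merged) can be sketched generically with Python's multiprocessing. This is not ROOT/PROOF code, just the pattern:

```python
from multiprocessing import Pool

def analyse(chunk):
    # Stand-in for per-event analysis (e.g. filling a histogram):
    # here we just sum the squares of the values in the chunk.
    return sum(x * x for x in chunk)

def run_parallel(data, n_workers=4):
    # Master splits the data round-robin into one chunk per worker...
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with Pool(n_workers) as pool:
        partial = pool.map(analyse, chunks)   # ...workers run in parallel...
    return sum(partial)                       # ...and the results are merged.

if __name__ == "__main__":
    print(run_parallel(list(range(1000))))    # -> 332833500
```

PROOF adds to this pattern the interactivity and direct visualization noted above: feedback objects are merged and displayed while the query is still running.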
PROOF Schema
(Diagram) The client on a local PC submits ana.C to the PROOF master of a remote PROOF cluster. The master distributes the work to PROOF slaves running root on node1-node4; each slave processes its local data and returns its result. The merged result comes back to the client as stdout/result.