J.J.Blaising April 02 AMS DataGrid-status 1
DataGrid Status
J.J. Blaising
IN2P3
Grid Status
Demo introduction
Demo
Grid Technology: Introduction & Overview
Ian Foster
Argonne National Laboratory
University of Chicago
• Harvey B. Newman, Caltech
• Data Analysis for Global HEP Collaborations
• LCG Launch Workshop, CERN
• l3www.cern.ch/~newman/LHCCMPerspective_hbn031102.ppt
• LHC Computing Model Perspective
• Query (task completion time) estimation
• Queueing and co-scheduling strategies
• Load balancing (e.g. Self Organizing Neural Network)
• Error Recovery: Fallback and Redirection Strategies
• Strategy for use of tapes
• Extraction, transport and caching of physicists’ object-collections; Grid/Database Integration
• Policy-driven strategies for resource sharing among sites and activities; policy/capability tradeoffs
• Network Performance and Problem Handling
– Monitoring and Response to Bottlenecks
– Configuration and Use of New-Technology Networks, e.g. Dynamic Wavelength Scheduling or Switching
• Fault-Tolerance, Performance of the Grid Services Architecture
• Consistent transaction management, …
From H. Newman
DataTAG project
Major 2.5 Gbps circuits between Europe & USA
[Figure: network map. Labelled nodes and networks: CERN, GEANT, SURFnet (NL), SuperJANET4 (UK), GARR-B (IT), New York, Abilene, ESNET, MREN, STAR-TAP, STAR-LIGHT.]
DataGrid Goal
Develop middleware for WAN distributed
computing and data management
• Build a distributed batch system that lets users submit
jobs to different sites, with automatic site selection
according to resource matching.
Next:
• Interactive use and parallel processing
• Other OS (Solaris)
Requirements come from HEP, Earth Observation and
Biomedical applications.
[Figure: DataGrid architecture]
• Client: User ITF node, which submits the job
• Grid Services: Workload manager, Information system, File Catalog server
• Resources provider (CPU): Computing element gatekeeper with Jobmanager-PBS/LSF/BQS, publishes CPU resources; Worker Nodes sit behind it
• Resources provider (Storage): Storage element gatekeeper, publishes storage resources
Middleware status v1.1.2
• Workload manager (UI+RB+JSS+LB), WP1
still bug fixing + improvements for year 2
• Data management, file catalog, replica manager, WP2
good collaboration with Globus
• Information system, WP3
deployment of uniform FTREE/MDS/R-GMA
• Fabric management, WP4
LCFG, light LCFG for preinstalled systems
• Mass storage management, WP5
CASTOR, HPSS, …
Successful EU review on 1 March
VO Services
• Computing and Storage element services deployed
at CERN, CC-IN2P3, CNAF, NIKHEF, RAL, more …
US sites soon to test Grid interoperability
For ALICE, ATLAS, CMS, LHCb, Earthobs and Biomed:
deployment of dedicated services
• LDAP server (certificates)
• File catalog (LFN/PFN mapping)
• GDMP server (automatic data replication)
• More to come, Metadata catalog, …
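The LFN/PFN mapping held by the file catalog can be pictured with a small sketch. This is a toy in-memory stand-in written for illustration, not the DataGrid replica catalog API: the class name, method names, and the example file names and host names are all hypothetical. It only shows the idea that one logical file name (LFN) can resolve to several physical file names (PFNs), one per replica.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy LFN -> PFN catalog (hypothetical; not the DataGrid API)."""

    def __init__(self):
        # One logical name maps to a list of physical replicas.
        self._replicas = defaultdict(list)

    def add(self, lfn, pfn):
        """Register a physical replica for a logical file name."""
        if pfn not in self._replicas[lfn]:
            self._replicas[lfn].append(pfn)

    def lookup(self, lfn):
        """Return all known replicas, or an empty list if none are registered."""
        return list(self._replicas.get(lfn, []))


catalog = ReplicaCatalog()
# File and host names below are made up for the example.
catalog.add("raw_100147.dat", "se.site-a.example/data/raw_100147.dat")
catalog.add("raw_100147.dat", "se.site-b.example/data/raw_100147.dat")
replicas = catalog.lookup("raw_100147.dat")
```

A replication service in the spirit of GDMP would, under this picture, copy the file to another site and then call `add` with the new PFN.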
Application activities (WP8)
• Middleware evaluation using
ALICE, ATLAS, CMS, LHCb, Gen-Hep toolkits
• User requirements collection with
ALICE, ATLAS, CMS, LHCb
• Common HEP use cases
• Common application use case
[Figure: layered stack shown in successive variants. Common to all: OS & Net services at the bottom, a Bag of Services (GLOBUS) from the GLOBUS team, MiddleWare components MW1 to MW5, and a specific application layer on top, with VO use cases & requirements feeding the middleware.]
Variant captions, in slide order:
• Application layer: ALICE, ATLAS, CMS, LHCb, other apps
• "If we manage to define": common core use case
• "Or even better": LHC and other apps, common use cases
• "It will be easier to arrive at"
What we want from a GRID
• This is the result of our experience on TB0 & TB1
[Figure: GRID architecture, layered from bottom to top: OS & Net services; Basic Services (GLOBUS team); High level GRID middleware; LHC VO common application layer alongside other apps; specific application layer (ALICE, ATLAS, CMS, LHCb) alongside other apps. Additional label: Common use cases.]
Demo introduction
• Sites involved: CERN, CNAF, LYON, NIKHEF, RAL
• User interface in X; dg-job-submit demo.jdl =>
job sent to the Workload Management System (WMS) at CERN
• The WMS selects a site according to the resource
attributes given in the jdl file and the resources
published via the Information System.
• The job is sent to one of the sites; a data file is written,
copied to the nearest MS (mass storage) and replicated on
all other sites.
• dg-job-get-output is used to retrieve the files
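The site-selection step can be sketched as a simple matchmaking function. This is a minimal illustration under stated assumptions, not the WMS algorithm: only the `OpSys` attribute comes from demo.jdl, while the `FreeCPUs` attribute, the ranking rule (most free CPUs wins) and every published value below are invented for the example.

```python
def matches(requirements, resource):
    """True when every required attribute equals the published value."""
    return all(resource.get(key) == value for key, value in requirements.items())


def select_site(requirements, published):
    """Pick one matching site, or None when no site satisfies the job."""
    candidates = [res for res in published if matches(requirements, res)]
    if not candidates:
        return None
    # Assumed ranking: prefer the site with the most free CPUs.
    return max(candidates, key=lambda res: res["FreeCPUs"])["site"]


# Hypothetical snapshot of what sites might publish via the Information System.
published = [
    {"site": "CERN", "OpSys": "RH 6.2", "FreeCPUs": 4},
    {"site": "LYON", "OpSys": "RH 6.2", "FreeCPUs": 12},
    {"site": "RAL", "OpSys": "Solaris", "FreeCPUs": 30},
]

chosen = select_site({"OpSys": "RH 6.2"}, published)
```

With these made-up numbers, RAL is excluded because its operating system does not match, and the ranking then decides between the two remaining sites.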
Generic HEP application flowchart
Job arguments:
• Data Type: raw/dst
• Run Number: xxxxxx
• Number of evts: yyyyyy
• Number of wds/evt: zzzzzz
• Rep Catalog flag: 0/1
• Mass Storage flag: 0/1
[Figure: flowchart. Decision "Raw/dst ?" splits the job into two branches.]
• raw branch: generate raw events on local disk; write logbook (raw_xxxxxx_dat.log); move to SE, MS ?; add lfn/pfn to Rep Catalog
• dst branch: get pfn from Rep Catalog; pfn local ? (y/n); if not, copy raw data from SE to local disk; read raw events, write dst events; write logbook (dst_xxxxxx_dat.log); move to SE, MS ?; add lfn/pfn to Rep Catalog
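One way to read the flowchart is as a function that turns the job arguments into an ordered list of steps. The sketch below performs no I/O and is only an interpretation: it assumes that the Mass Storage flag chooses between MS and SE as the destination and that the Rep Catalog flag controls whether the output is registered. The function and parameter names are hypothetical.

```python
def plan_steps(data_type, use_catalog, use_mass_storage, pfn_is_local=True):
    """Return the flowchart steps for one job, in order (nothing is executed)."""
    if data_type == "raw":
        steps = ["generate raw events on local disk"]
    elif data_type == "dst":
        steps = ["get pfn from Rep Catalog"]
        if not pfn_is_local:
            # The raw input lives on a remote storage element.
            steps.append("copy raw data from SE to local disk")
        steps.append("read raw events, write dst events")
    else:
        raise ValueError(f"unknown data type: {data_type!r}")

    steps.append("write logbook")
    # Assumption: the Mass Storage flag selects the destination.
    steps.append("move to MS" if use_mass_storage else "move to SE")
    # Assumption: the Rep Catalog flag controls registration of the output.
    if use_catalog:
        steps.append("add lfn/pfn to Rep Catalog")
    return steps


raw_plan = plan_steps("raw", use_catalog=True, use_mass_storage=False)
dst_plan = plan_steps("dst", use_catalog=False, use_mass_storage=True,
                      pfn_is_local=False)
```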
demo.jdl
Executable = demo.csh;
Arguments = raw 100147 50 1000 0 0;
StdInput = none;
StdOutput = demo.out;
StdError = demo.err;
InputSandbox = {demo.csh,main.exe};
OutputSandbox = {demo.out,demo.err,demo.log};
Requirements = other.OpSys == "RH 6.2";
dg-job-submit demo.jdl
dg-job-get-output job-id
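The Arguments string in demo.jdl appears to line up with the six job arguments listed on the flowchart slide. As a sketch, assuming the fields arrive in that same order (data type, run number, number of events, words per event, Rep Catalog flag, Mass Storage flag), a job script could decode them as follows. The `JobArgs` type and `parse_job_args` function are hypothetical names, not part of the demo.

```python
from typing import NamedTuple


class JobArgs(NamedTuple):
    data_type: str        # "raw" or "dst"
    run_number: int
    n_events: int
    words_per_event: int
    rep_catalog: bool     # Rep Catalog flag (0/1)
    mass_storage: bool    # Mass Storage flag (0/1)


def parse_job_args(arguments: str) -> JobArgs:
    """Split the whitespace-separated argument string into typed fields."""
    fields = arguments.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 arguments, got {len(fields)}")
    data_type, run, n_events, n_words, rep_flag, ms_flag = fields
    if data_type not in ("raw", "dst"):
        raise ValueError(f"unknown data type: {data_type!r}")
    return JobArgs(data_type, int(run), int(n_events), int(n_words),
                   rep_flag == "1", ms_flag == "1")


args = parse_job_args("raw 100147 50 1000 0 0")
```

Under that assumed ordering, the demo job generates 50 raw events of 1000 words each for run 100147, with both flags switched off.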
[Figure: job flow for the demo. The User ITF node exchanges the input sandbox and output sandbox with the Workload manager and Information system. Five sites are shown, each with COMPUTING and STORAGE. A File catalog server and data flows are also shown.]