gLite, the next generation middleware for Grid computing
Oxana Smirnova (Lund/CERN)
Nordic Grid Neighborhood Meeting
Linköping, October 20, 2004
Uses material from E. Laure and F. Hemmer
2
gLite

• What is gLite (quoted from http://www.glite.org):
   “the next generation middleware for grid computing”
   “collaborative efforts of more than 80 people in 10 different academic and industrial research centers”
   “part of the EGEE project (http://www.eu-egee.org)”
   “bleeding-edge, best-of-breed framework for building grid applications tapping into the power of distributed computing and storage resources across the Internet”
• EGEE Activity Areas
• Nordic contributors: HIP, PDC, UiB
3
Architecture guiding principles

• Lightweight services
   Easily and quickly deployable
   Use existing services where possible as basis for re-engineering
   “Lightweight” does not mean fewer services or non-intrusiveness – it means modularity
• Interoperability
   Allow for multiple implementations
• Performance/Scalability & Resilience/Fault Tolerance
   Large-scale deployment and continuous usage
• Portability
   Being built on Scientific Linux and Windows
• Co-existence with deployed infrastructure
   Reduce requirements on participating sites
   Flexible service deployment
   Multiple services running on the same physical machine (if possible)
   Co-existence with LCG-2 and OSG (US) is essential for the EGEE Grid service
• Service-oriented approach
   60+ external dependencies
   …
4
Service-oriented approach

• Adopt the Open Grid Services Architecture, with components that are:
   Loosely coupled (via messages)
   Accessible across the network; modular and self-contained; clean modes of failure
   Able to change implementation without changing interfaces
   Developed in anticipation of new use cases
• Follow WSRF standardization
   No mature WSRF implementations exist to date, so start with plain WS
   WSRF compliance is not an immediate goal, but the WSRF evolution is followed
   WS-I compliance is important
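The “plain WS” starting point can be made concrete with a minimal sketch. The namespace and the jobSubmit operation below are invented for illustration (not the real gLite WSDL); the point is only that a plain Web-service request is just a SOAP XML envelope:

```python
# Illustrative sketch: a plain-WS (SOAP 1.1) request to a hypothetical
# gLite job-submission endpoint. Service namespace and operation name
# are assumptions for this example, not actual gLite interfaces.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SVC_NS = "urn:glite:wms:example"  # hypothetical namespace

def build_submit_request(jdl: str) -> bytes:
    """Build a SOAP envelope for a hypothetical jobSubmit operation."""
    ET.register_namespace("soap", SOAP_NS)
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SVC_NS}}}jobSubmit")
    ET.SubElement(op, f"{{{SVC_NS}}}jdl").text = jdl
    return ET.tostring(env, encoding="utf-8")

msg = build_submit_request('Executable = "/bin/hostname";')
```

Because the payload is self-describing XML over a standard protocol, clients and services stay loosely coupled: either side can change its implementation without breaking the interface.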
5
gLite vs LCG-2

[Diagram: evolution from the Globus 2-based LCG-1 and LCG-2 to the Web-services-based gLite-1 and gLite-2]

• Intended to replace LCG-2
• Starts with existing components
• Aims to address LCG-2 shortcomings and advanced needs from applications (in particular feedback from DCs)
• Prototyping: short development cycles for fast user feedback
• Initial web-services-based prototypes being tested with representatives from the application groups
6
Approach

• Exploit experience and components from existing projects
   AliEn, VDT, EDG, LCG, and others
• Design team works out architecture and design
   Architecture: https://edms.cern.ch/document/476451
   Design: https://edms.cern.ch/document/487871/
   Feedback and guidance from EGEE PTF, EGEE NA4, LCG GAG, LCG Operations, LCG ARDA
• Components are initially deployed on a prototype infrastructure
   Small scale (CERN & Univ. Wisconsin)
   Get user feedback on service semantics and interfaces
• After internal integration and testing, components are to be deployed on the pre-production service
7
Subsystems/components
[Diagram: LCG-2 components mapped to gLite services, with AliEn shown as an input]

• User Interface
• Computing Element
• Worker Node
• Workload Management System
• Package Management
• Job Provenance
• Logging and Bookkeeping
• Data Management
• Information & Monitoring
• Job Monitoring
• Accounting
• Site Proxy
• Security
• Fabric management
9
Computing Element
• Works in push or pull mode
• Site policy enforcement
• Exploit the new Globus GK and Condor-C (close interaction with the Globus and Condor teams)
CEA … Computing Element Acceptance
JC … Job Controller
MON … Monitoring
LRMS … Local Resource Management System
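The pull mode mentioned above can be sketched in a few lines: instead of jobs being pushed to it, the Computing Element polls a central task queue for work its local resources can satisfy. The queue and job format here are simplified stand-ins, not the actual AliEn/gLite interfaces:

```python
# Illustrative sketch of a pull-mode Computing Element.
from collections import deque

class TaskQueue:
    """Central queue of waiting jobs (stand-in for the AliEn TaskQueue)."""
    def __init__(self):
        self.jobs = deque()

    def add(self, job):
        self.jobs.append(job)

    def match(self, resources):
        """Hand out the first job whose requirements this CE can satisfy."""
        for job in list(self.jobs):
            if job["cpus"] <= resources["cpus"]:
                self.jobs.remove(job)
                return job
        return None

def pull_loop(ce_resources, queue, max_polls=3):
    """Pull-mode CE: poll the queue and take whatever matches."""
    done = []
    for _ in range(max_polls):
        job = queue.match(ce_resources)
        if job is None:
            break  # nothing suitable; a real CE would sleep and retry
        done.append(job["id"])  # stand-in for handing off to the LRMS
    return done

tq = TaskQueue()
tq.add({"id": "job1", "cpus": 1})
tq.add({"id": "job2", "cpus": 64})  # too big for this CE
tq.add({"id": "job3", "cpus": 2})
print(pull_loop({"cpus": 4}, tq))  # → ['job1', 'job3']
```

Matching at the queue (rather than pushing blindly) is also where site policy enforcement naturally fits: the CE only asks for, and only accepts, work it is allowed to run.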
10
Data Management
• Scheduled data transfers (like jobs)
• Reliable file transfer
• Site self-consistency
• SRM based storage
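“Scheduled, reliable” transfers means transfer requests are queued and retried like jobs rather than fired once and forgotten. A minimal sketch of the retry idea, with a hypothetical stand-in for the underlying SRM/GridFTP call:

```python
# Illustrative sketch of reliable file transfer with retries.
def reliable_transfer(request, do_transfer, max_attempts=3):
    """Retry a single transfer request until it succeeds or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            do_transfer(request["source"], request["dest"])
            return {"status": "done", "attempts": attempt}
        except OSError:
            continue  # transient failure; a real service would back off first
    return {"status": "failed", "attempts": max_attempts}

# Simulated backend that fails twice before succeeding.
calls = {"n": 0}
def flaky(src, dst):
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("network glitch")

result = reliable_transfer(
    {"source": "srm://siteA/f1", "dest": "srm://siteB/f1"}, flaky
)
print(result)  # → {'status': 'done', 'attempts': 3}
```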
12
Catalogs

• File Catalog
   Filesystem-like view on logical file names (LFNs)
   Keeps track of sites where data is stored
   Conflict resolution
• Replica Catalog
   Keeps replica information at a site
• (Metadata Catalog)
   Attributes of files on the logical level
   Boundary between generic middleware and application layer

[Diagram: the File Catalog maps LFN → GUID → site IDs; each site’s Replica Catalog (Site A, Site B) maps GUID → SURLs; the Metadata Catalog attaches metadata to LFNs]
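The catalog split can be illustrated with a toy resolution function: the File Catalog maps an LFN to a GUID and the sites holding it, and each site’s Replica Catalog maps the GUID to physical SURLs. Names and data below are invented for illustration, not the real gLite schemas:

```python
# Illustrative sketch of LFN -> GUID -> site -> SURL resolution.
file_catalog = {
    "/grid/atlas/event.root": {"guid": "guid-42", "sites": ["siteA", "siteB"]},
}
replica_catalogs = {
    "siteA": {"guid-42": ["srm://siteA/store/event.root"]},
    "siteB": {"guid-42": ["srm://siteB/data/event.root"]},
}

def resolve(lfn):
    """Resolve a logical file name to all physical replicas (SURLs)."""
    entry = file_catalog[lfn]
    surls = []
    for site in entry["sites"]:
        surls.extend(replica_catalogs[site].get(entry["guid"], []))
    return surls

print(resolve("/grid/atlas/event.root"))
# → ['srm://siteA/store/event.root', 'srm://siteB/data/event.root']
```

Keeping per-site replica information in per-site catalogs is what makes “site self-consistency” possible: a site can verify its own catalog against its own storage without global coordination.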
13
Information and Monitoring

• R-GMA for
   Information system and system monitoring
   Application monitoring
• No major changes in architecture
   But re-engineer and harden the system
• Co-existence and interoperability with other systems is a goal
   E.g. MonALISA

[Diagram: D0 application monitoring example — each job wrapper publishes through a Memory Primary Producer (MPP) into a Database Secondary Producer (DbSP)]
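R-GMA’s relational model can be illustrated with a toy in-memory version: producers publish rows into a virtual table and consumers pose query-by-example lookups. This only illustrates the model, not the real R-GMA API or wire protocol:

```python
# Illustrative sketch of the R-GMA relational producer/consumer model.
class VirtualTable:
    def __init__(self, columns):
        self.columns = columns
        self.rows = []

    def publish(self, **values):
        """Role of a Primary Producer: insert one monitoring tuple."""
        self.rows.append({c: values.get(c) for c in self.columns})

    def select(self, **where):
        """Role of a Consumer: return rows matching all given column values."""
        return [r for r in self.rows
                if all(r[k] == v for k, v in where.items())]

jobs = VirtualTable(["job_id", "site", "status"])
jobs.publish(job_id="j1", site="siteA", status="running")
jobs.publish(job_id="j2", site="siteB", status="done")
print(jobs.select(status="running"))
# → [{'job_id': 'j1', 'site': 'siteA', 'status': 'running'}]
```

In the D0 example above, the job wrappers play the producer role and the database-backed secondary producer aggregates their streams so consumers can query history as well as live state.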
14
Security

[Diagram: pseudonymity flow on “The Grid”:
1. Obtain Grid (X.509) credentials for Joe
2. The (optional) Pseudonymity Service maps “Joe → Zyx”; credentials held in Credential Storage (myProxy)
3. The Attribute Authority (VOMS) is instructed to “issue Joe’s privileges to Zyx”
4. The Grid sees “User=Zyx, Issuer=Pseudo CA”; enforcement via GSI and LCAS/LCMAPS (tbd)]
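The pseudonymity idea in the diagram can be sketched as a mapping service: the real identity is swapped for a pseudonym, and the user’s privileges are attached to the pseudonymous credential instead. All names below are made up for illustration; the real flow uses X.509 certificates, VOMS and myProxy:

```python
# Illustrative sketch of a pseudonymity service (not the real gLite code).
import secrets

class PseudonymityService:
    def __init__(self):
        self._forward = {}  # real DN -> pseudonym

    def pseudonym_for(self, real_dn):
        """Issue (or reuse) a stable pseudonym for a real distinguished name."""
        if real_dn not in self._forward:
            self._forward[real_dn] = "Zyx-" + secrets.token_hex(4)
        return self._forward[real_dn]

def issue_credential(real_dn, privileges, service):
    """The credential carries the pseudonym as subject, never the real DN."""
    pseudo = service.pseudonym_for(real_dn)
    return {"subject": pseudo, "issuer": "Pseudo CA", "privileges": privileges}

svc = PseudonymityService()
cred = issue_credential("/O=Grid/CN=Joe", ["atlas:production"], svc)
assert "Joe" not in cred["subject"]  # the Grid only ever sees the pseudonym
```

The key property is that only the pseudonymity service can link “Zyx” back to Joe; sites see a valid credential with real privileges but no real identity.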
15
GAS & Package Manager

• Grid Access Service (GAS)
   Discovers and manages services on behalf of the user
   File and metadata catalogs already integrated
• Package Manager
   Provides application software at the execution site
   Based upon existing solutions
   Details being worked out together with experiments and operations
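One plausible reading of “provides application software at the execution site” is dependency resolution plus on-demand installation on the worker node. A toy sketch under that assumption, with an invented repository (not the AliEn PM’s actual format):

```python
# Illustrative sketch of a grid package manager's install step.
repository = {
    "atlas-sw": {"deps": ["root", "geant4"]},
    "root": {"deps": []},
    "geant4": {"deps": ["clhep"]},
    "clhep": {"deps": []},
}

def install(pkg, installed, repo=repository):
    """Depth-first install: dependencies first, each package only once."""
    if pkg in installed:
        return
    for dep in repo[pkg]["deps"]:
        install(dep, installed, repo)
    installed.append(pkg)  # stand-in for unpacking into the site's software area

site = []
install("atlas-sw", site)
print(site)  # → ['root', 'clhep', 'geant4', 'atlas-sw']
```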
16
Current Prototype

• WMS: AliEn TaskQueue, EDG WMS, EDG L&B (CNAF)
• CE (CERN, Wisconsin): Globus Gatekeeper, Condor-C, PBS/LSF, “pull component” (AliEn CE)
• WN: 23 at CERN + 1 at Wisconsin
• SE (CERN, Wisconsin): external SRM implementations (dCache, Castor), gLite-I/O
• Catalogs (CERN): AliEn FileCatalog, RLS (EDG), gLite Replica Catalog
• Data Scheduling (CERN): File Transfer Service (Stork)
• Data Transfer (CERN, Wisconsin): GridFTP
• Metadata Catalog (CERN): simple interface defined
• Information & Monitoring (CERN, Wisconsin): R-GMA
• Security: VOMS (CERN), myProxy, grid-mapfile and GSI security
• User Interface (CERN & Wisconsin): AliEn shell, CLIs and APIs, GAS
• Package Manager: prototype based on the AliEn PM
17
Summary, plans

• Most Grid systems (including LCG-2) are batch-job production oriented; gLite addresses distributed analysis
   Most likely they will co-exist, at least for a while
• A prototype exists, and new services are being added: dynamic accounts, gLite CEmon, Globus RLS, File Placement Service, Data Scheduler, fine-grained authorization, accounting…
• A Pre-Production Testbed is being set up
   More sites, tested/stable services
• First release due end of March 2005
   Functionality freeze at Christmas
   Intense integration and testing period from January to March 2005
• 2nd release candidate: November 2005
   May: revised architecture doc; June: revised design doc