Grid and VOs
Grid from 10 000 feet
The GRID: networked data processing centres and “middleware” software as the “glue” of resources.
Researchers perform their activities regardless of geographical location, interact with colleagues, share and access data
Scientific instruments, libraries and experiments provide huge amounts of data
based on material from [email protected] and the 3D RTM map by Gidon Moont, IC and GridPP
What is Grid?
The word ‘grid’ has been used in many ways
• cluster computing
• cycle scavenging
• cross-domain resource, data and information sharing
A definition of what we mean by grid: a system that
• coordinates resources not subject to centralised control
• uses standard, open and generic protocols & interfaces
• provides non-trivial qualities of collective service
Definition source: Ian Foster in Grid Today, July 22, 2002; Vol. 1 No. 6, see http://www-fp.mcs.anl.gov/~foster/Articles/WhatIstheGrid.pdf
Grid Computing: “More Than One”
• More than one machine
• More than one user
• More than one research community
• More than one administrative domain
• More than one geographical location
General case: more than one of each!!!
Consequences of Plurality
• More than one user / research community: partitioning of resources, authentication, authorization, accounting
• More than one machine: software engineering, distributions
• More than one administrative domain / research community: authentication / authorization, non-invasive installations, genericity
• More than one admin domain, geographical location: worldwide operations coordination
Grid characteristics
Things in e-Science grids that may contrast with other distributed efforts
• collaboration of individuals from different organisations
  o most of the scientific grid communities today consist of people ‘scattered’ over many home organisations … in many cases internationally
  o ‘Virtual organisations’ – but that’s what we are used to as scientific collaborations!
• delegation – services acting on your behalf – is an integral part of the architecture
  o for service and data brokering
  o integrating compute, data access, and databases in the same task
  o unattended work flows
Virtual Organisations
A set of individuals or organisations, not under single hierarchical control, (temporarily) joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.
• Users are usually members of more than one VO
• Any “large” VO will have an internal structure, with groups, subgroups, and various roles
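The structure described above (one user, several VOs, nested groups, roles) can be made concrete with a small sketch. This is an illustration only, not the data model of any particular VO management tool: the `Membership` class, the slash-separated group paths, and the VO, group and role names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Membership:
    """One user's place in one VO (all names are hypothetical examples)."""
    vo: str      # name of the virtual organisation
    group: str   # slash-separated group path; "/" is the VO root
    role: str    # role held within that group


def vos_of(memberships):
    """Return the sorted, distinct VOs that a user belongs to."""
    return sorted({m.vo for m in memberships})


def is_in_group(membership, vo, group):
    """True if the membership is in `group` of `vo` or in one of its subgroups."""
    if membership.vo != vo:
        return False
    prefix = group.rstrip("/")
    return membership.group == prefix or membership.group.startswith(prefix + "/")


# A single user who is a member of two VOs, with a different role in each.
alice = [
    Membership(vo="medical-imaging", group="/imaging/fmri", role="analyst"),
    Membership(vo="bioinformatics", group="/", role="member"),
]
```

Here `is_in_group(alice[0], "medical-imaging", "/imaging")` holds because `/imaging/fmri` is a subgroup of `/imaging`, while the same query against the `bioinformatics` membership does not.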
Virtual organisation structure
Lots of overlapping groups and communities
graphic: OGSA Architecture 1.0, OGF GFD-I.030
Virtual vs. Organic structure
• Virtual communities (“virtual organisations”) are many
• An individual will typically be part of many communities
  o has different roles in different VOs (distinct from organisational role)
  o all at the same time, at the same set of resources
  o but will need single sign-on across all these communities
graphic: OGSA Architecture 1.0, OGF GFD-I.030
Expressing collaboration
• provide the means to express collaboration
  o membership
  o groups and roles
  o organisation management tools
• support access control as function of VOs
  o access control as a function of VO, group, and role
  o both at the service and at the content level
• maintain autonomy
  o sharing defined by access controls at the source
  o no need to hand off the actual data to a third party
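As a rough illustration of access control as a function of VO, group and role, evaluated at the source, the sketch below keeps a rule list with the resource owner and denies anything no rule grants. The `Rule` layout, the `"*"` wildcard, the `service:`/`content:` target names and all VO, group and role names are assumptions made for this example. They are not taken from any grid middleware.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """A grant kept by the resource owner (hypothetical layout)."""
    vo: str
    group: str    # "*" matches any group
    role: str     # "*" matches any role
    target: str   # a service name or a content path
    action: str


def _matches(pattern, value):
    """A pattern matches if it is the wildcard or equals the value."""
    return pattern == "*" or pattern == value


def is_allowed(rules, vo, group, role, target, action):
    """Default deny: allow only if at least one rule matches the request."""
    return any(
        r.vo == vo
        and _matches(r.group, group)
        and _matches(r.role, role)
        and r.target == target
        and r.action == action
        for r in rules
    )


# Rules stay at the site that owns the resources; nothing is handed to a third party.
site_rules = [
    # Service level: any member of the VO may submit jobs.
    Rule("medical-imaging", "*", "*", "service:compute", "submit"),
    # Content level: the whole group may read, only curators may write.
    Rule("medical-imaging", "/imaging", "*", "content:/scans/2006", "read"),
    Rule("medical-imaging", "/imaging", "curator", "content:/scans/2006", "write"),
]
```

With these rules an `analyst` in `/imaging` can read `content:/scans/2006` but not write to it, and a member of a different VO is refused regardless of role.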
VL-e PoC
PoC Position in the VL-e structure
[Figure: layered diagram of the Virtual Laboratory. Labels: application-specific services (App 1, App 2, App 3; “Application Potential”); generic services & Virtual Lab. services; Grid & network services, with Grid middleware, additional Grid services (OGSA services) and the network service (lambda networking). Two environments are marked: the VL-e Experimental Environment (Virtual Lab. rapid prototyping, interactive simulation) and the VL-e Proof of Concept Environment.]
The VL-e PoC: Proof-of-Concept
What is the PoC Environment?
• A shared, common environment,
• where different tools and services are both used and provided by the VL-e community
• Basis for subsequent application development
Elements in the PoC
The PoC refers to three distinct elements
• PoC Software Distribution
  o set of software that is the basis for the applications
  o both grid middleware and virtual lab generic software
• PoC Environment
  o the ensemble of systems running the Distribution
  o including user desktops or local clusters and storage
• PoC Central Facilities
  o those systems with the PoC Distribution centrally managed for the entire collaboration
  o large-scale computing, storage and hosting resources
PoC Distribution
The PoC distribution contains components to
• enable service-oriented development
• enable application development
• provide access to data, computing, and storage, distributed geographically
driven by specific VL-e application scenarios
Work flow to be the integrative layer of VL-e
  o functionality should be invocable as a service
  o work flow (graphical) systems help in composition but are not the only way to interact with services
The PoC software distribution
In the PoC software suite the following elements can be distinguished:
• Grid foundation middleware: the basic software, based on interfaces and concepts that are internationally adopted. This includes elements such as the security model, resource allocation interface, … It is based on the EGEE middleware suite.
• Generic Virtual Laboratory software: the software developed within the project for the PoC.
• Services imported from outside: not all services are necessarily developed within VL-e, so some components have been imported.
• Associated installation and deployment tools: the PoC suite is installed on the central facilities and (where applicable) is also available for distributed installation.
The PoC software distribution
• Software environment geared towards application software developers
  o enables cross-leveraging VL-e developments between applications
  o predictable lifecycle management
• Primary metric is the effectiveness in addressing real cross-application needs
  o PoC is liberal in including software as long as it is useful for multiple domains, does not compromise integrity, and can be supported and safely deployed
Defining content of the PoC Distribution
[Figure: the three VL-e environments and the path software takes between them.
Environments: VL-e Proof of Concept Environment, VL-e Rapid Prototyping Environment, VL-e Certification Environment.
Resources shown: Matrix clusters, central storage (SRB, dCache/SRM), distributed clusters, SURFnet, DAS-2, local resources, NL-Grid fabric, research cluster.
Typical usage shown: application development, initial compute platform; Virtual Lab. rapid prototyping; test & certification of Grid MW & VL-software, compatibility.
‘Keywords’ shown: stable, reliable, supported releases of the Grid MW & VL-software; flexible, ‘unstable’; flexible test environment.
Release path labels: external software, VLeIT recommendation point, tagged release candidates, common repository, integration tests, stable tested releases, download repository, PoC installer.]
Working with the application developers
Each generic component has an ‘expert’ on VLeIT
• to work on its optimal use or deployment and
• coordinate enhancement requests
Latest developments from within the VL-e project
• availability via a fast-lane ‘contrib’ trajectory
• same installation mechanism
• but supported directly by the developers
addressing the chicken-and-egg deadlock
The VL-e PoC Distribution
What is the VL-e PoC Distribution?
The PoC distribution is
• meant to be installed on a RedHat Enterprise Linux 3 compatible system
• a stable base environment, with managed releases
The PoC distribution contains components to
• enable service-oriented development
• enable application development
• provide access to computing, storage, and information systems
The VL-e PoC Distribution
VL-e PoC Release 1.0 Contents:
gLite 3.0
Sun Java2SDK 1.4.2_12
Plus
JavaGAT-1.5 MatlabMPI-1.2 Mesa3D-6.4.1
R-2.2.0 Rmpi-0.5 SRB-client-3.4.0
SRB-devel-3.4.0 fsl-bin-3.2 fsl-devel-3.2
gat-adaptors-1.8.2 gat-cpp-wrapper-1.8.2 gat-engine-1.8.2
gat-python-wrapper-1.8.2 globus-toolkit-4.0.1 graphviz-2.8
ibis-1.2.1 itk-2.4.1 kepler-1.0.0alpha7
lam-devel-7.1.2 lam-docs-7.1.2 lam-extras-7.1.2
lam-runtime-7.1.2 libRmath-2.2.0 libRmath-devel-2.2.0
medline-1.0 mpitb-2.1.72 mricro-1.3.9-4
nimrod-3.0.1 octave-2.1.72 ogsadai-wsrf-2.1
paraview-2.4.2 pl-5.6.4-200 sesame-client-1.2.3
taverna-workbench-1.3.1 triana-3.2 vtk-4.4
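The entries in the release contents above follow a name-version pattern, which makes them easy to check against an installed system. The sketch below splits such a label at the first hyphen that is directly followed by a digit. That splitting rule is an assumption that happens to fit every label on this slide; it is not a documented packaging convention of the PoC distribution, and `split_package` is a hypothetical helper.

```python
import re

# The version is taken to start at the first hyphen directly followed by a digit.
_NAME_VERSION = re.compile(r"^(?P<name>.+?)-(?P<version>\d.*)$")


def split_package(label):
    """Split a label such as 'gat-engine-1.8.2' into (name, version)."""
    match = _NAME_VERSION.match(label)
    if match is None:
        raise ValueError(f"no version found in {label!r}")
    return match.group("name"), match.group("version")


# A few labels copied from the release contents.
for label in ["gat-cpp-wrapper-1.8.2", "kepler-1.0.0alpha7", "mricro-1.3.9-4"]:
    name, version = split_package(label)
    print(f"{name:20s} {version}")
```

The non-greedy name group keeps hyphenated names such as `ogsadai-wsrf` intact while still treating a trailing build suffix such as `-4` or `-200` as part of the version.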
The VL-e PoC Distribution
Distribution formats:
• Network-based installation
  o http access
  o http proxy access
• DVD-based installation: the PoC DVD
• Pre-installed VMware image (present on PoC DVD)
  o CentOS 3 with GNOME GUI
  o gLite UI
  o VL-e Release 1.0 UI packages
  o Works with free VMware Player on both Linux and Windows
PoC Environment
• All systems can be used to perform the application scenarios, using the PoC distribution
• Installed both at specific central facilities and on desktops, remote clusters, data servers
PoC Central Facilities
• For applications in the Netherlands
  o both applications within VL-e and others
  o shared common infrastructure
  o accessible via grid middleware
  o has of course PoC distribution installed
• Location and capacity
  o SARA: tape (~1.2 TB), disk storage (~100 TB), clusters (~1400 cores Debian, 60 RHEL3), database services, user interface gateway, catch-all
  o NIKHEF: disk storage (~25 TB), clusters (550 cores RHEL3)
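To put the capacity figures above side by side, the short sketch below adds them up. The numbers are copied from the slide and are approximate there; reading "60 RHEL3" as 60 cores is an assumption, and the dictionary layout and function names are made up for this example.

```python
# Approximate figures transcribed from the slide ("~" in the source).
FACILITIES = {
    "SARA": {"disk_tb": 100, "cores": {"Debian": 1400, "RHEL3": 60}},
    "NIKHEF": {"disk_tb": 25, "cores": {"RHEL3": 550}},
}


def total_disk_tb(facilities):
    """Sum the disk storage over all sites, in TB."""
    return sum(site["disk_tb"] for site in facilities.values())


def total_cores(facilities, os_name=None):
    """Sum cluster cores over all sites, optionally for one operating system only."""
    total = 0
    for site in facilities.values():
        for name, count in site["cores"].items():
            if os_name is None or name == os_name:
                total += count
    return total


print(total_disk_tb(FACILITIES))          # 125
print(total_cores(FACILITIES))            # 2010
print(total_cores(FACILITIES, "RHEL3"))   # 610
```

The RHEL3 subtotal is the relevant one for the PoC distribution, which is meant to be installed on RedHat Enterprise Linux 3 compatible systems.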
PoC Central Facility Usage Today
SARA LISA Occupancy
PHICOS production jobs on the PoC (NDPF) at NIKHEF
PoC (NDPF) shared between various applications
grey: VLEIBU, VLEMED; green: ATLAS; blue: LHCb