ATF Progress Report 8/3/2001 Steve Fisher / RAL
8 March 2001 ATF report - Steve Fisher/RAL

Who are the ATF
• Francesco Giacomini – WP1
• Wolfgang Hoschek – WP2
• Steve Fisher – WP3 – acting chair
• German Cancio – WP4
• Tim Folkes – WP5
• Brian Tierney – Consultant
• Ingo Augustin – WP8, 9 and 10
• Dave Kelsey – Security
• Ian Foster – Consultant
• Carl Kesselman – Consultant
• Fabrizio Gagliardi

Meetings
• A few half days last year
  – Not much achieved
• A workshop of a week with Ian Foster and Carl Kesselman
  – Identified some issues
• 2 day session at beginning of February
  – Beginning to work well together
• 2 day session at beginning of March
• Half day yesterday
• Plan to continue 1.5 days / 2 weeks

Why do we exist?
• To define a viable architecture in terms of a set of components
• To ensure that these are a useful set of components able to interwork to meet the (evolving) requirements
• To alert the PTB to components that appear not to be fully covered by the various WPs.
• For each deliverable, collect input from the WPs and produce a document showing the essential functionality. Inform the PTB if it appears that a component will not have the functionality required by another component. ????
• Make proposals to the PTB on certain technical matters.

First architecture & M9 documents
• We have the first version of a document which defines the architecture of the EU DataGrid.
  – This is a large undertaking
  – Unlike other software projects we have worked on, many parts of the system are new (to us).
  – We are not just re-painting and re-assembling old components but planning to build something new.
• Consequently for now we can only describe what we see as a feasible architecture for the DataGrid.
  – The new components will have to be prototyped; some will just not work.
• The document will evolve during the lifetime of the project.
• Similar comments relate to the M9 document

Standards
• We want to be able to use existing standards where appropriate
• We recognise the importance of being able to interwork with similar projects, especially HEP related ones
• We consider the work of the GGF to be very important as a genuine grassroots attempt to define Grid standards. We plan to work with the GGF both by contributing to the standardisation work and by moving towards these standards as they evolve.
• Decent standards only arrive when you can choose from a range of solutions

Plan
• Produce model of system (UML)
• Define services, then check if there is obvious mapping to an existing WP
  – Most architectural components are services
• Write an architecture/design document based on Model
• Consider month 9 functionality

As we proceeded various issues appeared and we made tentative decisions…

GridServices
• The system is made up of a set of GridServices.
• Each service is accessed via a port as is the case with http or ftp.
• A service is defined by the protocol it speaks. However, the normal programmer does not see the protocol itself but accesses it through a client API. In defining the system we choose to think first of the functionality (API) and then of the protocol needed to communicate via that API to the service. We have defined these APIs as Java calls.
• We have also gone inside each service a little to outline how they work and to demonstrate that they can interwork to achieve the desired functionality.
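
The split between client API and wire protocol can be illustrated with a minimal sketch. It is written in Python for brevity, although the slides state that the real APIs are defined as Java calls. The class `EchoServiceClient`, the `ping` method, the XML element names and the in-process `fake_service` are all hypothetical; they only show a caller using a method while the protocol stays hidden behind it.

```python
from typing import Callable
import xml.etree.ElementTree as ET

# A transport takes request bytes and returns reply bytes.
# In a real deployment this would be a network call to the service's port.
Transport = Callable[[bytes], bytes]


class EchoServiceClient:
    """Hypothetical client API: callers see methods, not the wire protocol."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def ping(self, text: str) -> str:
        # Encode the call as a (made-up) XML message.
        request = ET.Element("ping")
        request.text = text
        raw_reply = self._transport(ET.tostring(request))
        # Decode the reply and hand back a plain value.
        return ET.fromstring(raw_reply).text or ""


def fake_service(raw: bytes) -> bytes:
    """In-process stand-in for the remote service (assumption for the demo)."""
    request = ET.fromstring(raw)
    reply = ET.Element("pong")
    reply.text = (request.text or "").upper()
    return ET.tostring(reply)


client = EchoServiceClient(fake_service)
result = client.ping("hello")
```

Swapping `fake_service` for a function that sends the bytes over a network connection would leave the caller's code unchanged, which is the point of thinking about the API first.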

No built in restrictions
• Restrictions on access to the various GridServices should not be built into the architecture or the code.
• Instead, each GridService has a policy which may control access (from a job or user).
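
One way to read this is that the access check lives in replaceable policy data attached to each service rather than in the service code. The sketch below assumes a very simple policy (an optional allow-list of user names); `Policy`, `GridService.call` and the user names are hypothetical and not taken from the architecture document.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Policy:
    """Hypothetical per-service policy; None means no restriction."""
    allowed_users: Optional[Set[str]] = None

    def permits(self, user: str) -> bool:
        return self.allowed_users is None or user in self.allowed_users


@dataclass
class GridService:
    name: str
    # The default policy is open: nothing is restricted unless a site says so.
    policy: Policy = field(default_factory=Policy)

    def call(self, user: str) -> str:
        if not self.policy.permits(user):
            raise PermissionError(f"{user} may not use {self.name}")
        return f"{self.name} served {user}"


open_service = GridService("FileCopier")
restricted = GridService("StorageElement", Policy({"alice"}))
```

Here `open_service.call("bob")` succeeds, while `restricted.call("bob")` raises `PermissionError`; changing who may use a service means changing its policy, not its code.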

Masters and Replicas
• Files (normally) identified by a logical name
  – Expressed as a URL
• The ReplicaManager manages the ReplicaCatalog which knows about all replicas of a (physical) Master file
• GridScheduler consults the ReplicaManager
  – Decides to use an existing replica (or the master)
  – Or to make a new replica
• The user does not manage the replicas, the ReplicaManager does

Security
• We have so far not thought much about security
• We now have a link to the Security group within WP6, via Dave Kelsey
• Plan to improve in this area

XML and https
• As a general way to develop new services we favour using XML over http(s)
• http(s) looks after the (secure) sending of some message and getting back the answer
• The XML looks after encoding the data in a standard way.
• For example SQL can be embedded within XML, for which there is a “standard” already in existence.
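
As a rough illustration of SQL embedded in XML, the sketch below wraps a statement in a request document and unwraps it on the service side. The element names (`sqlRequest`, `statement`, `sqlReply`, `row`, `value`) are invented for this example and are not the existing "standard" the slide refers to; the http(s) transport is left out, and an in-memory SQLite database stands in for the RDBMS.

```python
import sqlite3
import xml.etree.ElementTree as ET


def encode_sql(statement: str) -> bytes:
    """Wrap an SQL statement in a (hypothetical) XML request."""
    root = ET.Element("sqlRequest")
    ET.SubElement(root, "statement").text = statement
    return ET.tostring(root)


def handle_request(db: sqlite3.Connection, raw: bytes) -> bytes:
    """Service side: unwrap the statement, run it, wrap the rows as XML."""
    statement = ET.fromstring(raw).findtext("statement", default="")
    reply = ET.Element("sqlReply")
    for row in db.execute(statement).fetchall():
        row_el = ET.SubElement(reply, "row")
        for value in row:
            ET.SubElement(row_el, "value").text = str(value)
    return ET.tostring(reply)


# SQLite is only a stand-in for MySQL, Oracle, DB2, ...
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (lfn TEXT, size INTEGER)")
db.execute("INSERT INTO files VALUES ('foo.in', 42)")

request = encode_sql("SELECT lfn, size FROM files WHERE size < 100")
reply = ET.fromstring(handle_request(db, request))
values = [v.text for v in reply.iter("value")]
```

The `<` in the query is escaped by the XML library on the way out and restored on parsing, so the statement arrives at the database unchanged.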

Globus etc.
• It is our goal to define an architecture which is essentially independent of Globus but which can easily be built using components of Globus.
• M9 prototype will be mainly Globus based.
• We recognise that we need compatibility with other projects, yet we also need to be able to develop our own work.
  – Our architecture document is and will be independent but refers and will refer to other documents where appropriate.

Tools
• Publish .ps and .pdf
• Avoid tools not available on Linux
• Try Together as a UML CASE tool.
  – Encourage use of UML

Platforms and Languages
• Middleware will run on i386 and IA64 Linux and on Sparc Solaris. For the desktop, Linux and Solaris and a web interface will be supported.
• WPs 1-5 APIs will be written to support Java and C, with Python and Perl via swig.
• WPs 1-5 are encouraged to consider the benefits of the more modern languages such as Java and Python when implementing new code and to note that mixing Java and C++ is difficult!

GridServices
• ComputingElement
• StorageElement
• SQLDatabaseService
• Information and Monitoring
• ServiceIndex
• GridScheduler
• FileCopier
• ReplicaCatalog
• ReplicaManager
• SoftwareRepositoryService
• Network

• Seeking interfaces
• Simple dependencies
• Checking interactions

A job
Executable = /usr/local/atlas.sh
Requirements = TS >= 1GB
Input.LFN = http://atlas.hep/foo.in
argv1 = Input.LFN
Output.LFN = http://atlas.hep/foo.out
Output.SE = http://datastore.rl.ac.uk/
argv2 = Output.LFN

#!/bin/sh
# Copy the input file (first argument) to a local scratch file
gridcp $1 ~/tmp1
# Keep only the lines mentioning "higgs"
grep higgs ~/tmp1 > ~/tmp2
# Copy the result to the output location (second argument)
gridcp ~/tmp2 $2
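
The relation between the job description and the script's `$1` and `$2` can be sketched as follows. This is only one reading of the slide: it assumes that each `argvN` attribute names another attribute whose value becomes the N-th argument, and that the logical file name is passed through unchanged (a real scheduler might substitute a chosen replica). The functions `parse_job` and `command_line` are hypothetical.

```python
from typing import Dict, List

JOB_TEXT = """\
Executable = /usr/local/atlas.sh
Requirements = TS >= 1GB
Input.LFN = http://atlas.hep/foo.in
argv1 = Input.LFN
Output.LFN = http://atlas.hep/foo.out
Output.SE = http://datastore.rl.ac.uk/
argv2 = Output.LFN
"""


def parse_job(text: str) -> Dict[str, str]:
    """Split each 'name = value' line at the first '=' sign."""
    attrs: Dict[str, str] = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition("=")
            attrs[key.strip()] = value.strip()
    return attrs


def command_line(attrs: Dict[str, str]) -> List[str]:
    """Build [executable, arg1, arg2, ...] by following argvN references."""
    args: List[str] = []
    n = 1
    while f"argv{n}" in attrs:
        ref = attrs[f"argv{n}"]
        # Assumption: argvN names another attribute; fall back to the literal.
        args.append(attrs.get(ref, ref))
        n += 1
    return [attrs["Executable"], *args]


job = parse_job(JOB_TEXT)
cmd = command_line(job)
```

With this reading, `cmd` holds the executable followed by the input and output logical file names, which is what the shell script receives as `$1` and `$2`.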

SubmitJob
[diagram]

ComputingElement
• The interface to computing power. Its primary functionality is, given a job description, to take that job and undertake to run it.
• The Fabric Coordination Manager (FCM) interfaces to the different kinds of Local Resource Management System (LRMS) such as PBS, LSF or Condor.
• The FCM is tailored to the LRMS to provide missing functionality
[Diagram labels: WP1, WP4, CE, FCM, LRMS]

StorageElement
• The Storage Element (SE) will have three interfaces. The first will be a file access interface. This will provide the basic file access methods such as get, put and delete
• The other two interfaces are optional, but provide further functionality.

SQLDatabaseService
• Very recently defined service
• Store & retrieve meta data (SQL insert, delete, update, query)
• Build from standardized commodity components
• Use SQL in XML over https
• Can use with any local or remote RDBMS (MySQL, Oracle, DB2, ...)

Information and Monitoring
• Base on GMA from GGF
• Choose Consumer/Producer protocol
• Choose registration protocol
• 2 prototypes planned:
  – LDAP based (pull only)
  – Relational
  – Some mixture??
[Diagram labels: Consumer, Producer, Registry]
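
The three GMA roles can be sketched in a few lines. In this reading a Producer registers itself with the Registry, a Consumer looks the Producer up there and then talks to it directly, either with a single (pull) request or by subscribing to a stream. The class and method names and the `cpu.load` topic are hypothetical, and everything runs in one process rather than over a protocol.

```python
from typing import Callable, Dict, List, Optional


class Registry:
    """Directory of producers, keyed by the topic they publish."""

    def __init__(self) -> None:
        self._producers: Dict[str, "Producer"] = {}

    def register(self, producer: "Producer") -> None:
        self._producers[producer.topic] = producer

    def lookup(self, topic: str) -> Optional["Producer"]:
        return self._producers.get(topic)


class Producer:
    def __init__(self, topic: str, registry: Registry) -> None:
        self.topic = topic
        self._latest: Optional[float] = None
        self._subscribers: List[Callable[[float], None]] = []
        registry.register(self)

    def publish(self, value: float) -> None:
        self._latest = value
        for callback in self._subscribers:  # streaming (push) delivery
            callback(value)

    def query(self) -> Optional[float]:     # single (pull) request
        return self._latest

    def subscribe(self, callback: Callable[[float], None]) -> None:
        self._subscribers.append(callback)


class Consumer:
    def __init__(self, registry: Registry) -> None:
        self._registry = registry
        self.received: List[float] = []

    def pull(self, topic: str) -> Optional[float]:
        producer = self._registry.lookup(topic)
        return None if producer is None else producer.query()

    def stream(self, topic: str) -> None:
        producer = self._registry.lookup(topic)
        if producer is not None:
            producer.subscribe(self.received.append)


registry = Registry()
producer = Producer("cpu.load", registry)
consumer = Consumer(registry)
producer.publish(0.5)
pulled = consumer.pull("cpu.load")
consumer.stream("cpu.load")
producer.publish(0.7)
```

A pull-only prototype, as planned for the LDAP variant, would offer only the `query` path.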

ServiceIndex
• For the construction of a distributed “web” of services
• Soft state service registration, simple service lookup
• Query engine can crawl the web of services, to provide better query support.
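
Soft state registration means that an entry survives only while the service keeps refreshing it. A minimal sketch, assuming a fixed time-to-live and explicit timestamps (so the example is deterministic); the `ServiceIndex` class shown here and the example URLs are illustrative only.

```python
from typing import Dict, List


class ServiceIndex:
    """Hypothetical soft-state index: entries expire unless re-registered."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._expiry: Dict[str, float] = {}

    def register(self, url: str, now: float) -> None:
        # Registering again simply pushes the expiry time forward.
        self._expiry[url] = now + self._ttl

    def lookup(self, now: float) -> List[str]:
        # Drop entries whose lease has run out, then list the rest.
        self._expiry = {u: t for u, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


index = ServiceIndex(ttl_seconds=60)
index.register("https://se.example.org", now=0)
index.register("https://ce.example.org", now=0)
index.register("https://ce.example.org", now=50)  # refresh before expiry
alive = index.lookup(now=70)
```

A service that crashes needs no explicit de-registration: it stops refreshing and drops out of the index once its lease expires.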

GridScheduler
• This service offers reliable job submission, where reliable means that if the job fails for reasons which are independent of the job, it is rescheduled.
• To facilitate this, appropriate job monitoring has to be performed.
• Main task is to find the right SE and CE to use.
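
The definition of "reliable" above can be sketched by separating two kinds of failure: those caused by the job itself, which are reported back, and those independent of the job, which trigger rescheduling elsewhere. The exception names, the `submit_reliably` function and the idea of modelling a computing element as a callable are assumptions made for this example.

```python
from typing import Callable, List, Optional


class JobError(Exception):
    """Failure caused by the job itself; rescheduling would not help."""


class ResourceError(Exception):
    """Failure independent of the job; the job should be rescheduled."""


def submit_reliably(job: str,
                    computing_elements: List[Callable[[str], str]],
                    max_attempts: int = 3) -> str:
    """Try computing elements in turn until one runs the job."""
    last_error: Optional[Exception] = None
    for run_on_ce in computing_elements[:max_attempts]:
        try:
            return run_on_ce(job)
        except ResourceError as err:
            last_error = err  # not the job's fault: try the next element
    raise ResourceError(f"gave up on {job!r}") from last_error


def broken_ce(job: str) -> str:
    raise ResourceError("node down")


def healthy_ce(job: str) -> str:
    return "done:" + job


outcome = submit_reliably("job-1", [broken_ce, healthy_ce])
```

A `JobError` is deliberately not caught, so a faulty job is reported to the user instead of being retried on another element.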

FileCopier
• Transfer data securely from one physical location to another.
• Initially the file is the unit of transfer
• Consider using globus-url-copy

ReplicaCatalog
• Used by the ReplicaManager
• Maps a logical file name to one or more physical file names.
  – A replica catalog contains zero or more logical file collections
  – Each file collection contains zero or more logical files
  – Each logical file contains one or more physical files.
• Requests will be sent as XML over https.
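
The three-level containment described above (catalog, collections, logical files, physical files) maps naturally onto nested dictionaries. The sketch below is an in-memory illustration only; the class names mirror the slide, while the method names and the example collection and URLs are hypothetical. The XML-over-https request layer is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogicalFile:
    lfn: str
    physical: List[str]

    def __post_init__(self) -> None:
        # A logical file must have at least one physical file.
        if not self.physical:
            raise ValueError("a logical file needs at least one physical file")


@dataclass
class Collection:
    name: str
    files: Dict[str, LogicalFile] = field(default_factory=dict)


@dataclass
class ReplicaCatalog:
    collections: Dict[str, Collection] = field(default_factory=dict)

    def add(self, collection: str, lfn: str, pfn: str) -> None:
        coll = self.collections.setdefault(collection, Collection(collection))
        if lfn in coll.files:
            coll.files[lfn].physical.append(pfn)
        else:
            coll.files[lfn] = LogicalFile(lfn, [pfn])

    def replicas(self, collection: str, lfn: str) -> List[str]:
        coll = self.collections.get(collection)
        if coll is None or lfn not in coll.files:
            return []
        return list(coll.files[lfn].physical)


catalog = ReplicaCatalog()
catalog.add("atlas", "http://atlas.hep/foo.in", "http://datastore.rl.ac.uk/foo.in")
catalog.add("atlas", "http://atlas.hep/foo.in", "http://se.example.org/foo.in")
found = catalog.replicas("atlas", "http://atlas.hep/foo.in")
```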

ReplicaManager
• Knows about file replicas (through the ReplicaCatalog)
• Manages consistent replica creation, selection, moving and deletion.
• Normally file deletion requests go through the ReplicaManager.

SoftwareRepositoryService
• The applications need to be available within the CEs so that they can be run.
• Part of the environment may be provided by the CE.
• Need a service which is able to deal with inter-dependencies and set up the environment and the application so that it can run.
• To do this efficiently will require some kind of caching mechanism
  – Expect to make use of the data management services.
  – Caches of old programs must be eliminated promptly.
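
Dealing with inter-dependencies amounts to installing each package after everything it depends on. A minimal sketch of that one aspect, using a topological sort from the Python standard library; the package names and the `DEPENDS_ON` table are invented for the example, and caching is not modelled.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical dependency table: package -> packages it needs first.
DEPENDS_ON: Dict[str, Set[str]] = {
    "app": {"libA", "libB"},
    "libA": {"base"},
    "libB": {"base"},
    "base": set(),
}


def install_order(package: str) -> List[str]:
    """Return the packages needed for `package`, dependencies first."""
    needed: Dict[str, Set[str]] = {}
    stack = [package]
    while stack:  # collect the transitive closure of dependencies
        name = stack.pop()
        if name not in needed:
            needed[name] = DEPENDS_ON[name]
            stack.extend(DEPENDS_ON[name])
    # TopologicalSorter expects node -> predecessors, which is what we have.
    return list(TopologicalSorter(needed).static_order())


order = install_order("app")
```

The relative order of `libA` and `libB` is not fixed, but `base` always comes before both and `app` always comes last.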

Network
• Future network capabilities are expected to include the ability to ask for various Quality of Service (QoS) levels, and the ability to make advance reservations.
• These advanced network capabilities will be incorporated as they become available.

SubmitJob
[diagram]

Services @ M9
[Diagram labels: CE, SE, SQLDataBase, ReplicaManager, ReplicaCatalogue, FileCopier, GridScheduler, ServiceIndex, IMS; grouped under Resources and Collective Services]

ComputingElement @ M9
• Front End (WP1)
  – The ComputingElement will be based on Globus GRAM. The associated information service to which we publish for this release will be Globus GRIS.
  – Additions and modifications to the type of data published to the GRIS will be carried out.
• Back End (WP4)
  – Interim installation system built on existing tools selected during a tools survey.
    • Provides the necessary functionality for installing and configuring computing nodes and for installing/updating system software packages on them.

StorageElement @ M9
• Provide a first release of the API that will work with Castor at CERN.

SQLDatabaseService @ M9
• The core functionality: SQL insert, delete, update and query will be available.
• Functionality available from a command line tool, a web browser and via an API
• Testing will be done only on MySQL and Oracle.

Information and Monitoring @ M9
• Relational
  – We plan to have a rudimentary system based on the Relational Model by Month 9
  – Producer and Consumer API and basic code library supporting both single and streaming data requests.
  – The directory service (to register producers) will initially be based on SQL or the ServiceIndex.
  – Hope to be far enough along to consider whether to continue or abandon the Relational approach
• LDAP
  – Producer and Consumer API and basic code library supporting only single data requests.
  – The directory service (to register producers) will initially be based on LDAP or the ServiceIndex.
• Presentation
  – Very rudimentary presentation of monitoring data.

ServiceIndex @ M9
• All functionality will be provided in a prototype.

GridScheduler @ M9
• The first release of the Grid Scheduler addresses the submission and monitoring of batch jobs.
• A command-line user interface will allow a user to:
  – submit a job to the Grid Scheduler;
  – monitor its execution;
  – remove it.
• The description of a job (application, input and output data, other requirements) will be expressed in a Job Description Language (JDL).
  – The first version of the JDL will be based on the Condor ClassAds.
• The first prototype will use the Condor matchmaking library.
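
The idea behind matchmaking can be sketched without the Condor library: a job states requirements and a rank, each resource advertises attributes, and the scheduler picks the highest-ranked resource that satisfies the requirements. This is a deliberately simplified, one-sided illustration (real ClassAd matching lets both sides state requirements), and the attribute names `free_cpus` and `temp_space_gb` and the `match` function are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Resource = Dict[str, Any]


def match(requirements: Callable[[Resource], bool],
          rank: Callable[[Resource], float],
          resources: List[Resource]) -> Optional[str]:
    """Return the name of the best resource meeting the requirements."""
    candidates = [r for r in resources if requirements(r)]
    if not candidates:
        return None
    return max(candidates, key=rank)["name"]


resources = [
    {"name": "ce1", "free_cpus": 2, "temp_space_gb": 0.5},
    {"name": "ce2", "free_cpus": 1, "temp_space_gb": 2.0},
    {"name": "ce3", "free_cpus": 8, "temp_space_gb": 4.0},
]

chosen = match(
    requirements=lambda r: r["temp_space_gb"] >= 1,  # job needs 1 GB scratch
    rank=lambda r: r["free_cpus"],                   # prefer idle machines
    resources=resources,
)
```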

FileCopier
• All functionality will be provided.

ReplicaCatalog
• By Month 9, the Globus LDAP based implementation may be preferable. Later, an RDBMS model may be used. All functionality described in the Architecture document will be provided.

ReplicaManager
• By month nine, user directed replica creation and deletion will be available.

Areas poorly covered
• Implementation of managed interface of the SE
• Security, especially authorisation
  – But now we have Dave Kelsey
• Accounting
  – WP1?
• SoftwareRepositoryService

Worries
• The nature of replication
  – Is it just a file copy?
  – Can we do anything for R/W databases?
• “GridMap” file
  – Scalable mechanism to do authorisation worldwide, taking into account complex policies imposed primarily by:
    • Countries
    • Experiments
  – Mechanism to relate a file to its owner…
• Can we develop our prototypes fast enough to convince the world of our sanity?

Final remarks
• Version 1 of the documents is available
• They only represent a snapshot of a partial design
• As promised in the proposal, there is a lot of innovation planned
• Need to prototype
• Tension between providing a usable M9 deliverable and looking to the goal.
• ATF will now work on Version 2…