e-Science & the GRID Where we are; Where we’re going; and What it means to IT Support
Matthew J. Dovey
Technical Manager
Oxford e-Science Centre
OUCS
What is e-Science?
e-Science GRID GRID
In the GRID below you will find four common phrases associated with the “grid”.
First Prize: £3.5M!
From “OUCS Nudes”, 1 April 2003
E A G E Q M N K
H X H X T P O K
Y Q C C P P W L
E X C I T I N G
G Z G T T F F K
J N O I D I A U
C Y D N Z Z N Z
H Z H G L F R G
Z Z Z Z Z Z Z Z
W H A T I S I T
What really is e-Science?
“In the future, e-Science will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet. Typically, a feature of such collaborative scientific enterprises is that they will require access to very large data collections, very large scale computing resources and high performance visualisation back to the individual user scientists.”
http://www.research-councils.ac.uk/escience/
What is e-Science, now?
– “e-Science means science increasingly done through distributed global collaborations enabled by the Internet, using very large data collections, terascale computing resources and high performance visualisation” (John Taylor, Director General of the Research Councils, OST)
– “e-Science will change the dynamic of the way science is undertaken”
Errm, and that means?
• e-Science is basically “Science”
  – (and the deplorable trend of putting “e-” in front of perfectly good English words)
• The “e-” refers to using electronic communications, the internet etc. for enabling collaborative and distributed research.
• Oh, and it’s “big” (big data, big processing)!
So e-Science is e-mail?
• Successful science using electronic means or internet is “e-Science”
• So a group of physicists who would not normally work together, communicating by e-mail is a minimum metric.
• But the e-Science vision is far more…
The vision!!!
Virtual Laboratories
Storage Devices
Search Engines
CPU Clusters
Remote Devices
Dangerous remote facilities
Output devices
Databases
e-Journal Archives
Knowledge repositories
People
Web Pages/Documents
User
The Vision (2)
Snippets and code sharing
Community Support
What is this GRID, then?
• “The Grid is a software infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources.” (The Grid, eds. Foster & Kesselman)
• “The Grid is an emergent infrastructure capable of delivering dependable, pervasive and uniform access to a set of globally distributed, dynamic and heterogeneous resources. It brings challenges of scalability, interoperability, fault tolerance, resource management and security.” (UK e-Science Director)
Errr, and that means?
• The GRID is to Computing Resources as the Web is to Documents
• The GRID is a vision of a collection of resources (people, processors, storage, scientific devices, information, knowledge, etc.) and the mechanisms to access them online and remotely
• “GRID” is often used to refer to the vision, the system and also the enabling software
What is a GRID?
• GRID is an enabling technology for online collaboration and discovery and access to remote resources
• GRID is middleware
• A number of GRID middleware technologies:
  – Globus
  – Condor
  – SRB
  – Sun GRID Engine
  – etc.
GRIDs Beyond e-Science
• Apart from the depressing use of “e-”
  – e-social-science
  – e-humanities
  – e-arts
  – e-theology (deus ex machina?)
• GRID is an enabling technology for collaborative research
• GRID is an enabling technology for resource discovery, access, use
• GRID is an enabling technology for ubiquitous computing
• GRID has applications outside of just the science arena
A Brief History of GRIDs
• Ian Foster “The Grid: Blueprint for a New Computing Infrastructure” published 1998
• Series of GRID (or GRID-like) Technologies
  – Globus
  – Condor
  – JINI/JavaSpaces
  – SRB
  – AccessGRID
• No commonality; No common infrastructure; No common vocabulary (some weren’t aware of “GRID”); No common interfaces; No intercommunications
Enter the Open GRID
• To date different GRID technologies use different tools, APIs etc.
• GRIDs are not interoperable!
• Global GRID Forum (cf W3C)
  – Open GRID Service Architecture
  – Open GRID Service Infrastructure
  – Based on WebServices
  – Adds O-O capabilities and foundation service (port)types
  – Globus 3 will be reference implementation
OGSI
• Open GRID Service Infrastructure
• Web Service based
• Common vocabulary/concepts
• Base “Object Oriented” types for GRIDServices
OGSA
• Open GRID Service Architecture
• Builds on OGSI
• Defines APIs/GRID Services for
  – Scheduling
  – Management/Monitoring
  – Database Access
  – Etc.
Middleware Components
Current Implementations
• All betas and previews!
• Globus (Java and C)
• GridLite (Java)
• .Net (Two: Virginia and Edinburgh)
• Even Perl
E-Science and GRIDs in Oxford
• Oxford e-Science Centre based within OUCS and Comlab
  – OUCS – Developing the infrastructure for e-Science in Oxford
  – Comlab – Applications/Software Engineering
• Based within RTS
  – Develop the use of e-Science as a research technology
  – Develop the use of GRID as an enabling technology for research
OeSC Systems & Services
• Certificate Verification (http://ca.grid-support.ac.uk)
• Gatekeepers (Redhat Linux)
• Sun GRIDEngine
• Condor cluster
• Link to OSC
• Bidding for a large JISC compute cluster
Oxford firewall
GRID
IBM
TOSCA
OSC
Condor (Head node)
Condor
Dedicated 1GB connection
Gatekeeper
Software (dynamic) firewall
OeSC Technical Expertise
• Globus
• X509
• MetaDirectory Services
• Authentication
• Firewalls
• WebServices & .Net
• Java & Java COG
• Sun GRIDEngine
• Portals
• Service/resource metadata and discovery
• AccessGRID – real time communications
Current OeSC Projects
• VideoWorks
• Remote Microscopy
• GRID Workload Management
• Climate Prediction
• GeoVis/GeoDise
• DAME
• Reality GRID
• e-Diamond
• Biomolecular simulations
• Structural Biology
• Security for the EU DataGrid
• DCOCE
OeSC Within OUCS/RTS
• Authentication Developments
  – DCOCE
• Portal Developments
  – Need a portal to bring together the virtual project teams
  – Based on uPortal (UK expertise in developing GRID/Certificate based services for uPortal)
• Collaborative working
  – Access GRID (node at Comlab; nodes planned at Churchill and Begbroke; have desktop version running)
• Virtual Teams
  – Management and communications across distributed teams (OII)
• Resource Discovery
  – GRID as a new tool
  – GGF GIR Working Group (OULS/SERS)
• Support for OpenSource and Open Standards
  – GTR, GGF, OASIS, etc.
Futures – Interdisciplinary Hub
• e-Science Hub physical location being created in the University
  – Nerve centre for e-Science
    • ‘Independent’, not a separate department…
    • Located in close proximity to Computing Laboratory
    • Place to build teams, receive visitors, give lectures, overflow
    • Hub for nurturing new collaboratories
  – Form new interdisciplinary centre for the University
• Two dedicated academic staff:
  – Dual role: core staff for Hub, complement expertise of e-Science team
    • Professor – database expert; established through Computing Laboratory
    • UL – Web services; Software Engineering
  – Aim for external funding (Development Office); strong support from Comlab
• Externally funded joint staff
  – EG NERC fellowship; department/OeSC shared post
  – Joint with CCLRC?
OeSC in the UK GRID and beyond
• Work on the “Level 2 GRID”
  – Lead for GRID Security within the ETF
    • Organised workshop on firewalls
    • Proposed various solutions: host database, dynamic firewall, VPN
    • Developed prototype dynamic firewall script
    • Developing a prototype trusted host database system
  – Helped develop integration test scripts
• Globus 3 and the OGSA GRID
  – Town Meeting on requirements and plans for the OGSA GRID
  – Working on Web Service definitions for upcoming projects
  – Working on Web Service to GRID Service migration paths
    • Perceived to be easier than GT2 to GT3 migration
    • Working with the OASIS WSDM TC
  – Working with RAL and Manchester on UDDI registries for GRID
    • Working with OASIS UDDI TC on GRID Requirements
UK National GRIDs
                                                   GT2          OGSA
Level 1: Enthusiasts Prototyping                   2002         Current Phase
Level 2: Functional with Some Real Applications    April 2003   ??
Level 3: Useable GRID                              ??           ??
…                                                  …            …
Impact on IT Support – Hand Holding
– Getting and using certificates
  • Getting the certificates isn’t easy (but improving)
  • Getting the certificates into tools isn’t easy (but improving)
  • Certificates don’t roam easily
– Client certificate authentication
  • Easier on some platforms than others
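As a rough illustration of what "getting the certificates into tools" involves, here is a minimal Python sketch of a TLS client context that presents a client certificate. It uses only the standard `ssl` module; the function name `make_client_context` and the file paths in the usage note are assumptions made for the example, not part of any GRID toolkit.

```python
import ssl


def make_client_context(certfile=None, keyfile=None, cafile=None):
    """Build a TLS client context (illustrative helper, not a GRID API).

    The server is always verified. If certfile is given, the PEM
    certificate and its private key are loaded so the client can
    authenticate itself with a certificate as well.
    """
    # Default settings verify the server certificate and hostname.
    context = ssl.create_default_context(cafile=cafile)
    if certfile is not None:
        # keyfile may be None when the key is stored inside certfile.
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context
```

A user would call something like `make_client_context("usercert.pem", "userkey.pem")` with their own files (hypothetical names here). Every tool needs the same pair of files, which is one reason certificates do not roam easily.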
Impact on IT – Network Bandwidth
• E-Science potentially involves large amounts of data
• Both data and jobs can move, so some optimisation is possible
• Many projects think their data is “big” but actual bandwidth used is not too bad (e.g. 6GB per day)
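To put the 6GB per day figure in perspective, the short calculation below converts a daily volume into an average line rate. It is a sketch that assumes decimal gigabytes (10^9 bytes) and traffic spread evenly over 24 hours; real transfers are bursty, so peak rates will be higher.

```python
def average_mbit_per_s(gigabytes_per_day):
    """Average rate in Mbit/s for a given daily volume (decimal GB assumed)."""
    bits_per_day = gigabytes_per_day * 1e9 * 8
    seconds_per_day = 24 * 60 * 60
    return bits_per_day / seconds_per_day / 1e6


rate = average_mbit_per_s(6)
print(f"{rate:.2f} Mbit/s")  # prints "0.56 Mbit/s"
```

Averaged over a day, 6GB is well under 1 Mbit/s, which is why such projects are "not too bad" for the network.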
Impact on IT Support – Locating and Accessing Resources
• Currently a programmer’s field
  – Some very nice technologies in WebServices (Visual Studio .Net, WebServices for Office, JBuilder, Eclipse etc.)
  – GRIDService technologies very immature
• The Future
  – Dedicated workbenches
  – Drag and Drop interfaces
  – Portals (see portal workshop mid July – http://www.nesc.ac.uk)
Impact on IT Support – Security
• Globus 2 uses port mapping technologies
  – OeSC within ETF looking at potential solutions using VPNs, dynamic application aware firewalls etc.
  – GRID/Web Service over https will hopefully be a solution
• Some GRID infrastructures use Peer 2 Peer technologies
• Collaborative Virtual Organisations are based on realtime chat, file sharing, messenger applications etc.
• Transmission of jobs – mobile code
  – Potential use of JVM style sandbox technologies
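One way to picture a dynamic firewall for this situation is a small script that opens a port range only for known hosts. The Python sketch below is purely illustrative and is not the OeSC prototype: the trusted host set, the port range and the function name `rules_for` are all assumptions, the addresses come from the reserved documentation ranges, and the code only builds `iptables` command strings without running them.

```python
import ipaddress

# Hypothetical trusted host list (addresses from the documentation range).
TRUSTED_HOSTS = {"192.0.2.10", "192.0.2.11"}


def rules_for(source_ip, port_range=(40000, 40100)):
    """Return iptables commands opening port_range for a trusted host.

    Unknown hosts get an empty list, i.e. no ports are opened for them.
    The default port range is an assumption chosen for the example.
    """
    ipaddress.ip_address(source_ip)  # raises ValueError if malformed
    if source_ip not in TRUSTED_HOSTS:
        return []
    low, high = port_range
    return [
        f"iptables -A INPUT -p tcp -s {source_ip} "
        f"--dport {low}:{high} -j ACCEPT"
    ]
```

For example, `rules_for("192.0.2.10")` yields one ACCEPT rule for ports 40000 to 40100, while `rules_for("198.51.100.7")` yields nothing. A real deployment would also need to remove rules when a job finishes.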
Impact on IT Support – AccessGRID
• Large scale multicast video-conferencing
  – AccessGRID 1.x
    • Little flakey
  – AccessGRID 2.x
    • More robust – but certificate based
• Impacts
  – AccessGRID rooms need planning (location of screens, projectors, microphones)
  – Network bandwidth (support of multicast)
  – Personal AccessGRID nodes will need hand-holding
Assistance to IT Support – GRID Monitoring & Management
• We have GITS (GRID Integration Test Scripts) for testing point to point interaction of GRID Services
• Planned UK Open Middleware Infrastructure Institute will provide interoperable testbeds
• GRID Monitoring systems for monitoring performance, network bandwidth etc.
Futures of Monitoring and Management – OASIS WSDM
• New OASIS TC (started up last month)
• Based on previous floundered OASIS TC
• Defining management of distributed resources USING Web services
• Defining management OF Web services operations and WSDL.
• Collaborate with W3C, GGF, DMTF, OASIS
• Chairs: Heather Kreger (IBM, Chair of WSA MTF) & Winston Bumpus (Novell, DMTF President)
Web Service Architecture – MTF (http://www.w3c.org)
• Defining the manageability characteristics of the architectural elements of the Web Services architecture, i.e.:
  – Identification - data that uniquely identifies the element
  – Status - information about operational state of an element (up: busy/idle; down: stopped/saturated/crashed)
  – Configuration - a collection of behavioural properties which may be changed (persistent over instances)
  – Metrics - raw atomic, unambiguous information for management purposes e.g. response times
  – Operations - methods that control or help manage the entity (instance specific)
  – Events - changes in the state of the entity e.g. a lifecycle state change, or a state change.
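The six characteristics listed above can be read as the fields of a simple data model. The Python sketch below is one possible reading and not a W3C or OASIS schema: the class and attribute names are invented for the example, and the status values are taken from the up/down states on the slide.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """Operational states named on the slide."""
    BUSY = "up: busy"
    IDLE = "up: idle"
    STOPPED = "down: stopped"
    SATURATED = "down: saturated"
    CRASHED = "down: crashed"


@dataclass
class ManagedElement:
    """Illustrative model of one manageable element (hypothetical names)."""
    identification: str                                 # uniquely identifies the element
    status: Status = Status.IDLE                        # current operational state
    configuration: dict = field(default_factory=dict)   # changeable properties
    metrics: dict = field(default_factory=dict)         # e.g. response times
    events: list = field(default_factory=list)          # recorded state changes

    def is_up(self):
        return self.status in (Status.BUSY, Status.IDLE)

    def set_status(self, new_status):
        """An operation: change state and record the change as an event."""
        if new_status != self.status:
            self.events.append((self.status, new_status))
            self.status = new_status
```

Calling `set_status(Status.CRASHED)` on a fresh element records one event and makes `is_up()` return `False`, which shows how Operations, Status and Events relate to each other.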
DMTF (http://www.dmtf.org)
Models real world managed objects (WBEM, CIM). Large existing model (not in web/grid format/granularity)
• Application Working Group
  – Intends to model management of web services
• Interoperability Working Group
  – Defining a CIM/SOAP protocol in WSDL: CIM/Ops as WSDL operations and xmlCIM as the body of SOAP messages over HTTP
GRID
• OGSA Working Group on GRID Management and Monitoring
  – Object Oriented Inheritance
  – GRIDServices will be self describing
  – GRIDServices will be self managed
Web Services stack
Management
Security/Trust/Privacy
Quality of Service
Manageability portTypes - OASIS WSDM, GGF CMM, DMTF
Management requirements for a Manageable Web Services Architecture - W3C WS Arch WG, Management TF
Web service based access to management data - OASIS WSDM, GGF CMM
Manageability of Web Services - OASIS WSDM
Questions & Discussion