Using Grid Computing David Groep, NIKHEF 2002-07-15.
Using Grid Computing
David Groep, NIKHEF, 2002-07-15
Physics @ CERN
• LHC particle accelerator
• operational in 2007
• 5-10 Petabyte per year
• 150 countries
• > 10,000 users
• lifetime ~ 20 years
Online data reduction (trigger levels):
• 40 MHz (40 TB/sec) → level 1 - special hardware
• 75 kHz (75 GB/sec) → level 2 - embedded
• 5 kHz (5 GB/sec) → level 3 - PCs
• 100 Hz (100 MB/sec) → data recording & offline analysis
The Grid, But Why?
CPU & Data Requirements
[Chart: Estimated CPU capacity required at CERN, 1998-2010, in kSI95 (scale 0 to 5,000); Jan 2000: 3.5k SI95; separate curves for LHC experiments and other experiments. Moore's law shown as some measure of the capacity technology advances provide for a constant number of processors or investment.]
< 50% of the main analysis capacity will be at CERN
More Reasons Why
ENVISAT
• 3500 MEuro programme cost
• 10 instruments on board
• 200 Mbps data rate to ground
• 400 TBytes data archived/year
• ~100 `standard' products
• 10+ dedicated facilities in Europe
• ~700 approved science user projects
And More …
Bio-informatics
• For access to data
– Large network bandwidth to access computing centres
– Support for data bank replicas (easier and faster mirroring)
– Distributed data banks
• For interpretation of data
– GRID-enabled algorithms: BLAST on distributed data banks, distributed data mining
Common Ground
• Large amounts of data
• Distributed, ad-hoc user community
• Problems are distributable
• Need for resources grows faster than the market
• Network grows faster than the application needs
• Willingness to share resources …
• … if security and integrity are guaranteed
The One-Liner
• Resource sharing and coordinated problem solving in dynamic multi-institutional virtual organisations
What is Grid computing?
• Dependable, consistent and pervasive access
• Combining resources from various organizations
• `Virtual Organizations' – a user-based view on the Grid
• Technical challenges:
– transparent decisions for the user
– uniformity in access methods
– secure & crack-resistant
– authentication, authorization, accounting (AAA)
Grid Middleware
• Globus Project started 1997
• de facto standard
• Reference implementation of Grid Forum standards
• Large community effort
• Basis of several projects, including EU-DataGrid
• Toolkit `bag-of-services' approach
• Successful test beds, with single sign-on, etc. …
In The Beginning
• Distributed Computing – synchronous processing
• High-Throughput Computing – asynchronous processing
• On-Demand Computing – dynamic resources
• Data-Intensive Computing – databases
• Collaborative Computing – science
Ian Foster and Carl Kesselman, editors, “The Grid: Blueprint for a New Computing Infrastructure,” Morgan Kaufmann, 1999
Grid Architecture
Layered view (bottom to top):
• Grid Fabric: Condor, MPI, PBS, Internet, Linux, SUN
• Grid Security Infrastructure (GSI)
• Grid Services: GRAM, GridFTP, MDS, ReplicaSrv
• Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G
• Applications
Make all resources talk standard protocols.
Promote interoperability of application toolkits, similar to the interoperability of networks achieved by Internet standards.
OGSA: new directions
• Looks superficially like `web services'
• Based on common standards:
– WSDL
– SOAP
– UDDI
• Adds:
– Transient services
– State of distributed activities
– Workflow, videoconferencing, distributed data analysis
• Management of service instances
• Grid Security Infrastructure
Looking for Resources
• Resource Brokerage based on matchmaking (Condor)
• Information Services Mesh
– Meta-computing directory
– Replica Catalogues
DataGrid http://marianne.in2p3.fr/
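As a rough illustration of matchmaking-style brokerage, a broker pairs a job's stated requirements with the properties each resource advertises. The sketch below is a toy in Python; the attribute names and site names are invented for illustration and do not follow real Condor ClassAd syntax.

```python
# Toy matchmaking: a job advertises requirements as predicates,
# resources advertise their properties, the broker pairs them.

def matches(job_requirements, resource):
    """Return True if the resource satisfies every job requirement."""
    return all(check(resource.get(attr))
               for attr, check in job_requirements.items())

# Hypothetical job requirements (not real ClassAd expressions).
job = {
    "arch":   lambda v: v == "i686",
    "os":     lambda v: v == "linux",
    "memory": lambda v: v is not None and v >= 512,  # MB
}

# Hypothetical resource advertisements.
resources = [
    {"name": "ce.nikhef.nl", "arch": "i686",  "os": "linux",   "memory": 1024},
    {"name": "ce.cern.ch",   "arch": "sparc", "os": "solaris", "memory": 2048},
]

candidates = [r["name"] for r in resources if matches(job, r)]
print(candidates)  # only the Linux/i686 site qualifies
```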
Submitting a Job
Locating a Replica
• Grid Data Mirror Package (GDMP)
• Moves data across sites
• Replicates both files and individual objects
• Catalogue used by the Broker
• Replica Location Service (Giggle)
• Read-only copies "owned" by the Replica Manager
http://cmsdoc.cern.ch/cms/grid
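The idea behind a replica catalogue can be sketched in a few lines: a logical file name maps to several physical copies, and the broker picks a convenient one, for example a replica in the same domain as the compute site. The catalogue contents and the selection rule below are invented for illustration.

```python
# Toy replica catalogue: one logical file name (LFN) maps to
# several physical replicas at different storage elements.
replica_catalogue = {
    "lfn:/cms/run42/events.dat": [
        "gsiftp://se.cern.ch/data/run42/events.dat",
        "gsiftp://se.nikhef.nl/data/run42/events.dat",
    ],
}

def locate(lfn, preferred_domain):
    """Return a replica in the preferred domain, else the first replica."""
    replicas = replica_catalogue[lfn]
    for url in replicas:
        if preferred_domain in url:
            return url
    return replicas[0]

print(locate("lfn:/cms/run42/events.dat", "nikhef.nl"))
```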
Sending Your Data
• Tape robots, disks, etc. share the GridFTP interface
• Supports single sign-on and confidentiality
• Optimized for high-speed (>1 Gbit/s) networks
• In the future: automatic optimizations, bandwidth reservations, directory-enabled networking, …
Grid-enabled Databases?
• Spitfire: uniform access to persistent storage on the Grid
• Multiple-role support
• Compatible with GSI (single sign-on) through CoG
• Uses standard technologies: JDBC, SOAP, XML
• Supports various back-end databases
http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/
DataGrid Test Bed 1
• DataGrid TB1:
– 14 countries
– 21 major sites
– Growing rapidly
• Submitting jobs:
– Log in only once, run everywhere
– Cross administrative boundaries in a secure and trusted way
– Mutual authorization
DutchGrid Platform
[Map: DutchGrid sites - Amsterdam, Utrecht/KNMI, Delft, Leiden, Nijmegen, Enschede]
• DutchGrid:
– Test bed coordination
– PKI security
• Participation by:
– NIKHEF: FOM, VU, UvA, Utrecht, Nijmegen
– KNMI, SARA
– AMOLF
– DAS-2 (ASCI): TU Delft, Leiden, VU, UvA, Utrecht
– Telematics Institute
And now for some Technical Details
For Users
Start using the grid
• All the necessary "client tools" are on all Linux and Solaris systems
• You just need:
– Credentials/tokens for the Grid (see next slides)
– Authorization to use resources (you get all NIKHEF resources by default)
– Information on which resources to use effectively
Your Grid Credentials
• You will use resources across several domains
– You may not care about security and authorization
– But the remote site admin will!
• All communications are authenticated using X.509 "Public Key" Certificates
• The same technology used to secure credit card transactions on the web (https://…)
• Uniquely binds name/affiliation to a digital token
Certification Authorities
• CAs act as trusted third parties
• Remote sites trust the CA for a proper binding
• They will not do authentication again, so only authorization is left
• CAs are highly valuable: crack one and you can impersonate others on the Grid (and abuse resources)
• Registration Authorities do in-person ID checks
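The trust model on these slides can be reduced to a toy check: a site does not re-authenticate the user itself, it only verifies that the certificate was issued by a CA it trusts and takes the name-to-key binding on the CA's word. The sketch below shows only that "trusted third party" step; real X.509 verification also checks the CA's signature, validity dates, and revocation. The certificate fields and CA names are invented for illustration.

```python
# Toy trust check: authentication is delegated to the CA, so the
# site only has to decide whether it trusts the certificate's issuer.
trusted_cas = {"DutchGrid CA", "CERN CA"}

def site_accepts(certificate):
    """Accept a certificate if (and only if) its issuer is a trusted CA."""
    return certificate["issuer"] in trusted_cas

cert = {"subject": "/O=dutchgrid/O=users/CN=Some User",
        "issuer": "DutchGrid CA"}
print(site_accepts(cert))  # True: only authorization is left to decide
```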
CAs in DataGrid
• 10 national CAs (one per EU country)
• Each one has a detailed policy and practice statement
• NIKHEF operates the CA for DutchGrid; see http://www.dutchgrid.nl/ca
• Get a "certificate" from the DutchGrid CA before you can start using the Grid
• It's valuable, so protect it with a pass phrase
• One certificate is valid for all DataGrid sites
The Proxy
• A `proxy certificate' is a limited-lifetime delegation without a pass phrase to protect it
• Implements single sign-on for the Grid
• Valid for 12 hours (by default)
• Use it to:
– Run your jobs
– Get access to your data
• Get one by running grid-proxy-init
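What the proxy buys you can be sketched in a few lines: a short-lived credential (12 hours by default for grid-proxy-init) that works without re-typing your pass phrase, and that simply stops being accepted once it expires. The functions below are invented for illustration; a real proxy is an X.509 certificate signed by your long-term key.

```python
from datetime import datetime, timedelta

def make_proxy(issued_at, hours=12):
    """Model a proxy credential with a limited lifetime (12 h default)."""
    return {"not_after": issued_at + timedelta(hours=hours)}

def proxy_valid(proxy, now):
    """A proxy is usable only before its expiry time."""
    return now < proxy["not_after"]

t0 = datetime(2002, 7, 15, 9, 0)
proxy = make_proxy(t0)  # like running grid-proxy-init at 09:00
print(proxy_valid(proxy, t0 + timedelta(hours=11)))  # True: still valid
print(proxy_valid(proxy, t0 + timedelta(hours=13)))  # False: expired
```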
Now see for yourself
Getting a Certificate
• Initialize your environment for the Grid
• Use the Globus local guide from http://www.dutchgrid.nl/Support/
• Send the result to [email protected]; you will be contacted by phone
• Put the certificate (sent by mail) in $HOME/.globus/usercert.pem
• Or use the web at http://certificate.nikhef.nl/userhelp.html
Using the Grid
• Request authorization: [email protected]
• Look at what is out there using grid-info-search or http://marianne.in2p3.fr/datagrid/giis/giis-browse.html
• Try some local hosts: bilbo, kilogram, triangel
kilogram:davidg:1009$ globus-job-run dommel.wins.uva.nl /usr/ucb/quota -v
Disk quotas for random (uid 12xxx):
Filesystem usage quota limit timeleft files quota limit timeleft
/home/random 13067 1500000 2000000 0 0 0
kilogram:davidg:1010$
• Start running your analysis/MC/other jobs
GridFTP
• Universal high-performance file transfer
• Extends the FTP protocol with:
– Single sign-on (GSI, GSSAPI, RFC 2228)
– Parallel streams for speed-up
– Striped access (FTP from multiple sites simultaneously, for more speed)
• Clients: gsincftp, globus-url-copy
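The parallel-streams idea above can be sketched generically: split the file into byte ranges and fetch each range on its own TCP stream. The partitioning below is a plain illustration; GridFTP's actual extended block mode is more elaborate than this.

```python
def byte_ranges(file_size, streams):
    """Split [0, file_size) into `streams` contiguous (start, end) ranges,
    one range per parallel transfer stream."""
    base, extra = divmod(file_size, streams)
    ranges, start = [], 0
    for i in range(streams):
        # The first `extra` streams each take one extra byte.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges

print(byte_ranges(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```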
What’s Next?
• Some of the nice user features to come:
– Finding data files by characteristics (give me all golden decays)
– Moving your job to where the data is
– Automatic partitioning of jobs
– Support for truly interactive work
– Better network utilisation (faster access to data)
– …
• If you are in the DataGrid project, ask your WP leader for authorization in TB1