Adrian Jackson, Stephen Booth
EPCC
s.booth@epcc.ed.ac.uk
+44 131 650 5746
Resource Usage Monitoring and Accounting
GridSafe AHM 2009
Introduction
• Resource usage accounting has long been standard practice on high-end compute resources.
• Historically less common on smaller systems where it was easier to apportion costs locally.
  – This is becoming less viable.
    – FEC costing
    – Grid computing (users no longer local)
    – Virtualisation
GridSAFE
• JISC-funded project to build a general-purpose accounting/monitoring solution.
  – http://gridsafe.forge.nesc.ac.uk/
  – Builds on accounting subsystem from SAFE user administration system used by HPCx/HECToR
• Challenges:
  – Need to work with wide variety of different local policies.
  – Need to work with both grids and local HPC resources.
• One solution won’t fit all potential users
  – Build kit of parts
  – Pre-built solutions for common deployment scenarios.
• Key aims
  – Modular design, individual functions can be deployed independently
  – Behaviour can be customised using plug-ins to implement different service policies.
End Users
• End users are interested in accounting for their own use.
  – Compare the efficiency of different systems
  – Compare the cost effectiveness of different systems.
  – Check resources available
• Often interested in individual jobs as well as overall totals.
Resource Providers
• Need to gather the raw accounting data.
  – Format depends on the underlying technology.
• Need to apply local policies
  – Charges
  – Discounts
  – Where to charge
• Usage data may be useful for purposes other than accounting.
  – Analysing queue wait times.
  – Job size profiles.
  – May want to keep some of this data private.
Research groups/Virtual organisations
• Research groups/VOs need to manage their resources across all available platforms.
  – Ideally have all information available in a single place.
• Where all resources reside within a single grid this can be provided by grid-level accounting.
• Resources may come from multiple grids or independent resource providers.
Grid-SAFE core
• Java code with data stored in MySQL database.
  – Normally run within a Tomcat container.
• UsageRecords are treated as a collection of properties
• Highly customisable
  – Code does not mandate a single format
  – Can choose which of the available properties to store in database.
  – Can add new properties for site local concepts
  – Easily extendable to new types of data
    – Storage accounting
    – Allocation tracking
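The idea of a usage record as an open collection of properties can be sketched as follows. This is an illustrative Python sketch only, not the Grid-SAFE Java code: the class `UsageRecord`, its methods, the property names and the `STORED_PROPERTIES` list are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass
class UsageRecord:
    """A usage record held as an open collection of named properties (hypothetical class)."""
    properties: Dict[str, Any] = field(default_factory=dict)

    def put(self, name: str, value: Any) -> None:
        # No fixed schema: any property name is accepted.
        self.properties[name] = value

    def get(self, name: str, default: Any = None) -> Any:
        return self.properties.get(name, default)

    def project(self, stored: Iterable[str]) -> Dict[str, Any]:
        """Return only the properties the site has chosen to store."""
        return {k: self.properties[k] for k in stored if k in self.properties}


# Hypothetical site configuration: the subset of properties written to the database.
STORED_PROPERTIES = ["user", "machine", "wall_seconds", "cost_centre"]

record = UsageRecord()
record.put("user", "alice")
record.put("machine", "hpc1")
record.put("wall_seconds", 3600)
record.put("queue", "short")          # available, but not selected for storage
record.put("cost_centre", "physics")  # a site-local concept added by the site

row = record.project(STORED_PROPERTIES)
print(row)
```

Because the record is just a mapping, a new data type such as storage accounting would only need its own property names and its own stored-property list.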
Accounting code
• Plug-in parser modules handle different types of input data.
  – OGF-UR
  – SGE
  – PBS
  – EGEE JobManager
  – Etc.
• Plug-in policy modules augment these allowing site local customisation
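One way to picture the split between parser plug-ins and policy plug-ins is a registry of parsers followed by a chain of policy functions. The sketch below is an assumption-laden illustration in Python, not the Grid-SAFE plug-in API: the `keyvalue` input format is invented for the example (it is not a real SGE or PBS log layout), and the charge rate and discount threshold are arbitrary.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Parser = Callable[[str], Record]
Policy = Callable[[Record], Record]

PARSERS: Dict[str, Parser] = {}  # format name -> parser plug-in


def parser(name: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser plug-in under a format name."""
    def register(func: Parser) -> Parser:
        PARSERS[name] = func
        return func
    return register


@parser("keyvalue")  # hypothetical format used only for this example
def parse_keyvalue(line: str) -> Record:
    fields = dict(item.split("=", 1) for item in line.split())
    return {
        "user": fields["user"],
        "cpus": int(fields["cpus"]),
        "wall_seconds": int(fields["wall"]),
    }


def charge_policy(record: Record) -> Record:
    """Site-local charge: 0.5 units per CPU-hour (arbitrary example rate)."""
    cpu_hours = record["cpus"] * record["wall_seconds"] / 3600
    return {**record, "charge": cpu_hours * 0.5}


def discount_policy(record: Record) -> Record:
    """Site-local discount: 20% off jobs using 64 or more CPUs (arbitrary example)."""
    if record["cpus"] >= 64:
        return {**record, "charge": record["charge"] * 0.8}
    return record


def ingest(fmt: str, line: str, policies: List[Policy]) -> Record:
    """Parse one input line with the named parser, then apply each policy in order."""
    record = PARSERS[fmt](line)
    for policy in policies:
        record = policy(record)
    return record


result = ingest("keyvalue", "user=alice cpus=64 wall=7200",
                [charge_policy, discount_policy])
print(result)
```

Supporting another input format would mean registering one more parser; changing site policy would mean passing a different list of policy functions, with the parsers left untouched.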
Reporting Portal
• Grid-SAFE uses XML templates to define reports
  – Can generate unified reports over multiple data tables containing different types of data
  – Tables/charts
  – Parameterised reports (e.g. to select user or project).
• Support reports in multiple formats
  – PDF, HTML, CSV
• Performance of report generation a particular issue
  – Utilise database effectively.
  – Use aggregate tables for high throughput systems.
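The aggregate-table idea can be illustrated with a small sketch: per-job rows are summarised once into a daily table, and parameterised reports then read the much smaller summary. This is a hedged example only. It uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table names, column names and sample data are assumptions, not the Grid-SAFE schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job_usage (          -- one row per job (hypothetical schema)
        username TEXT, project TEXT, day TEXT, cpu_seconds INTEGER
    );
    CREATE TABLE daily_usage (        -- aggregate: one row per user/project/day
        username TEXT, project TEXT, day TEXT,
        jobs INTEGER, cpu_seconds INTEGER,
        PRIMARY KEY (username, project, day)
    );
""")

jobs = [
    ("alice", "p1", "2009-09-01", 3600),
    ("alice", "p1", "2009-09-01", 1800),
    ("bob",   "p1", "2009-09-01", 600),
    ("alice", "p1", "2009-09-02", 7200),
]
conn.executemany("INSERT INTO job_usage VALUES (?, ?, ?, ?)", jobs)

# Rebuild the aggregate table from the raw job rows.
conn.execute("DELETE FROM daily_usage")
conn.execute("""
    INSERT INTO daily_usage
    SELECT username, project, day, COUNT(*), SUM(cpu_seconds)
    FROM job_usage
    GROUP BY username, project, day
""")


def project_report(connection, project):
    """Parameterised report: job count and CPU seconds per user for one project."""
    return connection.execute(
        "SELECT username, SUM(jobs), SUM(cpu_seconds) FROM daily_usage "
        "WHERE project = ? GROUP BY username ORDER BY username",
        (project,),
    ).fetchall()


report = project_report(conn, "p1")
print(report)
```

On a high-throughput system the job table grows with every job, while the aggregate grows only with the number of distinct user/project/day combinations, so reports over long periods scan far fewer rows.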
Web Services
• Web service interface for access by other services.
• Web service interfaces use OGF-UR XML as common interchange format.
• RUPI – Resource Usage Publishing Interface
  – Interface for uploading usage records to a remote repository.
  – Currently an OGF-RUS-WG proposal
• RUQI – Resource Usage Query Interface
  – Interface for running queries on a remote repository.
  – Aim to submit to OGF-RUS-WG
Grid level accounting
• Grid accounting is not a solved problem
  – We are aiming to contribute useful technology not to dictate a solution.
• Different grids are pursuing different architectures
  – EGEE/NGS hierarchical model
    – Data published up tree of repositories
  – DEISA distributed model.
    – Resource providers run local repositories and control access to data.
    – Accounting operations query multiple repositories.
• Some commonality
  – OGF-UR format generally accepted as common data interchange format.
• Combination of RUPI/RUQI can be used to implement either model.
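To make the last point concrete, the sketch below models a repository with one publish operation and one query operation and wires it up in both ways. It is a simplified illustration under stated assumptions: `Repository`, `publish` and `query` are hypothetical stand-ins for the roles RUPI and RUQI play, not the actual web service interfaces; records are plain dictionaries rather than OGF-UR XML; and access control is omitted.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]


class Repository:
    """Usage-record store with a publish (RUPI-like) and a query (RUQI-like) operation."""

    def __init__(self, name: str, parent: Optional["Repository"] = None):
        self.name = name
        self.parent = parent          # set only in the hierarchical model
        self.records: List[Record] = []

    def publish(self, record: Record) -> None:
        """Store the record locally and forward it up the tree if a parent exists."""
        self.records.append(record)
        if self.parent is not None:
            self.parent.publish(record)

    def query(self, user: str) -> List[Record]:
        """Return the locally held records for one user."""
        return [r for r in self.records if r["user"] == user]


# Hierarchical model: sites publish up a tree, queries go to the top repository.
grid = Repository("grid")
site_a = Repository("site-a", parent=grid)
site_b = Repository("site-b", parent=grid)
site_a.publish({"user": "alice", "cpu_hours": 10})
site_b.publish({"user": "alice", "cpu_hours": 5})
hierarchical_total = sum(r["cpu_hours"] for r in grid.query("alice"))

# Distributed model: sites keep records locally, the client queries each site.
local_a = Repository("site-a")
local_b = Repository("site-b")
local_a.publish({"user": "alice", "cpu_hours": 10})
local_b.publish({"user": "alice", "cpu_hours": 5})
distributed_total = sum(
    r["cpu_hours"] for repo in (local_a, local_b) for r in repo.query("alice")
)

print(hierarchical_total, distributed_total)
```

The same two operations serve both architectures; only the wiring differs, namely whether publishing forwards to a parent or whether the client fans its query out to several repositories.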
• Actively looking for sites to use the software
• Don’t need to use everything
• http://gridsafe.forge.nesc.ac.uk/