Database Access and Integration Services on the Grid *
DAIS Grid 1
Database Access and Integration Services on the Grid*
* http://www.cs.man.ac.uk/grid-db/papers/dbtf.pdf
Authors: N. Paton, M. Atkinson, V. Dialani, D. Pearson, T. Storey, P. Watson
Florida International University
School of Computing and Information Sciences
Summer 2006
Presented by: Ariel Cary
DAIS Grid 2
Agenda
Introduction
Scope and Context of Proposal
Proposed Database Services
DS in OGSA
Current DAIS Standards and Systems
Conclusion
DAIS Grid 3
Introduction
• Grid research generally focuses on applications where data is stored in files
• DBMSs have a central role in data organization for numerous applications, including e-Science: particle physics (LHC@CERN), earth sciences, bioinformatics
• There is a need to interconnect pre-existing and independently operated databases
DAIS Grid 4
Introduction (cont)
This work seeks to encourage the development of standards that can meet those needs.
A (preliminary) proposal is made for the staged development of a collection of Grid Database Services that allow access to existing, autonomous databases within the Grid
It follows a service-based approach to DBMS integration within the OGSA framework
DAIS Grid 5
Introduction (cont)
Service definitions essentially state what functionality is to be supported
How that functionality is supported may be implemented in different ways (with different performance characteristics, etc.)
DAIS Grid 6
Scope and Context of Proposal
DAIS Grid 7
Scope
The proposal has several characteristics
– Independent of any specific Grid toolkit (tying it to one could skew and restrict it)
– It does not propose the development of a new DBMS for the Grid, but wrapping existing systems behind a consistent interface and developing distributed managers
– Independent of any specific data model or access language
DAIS Grid 8
Context
Relevant terms related to Databases
– A Database Service is any service that supports a database interface (described in WSDL)
– Service interfaces are abstract and not prescriptive on how they are supported, or the data model that underpins a DBMS
– Specific DBMS services could provide access to relational or object DBMS, XML repositories, specialist storage systems …
DAIS Grid 9
Context
– Grid Database Service (GDS) provides capabilities for querying, updating and evolving a database
– The interface also describes:
  • Data delivery: transmitting structured data
  • Transactions: coordinating collections of operations
  • Database Metadata: accessing information about the data a DB service provides
DAIS Grid 10
Proposed Database Services
DAIS Grid 11
Database Discovery
A service provider publishes a description (WSDL) of a service to a service registry
The registry is later consulted by a requestor, and a binding is created that allows calls to the service
It is assumed that a registry lookup returns a Grid Service Handle (GSH), a globally unique name for a service instance
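The publish, lookup and bind steps above can be sketched as a toy in-memory registry. This is only an illustration under stated assumptions: the class name ServiceRegistry, its methods, the gsh:// handle format and the dictionary standing in for a WSDL description are all hypothetical and are not part of OGSA or the proposal.

```python
import uuid


class ServiceRegistry:
    """Toy in-memory registry (hypothetical; not an OGSA API)."""

    def __init__(self):
        self._entries = {}

    def publish(self, description):
        # Assumption: a handle is just a unique string naming the instance.
        gsh = "gsh://example.org/" + uuid.uuid4().hex
        self._entries[gsh] = description
        return gsh

    def lookup(self, **required):
        # Return the handles of all services whose description matches.
        return [gsh for gsh, d in self._entries.items()
                if all(d.get(k) == v for k, v in required.items())]

    def bind(self, gsh):
        # A real binding would create a client stub; here we return the description.
        return self._entries[gsh]


registry = ServiceRegistry()
gsh_a = registry.publish({"name": "employees", "queryNotation": "SQL'92"})
gsh_b = registry.publish({"name": "proteins", "queryNotation": "XPath"})
matches = registry.lookup(queryNotation="SQL'92")
```

The requestor only ever holds the handle, so the provider stays free to change how the service is implemented.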
DAIS Grid 12
Database Statements
Statements allow queries or change operations to be sent to a DBMS
This implies that the underlying DBMS supports a query or command language, which differs for every data model
Thus, this is a point of tension with the proposal being independent of the data model
DAIS Grid 13
Database Statements (cont.)
The pairs (queryNotation, query), … are introduced to allow flexibility (like MIME types for e-mail attachments)
For example:
– queryNotation=“SQL’92”
– query=“Select * from EMP Where Salary>1000”
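As a rough sketch of how a (queryNotation, query) pair could be dispatched, the example below routes the query to a handler chosen by its notation, much as a MIME type selects a viewer. The class GridDatabaseService, the method perform and the sample EMP rows are assumptions made for illustration, and an in-memory SQLite database stands in for the wrapped DBMS.

```python
import sqlite3


class GridDatabaseService:
    """Hypothetical wrapper; SQLite stands in for the existing DBMS."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE EMP (Name TEXT, Salary INTEGER)")
        self._db.executemany("INSERT INTO EMP VALUES (?, ?)",
                             [("Ann", 1500), ("Bob", 900)])
        # One handler per supported notation; others could be registered here.
        self._handlers = {"SQL'92": self._run_sql}

    def _run_sql(self, query):
        return self._db.execute(query).fetchall()

    def perform(self, queryNotation, query):
        handler = self._handlers.get(queryNotation)
        if handler is None:
            raise ValueError("unsupported notation: " + queryNotation)
        return handler(query)


gds = GridDatabaseService()
rows = gds.perform("SQL'92", "Select * from EMP Where Salary>1000")
```

Because the notation travels with the query, the interface itself does not need to know about SQL or any other language.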
DAIS Grid 14
Database Statements (cont.)
The optional txHandle indicates if the operation is part of a transaction, provided the DBMS supports transactions
The final results of an operation are managed via:
– resultHandle: generated dynamically
– expires: an expiry time by which the result must be claimed
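One possible reading of resultHandle and expires is a store that keeps results until they are claimed or until the expiry time passes. The sketch below assumes that expires is a number of seconds and that a result can be claimed only once; the class ResultStore and its methods are hypothetical.

```python
import time
import uuid


class ResultStore:
    """Hypothetical holder for results awaiting collection."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so expiry can be tested
        self._results = {}

    def put(self, rows, expires):
        # Assumption: expires is a lifetime in seconds from now.
        resultHandle = uuid.uuid4().hex
        self._results[resultHandle] = (self._clock() + expires, rows)
        return resultHandle

    def claim(self, resultHandle):
        # pop raises KeyError for unknown or already claimed handles.
        deadline, rows = self._results.pop(resultHandle)
        if self._clock() > deadline:
            raise KeyError("result expired")
        return rows


now = [0.0]                          # fake clock for the example
store = ResultStore(clock=lambda: now[0])
handle = store.put([("Ann", 1500)], expires=60)
claimed = store.claim(handle)
late = store.put([("Bob", 900)], expires=60)
now[0] = 61.0                        # the second result is now past its expiry
```

The expiry lets the service reclaim resources for results that no caller ever collects.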
DAIS Grid 15
Database Statements (cont.)
The operations on a GDS will be atomic:
– Preparation and Validation: consistency check
– Application: operation is performed
– Result Delivery: results available to the caller
Operations usually involve transferring large amounts of data, which may take a long time (prone to interruptions!)
The implementation of the DBMS service should handle such failures to achieve atomicity
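The three phases above can be illustrated with a small sketch in which a failure during application leaves the database unchanged. The function perform_atomically and the EMP table are assumptions for the example; SQLite's connection context manager is used here to roll back on error, which is only one way a service might achieve atomicity.

```python
import sqlite3


def perform_atomically(db, rows):
    """Hypothetical GDS operation: validate, apply, then deliver a result."""
    # Preparation and validation: reject bad input before touching the data.
    if not all(isinstance(salary, int) and salary >= 0 for _, salary in rows):
        raise ValueError("invalid salary")
    # Application: the context manager commits on success, rolls back on error.
    with db:
        db.executemany("INSERT INTO EMP VALUES (?, ?)", rows)
    # Result delivery: here simply the new row count.
    return db.execute("SELECT COUNT(*) FROM EMP").fetchone()[0]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE EMP (Name TEXT PRIMARY KEY, Salary INTEGER)")
db.commit()

count_ok = perform_atomically(db, [("Ann", 1500), ("Bob", 900)])
try:
    # The duplicate key "Ann" fails after "Cy" was already inserted.
    perform_atomically(db, [("Cy", 1200), ("Ann", 2000)])
except sqlite3.IntegrityError:
    pass
count_after_failure = db.execute("SELECT COUNT(*) FROM EMP").fetchone()[0]
```

After the failed call the partially inserted row is gone, so the caller sees either the whole operation or none of it.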
DAIS Grid 16
Delivery System
Means by which (potentially large amounts of) structured data is moved from one location to one or more others
Should be considered complementary to protocols such as GridFTP, which could be used as a delivery mechanism
DAIS Grid 17
Delivery System (cont.)
Single data source to be delivered, represented as a URI
Several destinations represented by URI with delivery mechanisms associated
The deliver operation initiates delivery of the data from the single source to multiple destinations
A more elaborate delivery system would include encryption, progress monitoring, etc.
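A minimal sketch of the deliver operation might look as follows, assuming that the delivery mechanism is selected by the URI scheme of each destination. The function deliver, the mem:// scheme and the in-memory sink are hypothetical stand-ins; a real mechanism could be GridFTP or another transfer protocol.

```python
from urllib.parse import urlparse


def make_memory_sink(storage):
    """Hypothetical delivery mechanism that writes into a dictionary."""
    def send(uri, data):
        storage[uri] = data
    return send


def deliver(source_uri, destination_uris, sources, mechanisms):
    """Send the data named by source_uri to every destination URI."""
    data = sources[source_uri]
    delivered = []
    for uri in destination_uris:
        scheme = urlparse(uri).scheme      # choose the mechanism by scheme
        mechanisms[scheme](uri, data)
        delivered.append(uri)
    return delivered


received = {}
sources = {"mem://gds/result/42": [("Ann", 1500)]}
mechanisms = {"mem": make_memory_sink(received)}
done = deliver("mem://gds/result/42",
               ["mem://site-a/inbox", "mem://site-b/inbox"],
               sources, mechanisms)
```

Keeping mechanisms in a lookup table means new transports can be added without changing the deliver operation itself.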
DAIS Grid 18
Distributed Transactions
A minimal transaction interface performs the role of conferring a guaranteed unique identity on the transaction
Given a transaction handle, other operations over a database service can be put explicitly within the context of a transaction, using the txHandle parameter
DAIS Grid 19
Distributed Transactions (cont.)
For a transaction to span multiple DBMS services, they must provide operations for use by the transaction manager that is overseeing the distributed transaction
startTransaction includes an expires parameter to limit the consumption of resources
The prepareCommit operation can be used by a two-phase commit protocol to ensure that all participating database services commit
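The role of txHandle, startTransaction and prepareCommit in a two-phase commit can be sketched as below. Only those three names come from the slides; the Participant class, the commit and rollback operations, and the coordinator function are assumptions added to make the protocol visible.

```python
import uuid


class Participant:
    """Hypothetical database service taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = {}

    def startTransaction(self, txHandle, expires):
        self.state[txHandle] = "active"

    def prepareCommit(self, txHandle):
        # Phase 1: vote on whether this service is able to commit.
        self.state[txHandle] = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self, txHandle):
        self.state[txHandle] = "committed"

    def rollback(self, txHandle):
        self.state[txHandle] = "aborted"


def run_distributed_transaction(participants, expires=30):
    txHandle = uuid.uuid4().hex            # unique identity of the transaction
    for p in participants:
        p.startTransaction(txHandle, expires)
    votes = [p.prepareCommit(txHandle) for p in participants]
    # Phase 2: commit only if every participant voted yes.
    if all(votes):
        for p in participants:
            p.commit(txHandle)
        return txHandle, "committed"
    for p in participants:
        p.rollback(txHandle)
    return txHandle, "aborted"


good = [Participant("db1"), Participant("db2")]
tx_ok, outcome_ok = run_distributed_transaction(good)
mixed = [Participant("db1"), Participant("db3", can_commit=False)]
tx_bad, outcome_bad = run_distributed_transaction(mixed)
```

A single negative vote aborts the transaction on every participant, which is what keeps the databases consistent with each other.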
DAIS Grid 20
Database Metadata
Metadata that could be useful to have access to includes:
– Content description: DB schema – data model, logical & physical structures, stats (could be obtained from the data dictionary)
– Capability description: language (query /update operations supported), transactional capabilities, protocols supported
The metadata should be described in a standard representation, e.g. an XML document supplied by the data service provider
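To make the idea of an XML metadata document concrete, the sketch below builds one with a content section and a capability section. The element names (databaseMetadata, content, capability and so on) are invented for illustration and do not follow any DAIS schema.

```python
import xml.etree.ElementTree as ET


def describe_service(tables, notations, supports_transactions):
    """Build a hypothetical metadata document for a database service."""
    root = ET.Element("databaseMetadata")
    # Content description: tables and their columns.
    content = ET.SubElement(root, "content")
    for table, columns in tables.items():
        t = ET.SubElement(content, "table", name=table)
        for column in columns:
            ET.SubElement(t, "column", name=column)
    # Capability description: languages and transactional support.
    capability = ET.SubElement(root, "capability")
    for notation in notations:
        ET.SubElement(capability, "queryNotation").text = notation
    ET.SubElement(capability, "transactions").text = str(supports_transactions).lower()
    return ET.tostring(root, encoding="unicode")


document = describe_service({"EMP": ["Name", "Salary"]}, ["SQL'92"], True)
parsed = ET.fromstring(document)
```

A requestor could read such a document before sending any statement, for example to check which query notations the service accepts.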
DAIS Grid 21
Distributed Query Service (DQS)
[Figure: a query is sent to the DQS, where it is parsed and optimized; sub-queries are sent to the relevant DBs (e.g. DS1); results are collected and joined by the DQS]
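The flow of a distributed query can be sketched with two independent in-memory SQLite databases standing in for separate database services. The table layouts, the second source ds2 and the function distributed_join are assumptions; the sketch skips parsing and optimization and hard-codes the sub-queries.

```python
import sqlite3


def make_db(ddl, insert, rows):
    """Create an independent in-memory database (one per data service)."""
    db = sqlite3.connect(":memory:")
    db.execute(ddl)
    db.executemany(insert, rows)
    return db


ds1 = make_db("CREATE TABLE EMP (Name TEXT, Dept INTEGER)",
              "INSERT INTO EMP VALUES (?, ?)",
              [("Ann", 1), ("Bob", 2)])
ds2 = make_db("CREATE TABLE DEPT (Id INTEGER, Title TEXT)",
              "INSERT INTO DEPT VALUES (?, ?)",
              [(1, "Physics"), (2, "Biology")])


def distributed_join():
    # Sub-queries are sent to the relevant databases.
    emp = ds1.execute("SELECT Name, Dept FROM EMP").fetchall()
    dept = dict(ds2.execute("SELECT Id, Title FROM DEPT").fetchall())
    # Results are collected and joined by the query service itself.
    return sorted((name, dept[d]) for name, d in emp if d in dept)


joined = distributed_join()
```

Neither database knows about the other; the join happens entirely in the coordinating service.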
DAIS Grid 22
Database Services in OGSA
DAIS Grid 23
DS in OGSA
The Open Grid Services Architecture (OGSA) represents an evolution towards a Grid system architecture based on Web services concepts and technologies*
* http://www.globus.org/ogsa
The described interfaces can be used as the basis of database services through participation in the OGSA
Thus many features of this architectural framework (service creation, authorization, notification, etc.) can be reused
DAIS Grid 24
Requirements from OGSA
The secure connection and authentication mechanism underpins all GDS security and authentication
The lifetime management model carries over unchanged as the lifetime management model for GDS
The notification mechanism specified in OGSA appears to satisfy the GDS needs
DAIS Grid 25
Requirements from OGSA (cont.)
Information about user authorization is required (potentially passed through many intermediate grid services)
– User identification services, referenced from a certificate
Certification of the services themselves may be necessary: a discovery service could be tricked by a service that mimics the intended GDS, which would then receive the data
Some databases charge for their use. It is necessary to support a digital payment process
DAIS Grid 26
Current DAIS Standards and Systems
DAIS Grid 27
DAIS Standards
Global Grid Forum
– “The Global Grid Forum (GGF) is the community of users, developers, and vendors leading the global standardization effort for grid computing.” http://www.ggf.org/
Part of the GGF: DAIS-WG
– “The group seeks to promote standards for the development of grid database services, focusing principally on providing consistent access to existing, autonomously managed databases.” https://forge.gridforum.org/projects/dais-wg
DAIS Grid 28
OGSA-DAI System
OGSA-DAI Overview http://www.ggf.org/GGF17/materials/303/Overview.ppt
Architecture + Extensibility http://www.ggf.org/GGF17/materials/303/GGF17ArchitectureExtensibility.ppt
Supported Data Resources http://www.ggf.org/GGF17/materials/303/GGF17ArchitectureExtensibility.ppt
“The aim of the OGSA-DAI project is to develop middleware to assist with access and integration of data from separate sources via the grid…and is working closely with the Global Grid Forum DAIS-WG...” http://www.ogsadai.org/
DAIS Grid 29
Conclusion
DAIS Grid 30
Conclusion
This document has made a preliminary, service-oriented proposal for integrating database functionality into a Grid setting
It is hoped that the document will provoke discussion on how best databases can be integrated with Grid middleware
There is an established community dedicated to defining DBMS service standards, and emerging systems are adopting them