INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
OGSA-DAI: Data Access and Integration
Marek Ciglan
Institute of Informatics, Slovak Academy of Sciences
Grid Application Development, Bratislava, 10.03.05
Motivation
• Different users / applications store data in different formats
– Plain files
– XML databases
– Relational databases: PostgreSQL, Oracle, DB2, MySQL
• Difficult to work with a lot of different data formats
• Difficult to integrate data from heterogeneous resources
OGSA-DAI Overview
• Allow different types of data models
– Files
– XML databases
– Relational Databases
• Allow data to be accessed through uniform interfaces
• Provide an extensible framework for integrating data resources on the Grid
• Allow metadata about data and the data resources in which they are stored to be obtained
• Facilitate the integration of data from various sources to obtain the required information
Architecture
Data Resource Activities
• Relational Activities
– Run an SQL query statement
– Run an SQL update statement
– …
• XML Activities
– Run an XPath statement against an XML database
– Run an XUpdate statement against an XML database
– …
• File Activities
– Access a directory
– Read data from a file
– Manipulate files in a directory
– Write data into a file
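The three activity families above can be pictured with stand-ins from the Python standard library. This is only a sketch of the idea, not OGSA-DAI code: `sqlite3` stands in for a relational resource, `xml.etree.ElementTree` (which supports only a limited XPath subset) for an XML database, and a temporary directory for a file resource. All function names are hypothetical; the table name `littleblackbook` is borrowed from the perform document example in this deck.

```python
import sqlite3
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def run_sql_query(conn, sql, params=()):
    """Relational stand-in: run an SQL query and return all rows."""
    return conn.execute(sql, params).fetchall()


def run_sql_update(conn, sql, params=()):
    """Relational stand-in: run an SQL update and return the affected row count."""
    cursor = conn.execute(sql, params)
    conn.commit()
    return cursor.rowcount


def run_xpath(xml_text, path):
    """XML stand-in: evaluate a (limited) XPath and return the matched texts."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(path)]


def write_file(directory, name, data):
    """File stand-in: write text into a file inside the directory."""
    target = Path(directory) / name
    target.write_text(data, encoding="utf-8")
    return target


def read_file(directory, name):
    """File stand-in: read text from a file inside the directory."""
    return (Path(directory) / name).read_text(encoding="utf-8")


def demo():
    """Exercise one activity of each family and return the results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE littleblackbook (id INTEGER, name TEXT)")
    inserted = run_sql_update(
        conn, "INSERT INTO littleblackbook VALUES (?, ?)", (10, "Ada")
    )
    rows = run_sql_query(
        conn, "SELECT name FROM littleblackbook WHERE id = ?", (10,)
    )
    conn.close()

    names = run_xpath(
        "<book><entry><name>Ada</name></entry></book>", "./entry/name"
    )

    with tempfile.TemporaryDirectory() as tmp:
        write_file(tmp, "out.txt", "hello")
        text = read_file(tmp, "out.txt")
    return inserted, rows, names, text
```

Calling `demo()` runs one relational update and query, one XPath lookup, and one file write and read.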
Delivery Activities
• Retrieve data from a URL
• Deliver data to a URL
• Deliver data to a GridFTP server
• Retrieve data from a GridFTP server
• Deliver results to a stream
• …
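As a rough illustration of two of the delivery activities above, the sketch below retrieves data from a URL and delivers it to a stream using only the Python standard library. It is not OGSA-DAI code, and the function names are hypothetical. A local `file://` URL is used so the example runs without a network. GridFTP is left out because the standard library has no client for it.

```python
import io
import tempfile
import urllib.request
from pathlib import Path


def retrieve_from_url(url):
    """Fetch the bytes behind a URL (for example http://, https:// or file://)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def deliver_to_stream(data, stream, chunk_size=4096):
    """Write data to a binary stream in chunks; return the bytes written."""
    written = 0
    for start in range(0, len(data), chunk_size):
        written += stream.write(data[start:start + chunk_size])
    return written


def demo():
    """Retrieve a small local file by URL and deliver it to an in-memory stream."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "results.txt"
        source.write_bytes(b"id,name\n10,Ada\n")
        data = retrieve_from_url(source.as_uri())
    sink = io.BytesIO()
    count = deliver_to_stream(data, sink)
    return count, sink.getvalue()
```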
Transformation Activities
• ZIP compress the results
• GNU-ZIP compress the results
• GNU-ZIP decompress results
• Transform data using an XSLT
• Break a single block into multiple blocks based on a set of separator characters
• Aggregate multiple blocks into a single block
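A minimal sketch of what such transformations do, using Python's standard `gzip` and `re` modules. The function names are hypothetical and do not correspond to OGSA-DAI activity names. Dropping empty blocks after a split is an assumption of this sketch.

```python
import gzip
import re


def gzip_compress(block):
    """GNU-ZIP compress a block of bytes."""
    return gzip.compress(block)


def gzip_decompress(block):
    """GNU-ZIP decompress a block of bytes."""
    return gzip.decompress(block)


def split_block(block, separators):
    """Break one text block into several at any of the separator characters."""
    if not separators:
        return [block]
    pattern = "[" + re.escape(separators) + "]"
    # Assumption: empty pieces between adjacent separators are dropped.
    return [part for part in re.split(pattern, block) if part]


def aggregate_blocks(blocks):
    """Aggregate multiple text blocks into a single block."""
    return "".join(blocks)
```

Compression followed by decompression returns the original bytes. Splitting followed by aggregation returns the original text without its separator characters.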
Data integration
[Diagram: data sources MySQL, XML database, PostgreSQL and a text file; target Oracle data warehouse]
How can all this heterogeneous data be integrated into a central data warehouse?
OGSA-DAI
Activity pipelines shown in the diagram:
– Select data → write data into file → compress file → transfer zip file
– Read subset of file → XSLT transform → compress file → transfer zip file
– Select data → write data into file → compress file → transfer zip file
– Read subset of file → transform → compress file → transfer zip file
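The "select, write, compress, transfer" chain can be sketched in plain Python to show how the steps feed each other. Everything here is a stand-in and an assumption of the sketch: `sqlite3` replaces the source database, CSV is an arbitrary file format, and the "transfer" is a local copy into a directory that plays the role of the warehouse inbox. OGSA-DAI itself expresses such chains as activities rather than as hand-written code like this.

```python
import csv
import shutil
import sqlite3
import tempfile
import zipfile
from pathlib import Path


def select_data(conn, sql):
    """Step 1: select data; return the column names and the rows."""
    cursor = conn.execute(sql)
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def write_data_into_file(header, rows, path):
    """Step 2: write the selected data into a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return path


def compress_file(path):
    """Step 3: ZIP compress the file next to the original."""
    archive = path.with_suffix(".zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(path, arcname=path.name)
    return archive


def transfer_file(archive, warehouse_dir):
    """Step 4: stand-in for a network transfer, a copy into the warehouse inbox."""
    warehouse_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(archive, warehouse_dir))


def run_pipeline(conn, sql, work_dir, warehouse_dir):
    """Chain the four steps; each step consumes the previous step's output."""
    header, rows = select_data(conn, sql)
    data_file = write_data_into_file(header, rows, Path(work_dir) / "export.csv")
    archive = compress_file(data_file)
    return transfer_file(archive, Path(warehouse_dir))
```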
Data integration
• How to perform data integration?
– Write a specialized Java application for data integration
– Use OGSA-DAI perform documents
• Perform documents
– XML documents
– Describe activities to be performed
<sqlQueryStatement name="myQuery">
<expression>
select * from littleblackbook where id=10
</expression>
<webRowSetStream name="myQueryOutput"/>
</sqlQueryStatement>
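Because a perform document is plain XML, a client can also assemble it programmatically. The sketch below rebuilds the fragment above with Python's `xml.etree.ElementTree`. The element and attribute names are copied from the slide; the helper name is hypothetical. A complete perform document would also need an enclosing root element and namespaces, which are not shown on the slide and are omitted here.

```python
import xml.etree.ElementTree as ET


def build_sql_query_activity(name, sql, output_name):
    """Build an sqlQueryStatement element shaped like the fragment on the slide."""
    activity = ET.Element("sqlQueryStatement", {"name": name})
    expression = ET.SubElement(activity, "expression")
    expression.text = sql
    ET.SubElement(activity, "webRowSetStream", {"name": output_name})
    return activity


activity = build_sql_query_activity(
    "myQuery", "select * from littleblackbook where id=10", "myQueryOutput"
)
fragment = ET.tostring(activity, encoding="unicode")
```

Building the XML through a library also takes care of escaping: a `<` inside the SQL text is serialized as `&lt;`, so the document stays well-formed.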
Perform documents
• Integrating activities with perform documents
<sqlQueryStatement name="myQuery">
<expression>
select * from littleblackbook where id &lt; 100
</expression>
<webRowSetStream name="myQueryOutput"/>
</sqlQueryStatement>
<deliverToGDT name="deliverQueryResults">
<fromLocal from="myQueryOutput"/>
<toGDT streamId="otherServiceInput" mode="full">
http://localhost:8080/ogsa/services/ogsadai/SomeDAIService
</toGDT>
</deliverToGDT>
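The two activities are connected by name: the query writes to the stream `myQueryOutput`, and the delivery activity reads from it through `from="myQueryOutput"`. The sketch below makes that wiring explicit by parsing the fragment and listing which activity feeds which. The `<wrapper>` root element is hypothetical and is added only so the fragment parses as one document. Treating any child element with a `name` attribute as an output is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

PERFORM_FRAGMENT = """
<wrapper>
  <sqlQueryStatement name="myQuery">
    <expression>select * from littleblackbook where id &lt; 100</expression>
    <webRowSetStream name="myQueryOutput"/>
  </sqlQueryStatement>
  <deliverToGDT name="deliverQueryResults">
    <fromLocal from="myQueryOutput"/>
    <toGDT streamId="otherServiceInput" mode="full">
      http://localhost:8080/ogsa/services/ogsadai/SomeDAIService
    </toGDT>
  </deliverToGDT>
</wrapper>
"""


def activity_links(xml_text):
    """Return (producer, stream, consumer) for each `from` that names an output."""
    root = ET.fromstring(xml_text)
    producers = {}
    for activity in root:
        for child in activity:
            # Assumption: an output is a child element carrying a `name` attribute.
            if child.get("name") is not None:
                producers[child.get("name")] = activity.get("name")
    links = []
    for activity in root:
        for child in activity:
            source = child.get("from")
            if source in producers:
                links.append((producers[source], source, activity.get("name")))
    return links
```

For the fragment above, `activity_links(PERFORM_FRAGMENT)` reports a single link from `myQuery` to `deliverQueryResults` over the stream `myQueryOutput`.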
Data Security
• Role mapping is the process of authorizing a client's request to access a data resource
• Two-step process:
– Check whether the client is allowed to access the data resource
– Determine the database user name and password (or role) to be used for this client
• A role map document contains the information required to undertake this process
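The two-step process can be sketched as a single lookup function. This is an illustration, not OGSA-DAI's implementation: the in-memory dictionary, the class names and the example entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseCredentials:
    """Database user name and password assigned to a client."""
    userid: str
    password: str


class AccessDenied(Exception):
    """Raised when a client is not allowed to access a data resource."""


def map_role(role_map, resource, client_dn):
    """Authorize a client's request and return the credentials to use."""
    # Step 1: check whether the client is allowed to access the data resource.
    users = role_map.get(resource, {})
    if client_dn not in users:
        raise AccessDenied(f"{client_dn!r} may not access {resource!r}")
    # Step 2: determine the database user name and password for this client.
    return users[client_dn]


# Hypothetical role map: resource name -> client DN -> credentials.
EXAMPLE_ROLE_MAP = {
    "jdbc:mysql://host:6502/otherData": {
        "/C=UK/O=eScience/CN=tom": DatabaseCredentials("superUser", "secret"),
    },
}
```

An unknown client, or a known client asking for an unlisted resource, fails at step 1 and never reaches the credential lookup.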
Data Security
• Simple OGSA-DAI role map document
<DatabaseRoles>
<Database name="jdbc:mysql://host:6502/otherData">
<User dn="No Certificate Provided"
userid="myUser" password="123"/>
<User dn="/C=UK/O=eScience/OU=Aspatria/L=AeSC/CN=tom"
userid="superUser" password="myPassword"/>
</Database>
</DatabaseRoles>
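To show how such a document could be consumed, the sketch below parses the role map with Python's `xml.etree.ElementTree` and looks up the database login for a given certificate DN. The element and attribute names come from the slide; the function names and the returned structure are assumptions of this sketch, not OGSA-DAI's own loader.

```python
import xml.etree.ElementTree as ET

ROLE_MAP_XML = """
<DatabaseRoles>
  <Database name="jdbc:mysql://host:6502/otherData">
    <User dn="No Certificate Provided"
          userid="myUser" password="123"/>
    <User dn="/C=UK/O=eScience/OU=Aspatria/L=AeSC/CN=tom"
          userid="superUser" password="myPassword"/>
  </Database>
</DatabaseRoles>
"""


def load_role_map(xml_text):
    """Return {database name: {client DN: (userid, password)}}."""
    root = ET.fromstring(xml_text)
    role_map = {}
    for database in root.findall("Database"):
        users = role_map.setdefault(database.get("name"), {})
        for user in database.findall("User"):
            users[user.get("dn")] = (user.get("userid"), user.get("password"))
    return role_map


def lookup(role_map, database, dn):
    """Return (userid, password) for the DN, or None if there is no mapping."""
    return role_map.get(database, {}).get(dn)
```

With this document, a client presenting no certificate is mapped to `myUser`, while the DN ending in `CN=tom` is mapped to `superUser`.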
The End
Thank you for your attention.