Implementing Unified Access to Scientific Data from .NET Platform Sergey B. Berezin Dmitriy V. Voitsekhovskiy Vilen M. Paskonov Moscow State University Department of Computational Mathematics and Cybernetics Supported by Student Laboratory of Microsoft Technologies and RFBR grants

Page 1

Implementing Unified Access to Scientific Data from .NET Platform

Sergey B. Berezin

Dmitriy V. Voitsekhovskiy

Vilen M. Paskonov

Moscow State University

Department of Computational Mathematics and Cybernetics

Supported by Student Laboratory of Microsoft Technologies and RFBR grants

Page 2

Different languages, common tools

Viscous fluid flow visualization via vector fields and color maps (http://www.cs.msu.su)

Seismic data visualization via isosurfaces (http://www.sci.utah.edu)

Tensor field visualization for diffusion through biological tissue (http://www.sci.utah.edu)

Page 3

Scientific data access requirements

Viscous fluid flow visualization via vector fields and color maps (http://www.cs.msu.su)

Seismic data visualization via isosurfaces (http://www.sci.utah.edu)

Tensor field visualization for diffusion through biological tissue (http://www.sci.utah.edu)

We need to:

Retrieve a typed data object regardless of where and how it is stored

• Physical data independence

Retrieve partial data when needed

• Filtering & Caching

Retrieve data description

• Metadata support

We don’t want to:

Rewrite existing computational software

• Use existing formats

Install new system software

• Use existing protocols

Page 4

Scientific data access today

Page 5

What’s so special about scientific data?

Scientific data …

• Have a complex structure;

• Are parameterized by time, sampling point coordinates, and more complex parameters;

• Are stored in many files of various formats;

• Have very large individual data items.

They don’t fit the relational model well!

Page 6

What is DataSet?

[Diagram: a DataSet drawing on a part of a file, a file, several files, and databases]

Metadata

1. Human-readable descriptions and properties.

2. Machine-oriented information about data.

Page 7

Example: Accessing data in C#

// Retrieve the DataSet object from a server by GUID
DataSet dataset = DataSet.Open("http://regatta.cs.msu.su:9111",
    "767c57b1-801e-4784-bbd6-287707fd0ec2");

// Fetching DataItem by name.
// DataItem may be either simple or composite, it doesn’t matter
DataItem xVelocity = dataset.DataItems["u-values"];

// Creating parameter corresponding to time moment = 0.0
CompositeParameter param = new CompositeParameter(
    new ParameterValue("time", 0.0d));

// Fetching DataItemSlice for the parameter.
// It is an instance of DataItem for specified parameter value.
DataItemSlice dataVelocity = xVelocity[param];

// Getting required data: velocity array for time = 0.0
ScalarArray3d data = dataVelocity.GetData() as ScalarArray3d;

Page 8

DataRequest: communicating with server

The following DataRequest is sent to a server as the result of the previous example:

<soap:Envelope … >
  <soap:Body>
    <dataRequest dataSource="…" dataSet="767c57b1-801e-4784-bbd6-287707fd0ec2" … >
      <dataItem type="ScalarArray3d">
        <dataSource sourceName="u0000.cdf" sourceType="netCDF" sourceParameters="u" />
      </dataItem>
    </dataRequest>
  </soap:Body>
</soap:Envelope>

The following DataRequest is received from the server:

<soap:Envelope … >
  <soap:Body>
    <dataRequest dataSource="…" dataSet="767c57b1-801e-4784-bbd6-287707fd0ec2" … >
      <dataItem type="ScalarArray3d">
        <dataRef sourceType="u0000.cdf" sourceParameters="u">
          <remote url="scp://regatta.cs.msu.su/datasource/pvm/Re1000/u0000.cdf" />
        </dataRef>
      </dataItem>
    </dataRequest>
  </soap:Body>
</soap:Envelope>

Page 9

Complex structures in DataSet

[Diagram: three files with scalar arrays feed three scalar array data items, which a vector array constructor combines into a vector array data item (X, Y, Z); a file with a spatial grid feeds a spatial grid data item; a data field constructor combines the vector array data item and the spatial grid data item into a vector field data item.]

Page 10

Example: Accessing composite data in C#

// Retrieve the DataSet object from a server by GUID
DataSet dataset = DataSet.Open("http://regatta.cs.msu.su:9111",
    "767c57b1-801e-4784-bbd6-287707fd0ec2");

// Fetching DataItem by name.
// DataItem may be either simple or composite, it doesn’t matter
DataItem velocity = dataset.DataItems["uvw-values"];

// Creating parameter corresponding to time moment = 0.0
CompositeParameter param = new CompositeParameter(
    new ParameterValue("time", 0.0d));

// Fetching DataItemSlice for the parameter.
// It is an instance of DataItem for specified parameter value.
DataItemSlice dataVelocity = velocity[param];

// Getting required data: velocity array for time = 0.0
Vector3dArray3d data = dataVelocity.GetData() as Vector3dArray3d;

Page 11

DataRequest: composite data items

The following DataRequest is sent to a server as a result of executing the previous example:

<soap:Envelope … >
  <soap:Body>
    <dataRequest dataSource="…" dataSet="767c57b1-801e-4784-bbd6-287707fd0ec2" … >
      <dataItem type="Vector3dArray3d">
        <!-- This is a COMPOSITE DATAITEM! -->
        <composite constructor="CompositeVectorArray">
          <component> <!-- u-values -->
            <dataItem type="ScalarArray3d">
              <dataSource sourceName="u0000.cdf" sourceType="netCDF" sourceParameters="u" />
            </dataItem>
          </component>
          <component> … </component> <!-- v -->
          <component> … </component> <!-- w -->
        </composite>
      </dataItem>
    </dataRequest>
  </soap:Body>
</soap:Envelope>

The following DataRequest is received from the server:

<soap:Envelope … >
  <soap:Body>
    <dataRequest dataSource="…" dataSet="767c57b1-801e-4784-bbd6-287707fd0ec2" … >
      <dataItem type="Vector3dArray3d">
        <dataRef sourceType="plain binary" sourceParameters="">
          <remote url="scp://regatta.cs.msu.su/datasource/~Fdg4gBd" />
        </dataRef>
      </dataItem>
    </dataRequest>
  </soap:Body>
</soap:Envelope>
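The composite request above asks a constructor named CompositeVectorArray to assemble one vector array from three scalar components (u, v, w). The slides do not show how that constructor works, so the following is only a minimal sketch of the idea. It assumes plain double[,,] arrays stand in for ScalarArray3d, and the Vector3 struct and CompositeVectorArraySketch class are hypothetical names, not part of the system's API.

```csharp
using System;

// Hypothetical stand-in for one 3-component vector sample.
public readonly struct Vector3
{
    public readonly double U, V, W;
    public Vector3(double u, double v, double w) { U = u; V = v; W = w; }
}

// Hypothetical composite constructor (an assumption, not the system's code).
public static class CompositeVectorArraySketch
{
    // Combines three equally sized scalar arrays (u, v, w) into one vector array.
    public static Vector3[,,] Combine(double[,,] u, double[,,] v, double[,,] w)
    {
        int nx = u.GetLength(0), ny = u.GetLength(1), nz = u.GetLength(2);
        if (!SameShape(v, nx, ny, nz) || !SameShape(w, nx, ny, nz))
            throw new ArgumentException("All components must have the same dimensions.");

        var result = new Vector3[nx, ny, nz];
        for (int i = 0; i < nx; i++)
            for (int j = 0; j < ny; j++)
                for (int k = 0; k < nz; k++)
                    result[i, j, k] = new Vector3(u[i, j, k], v[i, j, k], w[i, j, k]);
        return result;
    }

    // Checks that an array has exactly the expected dimensions.
    private static bool SameShape(double[,,] a, int nx, int ny, int nz) =>
        a.GetLength(0) == nx && a.GetLength(1) == ny && a.GetLength(2) == nz;
}
```

In this sketch the client never sees the three source files; it receives a single typed object, which matches the single dataRef returned in the response above.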

Page 12

Filtering

Filtering allows only the required data to be transferred from server to client.

Filtering may be performed on both the client side and the server side of the system.

Examples of filtering are cropping and thinning of large vector fields.

[Figure: a field of 3000 x 3000 2d vectors on [0,1] x [0,1] (108 MB) is reduced by a cropping filter [0.4,0.76] x [0.4,0.76] to 1000 x 1000 2d vectors (12 MB), and then by a thinning filter (0.1,0.1) to 100 x 100 2d vectors (120 KB).]
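The two filters named on this slide can be illustrated on a plain 2D array. This is a rough sketch under stated assumptions, not the system's filter implementation: the FilterSketch class and its methods are hypothetical, the array is assumed to sample the unit square uniformly, and thinning is assumed to mean keeping every n-th sample along each axis.

```csharp
using System;

// Hypothetical cropping and thinning filters (assumptions for illustration only).
public static class FilterSketch
{
    // Keeps the samples whose indices correspond to [x0, x1] x [y0, y1],
    // assuming the array covers the unit square [0, 1] x [0, 1] uniformly.
    public static T[,] Crop<T>(T[,] field, double x0, double x1, double y0, double y1)
    {
        int nx = field.GetLength(0), ny = field.GetLength(1);
        int i0 = ToIndex(x0, nx), i1 = ToIndex(x1, nx);
        int j0 = ToIndex(y0, ny), j1 = ToIndex(y1, ny);
        if (i1 < i0 || j1 < j0)
            throw new ArgumentException("Crop bounds must be ordered.");

        var result = new T[i1 - i0, j1 - j0];
        for (int i = i0; i < i1; i++)
            for (int j = j0; j < j1; j++)
                result[i - i0, j - j0] = field[i, j];
        return result;
    }

    // Keeps roughly the given fraction of samples along each axis
    // by taking every n-th sample, where n is about 1 / fraction.
    public static T[,] Thin<T>(T[,] field, double fractionX, double fractionY)
    {
        int stepX = ToStep(fractionX), stepY = ToStep(fractionY);
        int nx = (field.GetLength(0) + stepX - 1) / stepX;
        int ny = (field.GetLength(1) + stepY - 1) / stepY;

        var result = new T[nx, ny];
        for (int i = 0; i < nx; i++)
            for (int j = 0; j < ny; j++)
                result[i, j] = field[i * stepX, j * stepY];
        return result;
    }

    // Maps a normalized coordinate to an array index, clamped to [0, length].
    private static int ToIndex(double fraction, int length) =>
        Math.Min(length, Math.Max(0, (int)Math.Round(fraction * length)));

    // Converts a fraction in (0, 1] to a sampling stride of at least 1.
    private static int ToStep(double fraction)
    {
        if (!(fraction > 0.0 && fraction <= 1.0))
            throw new ArgumentOutOfRangeException(nameof(fraction));
        return Math.Max(1, (int)Math.Round(1.0 / fraction));
    }
}
```

For example, cropping an 8 x 8 array to [0.25, 0.75] x [0.25, 0.75] yields a 4 x 4 array, and thinning that result with a fraction of 0.5 per axis yields a 2 x 2 array. Applying the filters on the server side means only the reduced array has to cross the network.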

Page 13

Example: Filtering data in C#

// Initializing the DataSet object from a server by its GUID
DataSet dataset = DataSet.Open("http://regatta.cs.msu.su:9111",
    "767c57b1-801e-4784-bbd6-287707fd0ec2");

// Fetching DataItem by its name. It may be either simple or composite
DataItem velocity = dataset.DataItems["uvw-values"];

// Creating parameter corresponding to time moment = 0.0
CompositeParameter param = new CompositeParameter(
    new ParameterValue("time", 0.0d));

// Fetching DataItemSlice for the parameter.
DataItemSlice dataVelocity = velocity[param];

// Creating filter "Thinner" for a type of the velocity data item
// and setting up its parameters
IThinner3dFilter filter = FilterFactory.GetFilter("Thinner",
    dataVelocity.TypeDescriptor) as IThinner3dFilter;
filter.PercentageX = 0.05;
filter.PercentageY = 0.05;
filter.PercentageZ = 0.05;

// Getting required data: thinned out velocity array for time = 0.0
Vector3dArray3d data = dataVelocity.GetData(filter) as Vector3dArray3d;

Page 14

DataRequest: communicating with server

The following DataRequest is sent to a server as a result of executing the previous example:

<soap:Envelope … >
  <soap:Body>
    <dataRequest dataSource="…" dataSet="767c57b1-801e-4784-bbd6-287707fd0ec2" … >
      <filter name="Thinner">
        <parameters>
          <parameter name="PercentageX" value="0.05" type="double" />
          <parameter name="PercentageY" value="0.05" type="double" />
          <parameter name="PercentageZ" value="0.05" type="double" />
        </parameters>
        <dataItem type="Vector3dArray3d">
          <!-- This is a COMPOSITE DATAITEM! -->
          <composite constructor="CompositeVectorArray">
            <component> <!-- u-values -->
              <dataItem type="ScalarArray3d">
                <dataSource sourceName="u0000.cdf" sourceType="netCDF" sourceParameters="u" />
              </dataItem>
            </component>
            <component> … </component> <!-- v -->
            <component> … </component> <!-- w -->
          </composite>
        </dataItem>
      </filter>
    </dataRequest>
  </soap:Body>
</soap:Envelope>

[Diagram: a DataRequest contains a Filter (filtering; may be multiple), which wraps a DataItem with dataRefs.]

In this case the returned DataRequest is similar to the one returned in the previous example.

Page 15

Caching

Both the server side and the client side of the system cache the results of a successful DataRequest execution.

The server side caches filtering results.

The client side caches retrieved data items and the results of DataRequests.
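The caching behaviour described on this slide can be sketched as a lookup keyed by the request. This is an illustration only: the slides do not describe the cache's key, storage, or eviction policy, so the RequestCacheSketch class, its in-memory dictionary, and the use of a request string (for example, the serialized DataRequest XML) as the key are all assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory cache: maps a request key to the result of
// executing that request. Not the system's Cache Service.
public sealed class RequestCacheSketch<TResult>
{
    private readonly Dictionary<string, TResult> _entries =
        new Dictionary<string, TResult>();

    // Number of lookups that were not served from the cache.
    public int Misses { get; private set; }

    // Returns the cached result for the key, or runs `execute` and stores
    // its result. If `execute` throws, nothing is stored, so only
    // successful executions end up in the cache.
    public TResult GetOrExecute(string requestKey, Func<string, TResult> execute)
    {
        if (_entries.TryGetValue(requestKey, out TResult cached))
            return cached;

        Misses++;
        TResult result = execute(requestKey);
        _entries[requestKey] = result;
        return result;
    }
}
```

A second call with the same key returns the stored result without invoking the delegate again, which is the effect both sides of the system rely on to avoid repeating a transfer or a filtering pass.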

Page 16

How is a DataRequest performed?

[Diagram: components labeled Data Explorer, Data Provider, Data Services, Cache Service, Remote Data Loader, Parser, Local Filtering, File Transfer Server, and a server running the DataSource System, connected over the Internet. Labeled steps: a DataItemSlice issues a DataRequest; sending the DataRequest to DRS; DataRequest: only dataRefs; data stream; DataRequest: only local dataRefs; DR + typed data object; working with cache. The result is a typed data object.]

Page 17

Deployment Scenario

The simplest scenario is as follows:

[Diagram: a server (DataSet storage on IBM Regatta, DataRequest Web Service, File Transfer Services) and a client workstation on the .NET platform (Unified Data Access System, Visualization System); the client sends a DataRequest and receives a DataItem.]

A more sophisticated scenario includes the development of distributed data sources that provide scientific data.

Dedicated servers will act as data processors, performing data filtering and transformations.

Dedicated servers will act as data registries, allowing DataSet enumeration and querying across the entire global network.

This will make it possible to create dynamic data libraries for research and will enable easy data publishing.

Page 18

Why .NET?

Object-oriented data access requires an object-oriented platform to be built on.

High extensibility is based on the dynamic nature of the CLR:

• New data types

• New filters

• New parsers

• No built-in data types, filters, or parsers

.NET opens new horizons with LINQ, WPF,…
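The extensibility point on this slide (new filters and parsers added without built-in types) is the kind of thing .NET reflection supports. The sketch below shows one possible way to discover and create a filter by name at run time. The IFilterSketch interface, ThinnerSketch class, and FilterRegistrySketch class are hypothetical, and the slides do not say that the system's FilterFactory works this way.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical plug-in contract; the real system's interfaces are not shown in the slides.
public interface IFilterSketch
{
    string Name { get; }
}

// Hypothetical plug-in that a third party could ship in its own assembly.
public sealed class ThinnerSketch : IFilterSketch
{
    public string Name => "Thinner";
}

public static class FilterRegistrySketch
{
    // Scans an assembly for concrete IFilterSketch implementations that have a
    // public parameterless constructor, instantiates them, and returns the
    // first one whose Name matches. Returns null when no filter matches.
    public static IFilterSketch Create(Assembly assembly, string name)
    {
        return assembly.GetTypes()
            .Where(t => typeof(IFilterSketch).IsAssignableFrom(t)
                        && t.IsClass && !t.IsAbstract
                        && t.GetConstructor(Type.EmptyTypes) != null)
            .Select(t => (IFilterSketch)Activator.CreateInstance(t))
            .FirstOrDefault(f => f.Name == name);
    }
}
```

Because the lookup works on whatever types an assembly contains, adding a filter in this sketch means shipping a new class rather than changing the core library.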

Page 19

Future work

Transferring files from a remote server is just one example of a DataProvider.

• Extend the architecture for new types of data providers

LINQ technology will make data access from C# much more elegant.

Development of easy-to-use data management applications for the proposed approach.

Development of an innovative visualization system, highly extensible and customizable

• Or integrate our approach with an existing one

Page 20

Future visualization system

[Diagram: a hierarchy of views (View 1, View 1.1, View 1.1.1, View 1.2) with attached controls (Control A, Control B, Control C, Control D)]

Step 1. Choose object of interest

Step 2. Choose data transform

Step 3. Choose visualization algorithm

Example

Page 21

Questions?

Visit: http://microsoft.cs.msu.su/projects/uvs

Mail to: [email protected]