Facilitating Data Integration: A Data Conversion Function Service

Mario Martínez Gómez
Supervisor: Mariano Cilia
Prof. Alejandro P. Buchmann


Page 1: Facilitating Data Integration:  A Data Conversion Function Service

Facilitating Data Integration: A Data Conversion Function Service

Mario Martínez Gómez

Supervisor: Mariano Cilia

Prof. Alejandro P. Buchmann

Page 2: Facilitating Data Integration:  A Data Conversion Function Service

Contents Motivation Language Infrastructure Conclusions & Future Work

Contents

● Motivation
● Proposed Approach
  – A Conversion Function Definition Language
  – The Conversion Manager Service
● Conclusion & Future Work

Facilitating Data Integration: A Conversion Function Manager Service


Page 3: Facilitating Data Integration:  A Data Conversion Function Service

Motivation (I)

● Current information revolution: everything is information-oriented.
● Computers and applications collaborate with each other by exchanging data.
● The traditional user's role is assumed by computers and applications.
● Human beings can derive some data context
  – Computers cannot!


Page 4: Facilitating Data Integration:  A Data Conversion Function Service

Motivation (II)

● Data exchange requires a common vocabulary
● But a common vocabulary is not sufficient
  – Applications working at different locations may assume different contexts about data
  – Hence data conversion is needed (Conversion Functions)


Page 5: Facilitating Data Integration:  A Data Conversion Function Service

Many Vocabularies. Many Contexts

[Figure: the vocabularies of IBERIA, Lufthansa and KLM name the same concept differently (Price vs. PricePerUnit), and participants sit in different contexts (EUROPE, USA). Example tuples shown: <Price, 100, "€"> and <PricePerUnit, 100, "€">.]

Page 6: Facilitating Data Integration:  A Data Conversion Function Service

Motivation (III)

● Previous approaches:
  – Common context implicitly assumed everywhere
  – Data conversion functions scattered among participating applications
    ● Same conversion functions written many times and in many programming languages
    ● New participants with a previously unknown context may impact the applications
  – Data conversion functions attached to the exchanged data
    ● Data and behaviour together


Page 7: Facilitating Data Integration:  A Data Conversion Function Service

Our Approach

● Eliminate Conversion Functions (CFs) from participants' applications
● Offer Conversion Functions as a service (CFM)
  – CFs are written only once
  – CFs are reusable
  – Simplify the definition of CFs according to a classification
● Minimisation of the transmitted data


Page 8: Facilitating Data Integration:  A Data Conversion Function Service

[Figure: an application in CONTEXT EUROPE sends <PricePerUnit, 12.34, "EURO">. On the CONTEXT USA side, convert is invoked on the CFM Service with <12.34, "€", "US$">; the value 19 appears next to this call. A common vocabulary is assumed.]

Page 9: Facilitating Data Integration:  A Data Conversion Function Service

Our Approach (II)

● Two main aspects we take into account:
  – Specification of data conversions (Language)
  – Conversion Manager service's infrastructure


Page 10: Facilitating Data Integration:  A Data Conversion Function Service

Language

● Simple language oriented to the specification of CFs
  – Homogenize the description of CFs
  – Avoid rewriting the same function in different programming languages
    ● Reducing development and maintenance costs
● Different kinds of conversion:
  – Mathematical transformation-based (time independent)
  – Time dependent
  – Lossy transformations
  – String handling
  – Mapping tables


Page 11: Facilitating Data Integration:  A Data Conversion Function Service

Language Elements

● Tables: mapping among strings (synonyms, implicit graph construction)
● Measurement systems' conversion (e.g. imperial → metric)
  – Graph: specify conversion among elements of systems
  – Scale: conversion within a system
    ● (e.g. m → dm : scale = 10)
  – Bridge: conversion between systems
    ● (e.g. in → m : bridge = 0.0254)

  length {
    #scale 10;
    #default_ctx LC
    km (1000)m dm cm
    mile (63,360)in
    in (0.0254)m
  }

  map length {
    km[LC] kilometer[FN]; dm[LC] decimeter[FN]; in[LC] inch[FN];
    mi[LC] mile[FN]; m[LC] meter[FN]; cm[LC] centimeter[FN]
  }

[Figure: conversion graph with nodes km, m, dm, cm, mi, in and their full names (kilometer, meter, decimeter, centimeter, mile, inch); edge labels 1000, 10, 10, 63,360 and 0,0254.]
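The scale and bridge elements can be read as weighted edges of a conversion graph. The Python sketch below is one possible interpretation and not the service's actual implementation: `EDGES`, `build_graph` and `convert_length` are hypothetical names, `mi` is assumed to be the code for mile, and 63,360 is read as sixty-three thousand three hundred sixty inches per mile.

```python
from collections import deque

# Each entry (a, b, k) reads "1 a = k b"; the factors come from the slide.
EDGES = [
    ("km", "m", 1000.0),    # explicit factor
    ("m", "dm", 10.0),      # #scale 10
    ("dm", "cm", 10.0),     # #scale 10
    ("mi", "in", 63360.0),  # within the imperial system
    ("in", "m", 0.0254),    # bridge between the two systems
]


def build_graph(edges):
    """Store every edge in both directions (factor and its inverse)."""
    graph = {}
    for a, b, k in edges:
        graph.setdefault(a, []).append((b, k))
        graph.setdefault(b, []).append((a, 1.0 / k))
    return graph


def convert_length(value, src, dst, graph=None):
    """Breadth-first search for a path src -> dst, multiplying edge factors."""
    graph = graph or build_graph(EDGES)
    queue = deque([(src, 1.0)])
    seen = {src}
    while queue:
        unit, factor = queue.popleft()
        if unit == dst:
            return value * factor
        for nxt, k in graph.get(unit, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, factor * k))
    raise ValueError(f"no conversion path from {src} to {dst}")
```

With this reading, a conversion between systems (for example `mi` to `m`) simply follows the bridge edge, so no direct mile-to-metre factor has to be written by hand.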

Page 12: Facilitating Data Integration:  A Data Conversion Function Service

Time Independent

General form of a call: convert(functionName, inputValue, sourceCtx, targetCtx)

Conversion Function for the metric system:

  Length {
    #scale 10
    km (1000)m dm cm mm (1000)um
  }

Conversion function call: convert 10 km into m

  convert(Length, 10, km, m) returns 10000

Conversion Function for some temperature units (#input is the input value):

  Temperature {
    #default_ctx LC

    C (( #input - 32 ) * 5 / 9) F
    K (#input + 273.15) C
    RA (#input * 1.8) K
    R (#input * 0.8) C

    export map temp_codes { C[LC] CELSIUS[FN]; KELVIN[FN] K[LC]; R[LC] REAUMUR[FN]; RA[LC] RANKINE[FN] }
  }

Conversion function call: convert 10 celsius into reaumur

  convert(Temperature, 10, CELSIUS, R) returns 8
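The temperature rules can be chained when no direct rule exists. The following Python sketch assumes that each rule line reads "target (formula) source", which is the only reading under which the Fahrenheit formula is correct. `RULES`, `TEMP_CODES` and `convert_temperature` are hypothetical names, and only the four directions given on the slide are available.

```python
# Rules keyed by (source, target), read from the slide as "target (formula) source".
RULES = {
    ("F", "C"): lambda x: (x - 32) * 5 / 9,
    ("C", "K"): lambda x: x + 273.15,
    ("K", "RA"): lambda x: x * 1.8,
    ("C", "R"): lambda x: x * 0.8,
}

# temp_codes map: full names (FN) resolved to letter codes (LC), the default context.
TEMP_CODES = {"CELSIUS": "C", "KELVIN": "K", "REAUMUR": "R", "RANKINE": "RA"}


def to_code(ctx):
    return TEMP_CODES.get(ctx, ctx)


def convert_temperature(value, src_ctx, dst_ctx):
    src, dst = to_code(src_ctx), to_code(dst_ctx)

    def search(unit, x, visited):
        # Depth-first search over the directed rules, applying formulas on the way.
        if unit == dst:
            return x
        for (a, b), formula in RULES.items():
            if a == unit and b not in visited:
                result = search(b, formula(x), visited | {b})
                if result is not None:
                    return result
        return None

    result = search(src, value, {src})
    if result is None:
        raise ValueError(f"no rule chain from {src_ctx} to {dst_ctx}")
    return result
```

Under these assumptions, `convert_temperature(10, "CELSIUS", "R")` follows the single rule `R (#input * 0.8) C`, while a Fahrenheit to Kelvin request goes through Celsius.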

Page 13: Facilitating Data Integration:  A Data Conversion Function Service

Time Dependent

  currency {
    var src_ctx, dst_ctx;

    /* The external currency's conversion service works in 3LC context */
    src_ctx = $$currency[][3LC](#src_ctx);
    dst_ctx = $$currency[][3LC](#dst_ctx);
    #connect(currencyExternService.cfg, [#input, src_ctx, dst_ctx]);

    map currency {
      EUR[3LC] EURO[FN] €[SYMBOL]
      LEU[3LC] LEU[FN] LEU[SYMBOL]
      USD[3LC] DOLLAR[FN] $[SYMBOL]
    }
  }

convert 17 € into $:

  convert(currency, 17, EURO, $)

In this call, 17 is #input, EURO is #src_ctx and $ is #dst_ctx.
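The currency function first normalises both contexts to three-letter codes and then delegates to an external service. A rough Python equivalent is sketched below; `to_3lc`, `convert_currency` and `fake_external_service` are hypothetical names, and the exchange rate inside the stand-in service is invented purely for illustration.

```python
# Rows of the currency map from the slide: (3LC, FN, SYMBOL).
CURRENCY_MAP = [
    ("EUR", "EURO", "€"),
    ("LEU", "LEU", "LEU"),
    ("USD", "DOLLAR", "$"),
]


def to_3lc(ctx):
    """Mimics $$currency[][3LC](ctx): find ctx in any column, return the 3LC code."""
    for row in CURRENCY_MAP:
        if ctx in row:
            return row[0]
    raise KeyError(f"unknown currency context: {ctx}")


def convert_currency(value, src_ctx, dst_ctx, external_service):
    # The external service is called on every request because rates change over time.
    return external_service(value, to_3lc(src_ctx), to_3lc(dst_ctx))


def fake_external_service(value, src, dst):
    """Stand-in for the remote currency service; the rate below is made up."""
    rates = {("EUR", "USD"): 1.25}
    return value * rates[(src, dst)]
```

The design point this illustrates is that callers may use any synonym (`EURO`, `€`, `EUR`), while the external service only ever sees the 3LC form.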


Page 14: Facilitating Data Integration:  A Data Conversion Function Service

Lossy Transformations

  map clothing_size {
    ^0-10[ESP] S[EUR] 4[USA]
    _11-20[ESP] M[EUR] 6[USA]
    =21-40[ESP] L[EUR] 8[USA]
    41-60[ESP] X[EUR] 10[USA]
  }

  clothing_size {
    $$clothing_size[#src_ctx][#dst_ctx](#input)
  }

convert X from context EUR into context ESP:

  convert(clothing_size,X,EUR,ESP) returns 41-60

convert 4 from context USA into context ESP:

  convert(clothing_size,4,USA,ESP) returns 10
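The mapping-table lookup can be sketched in Python as follows. The slide does not define the range prefixes, so their meaning here is an assumption: `^` is taken to return the upper bound (consistent with the second example, which returns 10), `_` is assumed to return the lower bound, and `=` and unprefixed ranges are returned as written. All function names are hypothetical.

```python
# Rows of the clothing_size map, keyed by context; prefixes kept as on the slide.
ROWS = [
    {"ESP": "^0-10", "EUR": "S", "USA": "4"},
    {"ESP": "_11-20", "EUR": "M", "USA": "6"},
    {"ESP": "=21-40", "EUR": "L", "USA": "8"},
    {"ESP": "41-60", "EUR": "X", "USA": "10"},
]


def _parse(cell):
    """Split a cell into (prefix, text); prefix is '^', '_', '=' or ''."""
    if cell[0] in "^_=":
        return cell[0], cell[1:]
    return "", cell


def _matches(cell, value):
    """A value matches a cell literally, or by falling inside its numeric range."""
    _, text = _parse(cell)
    if text == value:
        return True
    if "-" in text and value.isdigit():
        lo, hi = text.split("-")
        return int(lo) <= int(value) <= int(hi)
    return False


def _render(cell):
    prefix, text = _parse(cell)
    if prefix == "^":
        return text.split("-")[1]  # upper bound (matches the slide's example)
    if prefix == "_":
        return text.split("-")[0]  # lower bound (assumed)
    return text                    # '=' and unprefixed: as written (assumed)


def convert_size(value, src_ctx, dst_ctx):
    for row in ROWS:
        if _matches(row[src_ctx], value):
            return _render(row[dst_ctx])
    raise ValueError(f"{value} not found in context {src_ctx}")
```

The lossy nature shows up on a round trip: under these assumptions an ESP size of 15 maps to `M`, and `M` maps back to 11 rather than 15.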


Page 15: Facilitating Data Integration:  A Data Conversion Function Service

String Handling Features

Conversion function among date formats:

  date {
    #separators "/", "."
    #allowed_ctx "USA", "GER", "SPA"

    var month, day, year;
    year = $2;

    if (#src_ctx = "USA") { month = $0; day = $1 }
    else { month = $1; day = $0 }

    if (#dst_ctx = "USA") return month "/" day "/" year
    else if (#dst_ctx = "GER") return day "." month "." year
    else if (#dst_ctx = "SPA") return day "/" month "/" year
  }

Converts date input from USA format into GER format:

  convert(date, "07/01/2005", USA, GER) returns 01.07.2005
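For comparison, the same date logic can be written in Python. This is a direct transliteration offered as a sketch; it assumes that `$0`, `$1` and `$2` denote the fields obtained by splitting the input on the declared separators, and `convert_date` is a hypothetical name.

```python
import re

SEPARATORS = ["/", "."]
ALLOWED_CTX = {"USA", "GER", "SPA"}


def convert_date(text, src_ctx, dst_ctx):
    if src_ctx not in ALLOWED_CTX or dst_ctx not in ALLOWED_CTX:
        raise ValueError("unsupported context")
    # Split on any declared separator; the parts play the role of $0, $1, $2.
    pattern = "|".join(re.escape(sep) for sep in SEPARATORS)
    parts = re.split(pattern, text)
    if len(parts) != 3:
        raise ValueError(f"expected three date fields, got {len(parts)}")
    year = parts[2]
    if src_ctx == "USA":
        month, day = parts[0], parts[1]
    else:
        month, day = parts[1], parts[0]
    if dst_ctx == "USA":
        return f"{month}/{day}/{year}"
    if dst_ctx == "GER":
        return f"{day}.{month}.{year}"
    return f"{day}/{month}/{year}"  # SPA
```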

Page 16: Facilitating Data Integration:  A Data Conversion Function Service

Components of the Infrastructure

[Figure: CLIENT SIDE with a CLIENT LIBRARY; SERVER SIDE with a MASTER REPOSITORY.]

● Client-side
  – Caching of Conversion Functions
  – Warm-up cache of Conversion Functions
  – Connection Management
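The client-side caching and warm-up can be pictured with the small Python sketch below. It is an illustration under assumptions, not the actual client library: `ClientLibrary`, `warm_up`, `remote_calls` and `fake_repository` are hypothetical names, and the repository is reduced to a function that returns a conversion function by name.

```python
class ClientLibrary:
    """Hypothetical client-side library that caches conversion functions by name."""

    def __init__(self, fetch):
        self._fetch = fetch    # callable: name -> conversion function (remote call)
        self._cache = {}
        self.remote_calls = 0  # counts round trips to the repository

    def _load(self, name):
        if name not in self._cache:
            self.remote_calls += 1
            self._cache[name] = self._fetch(name)
        return self._cache[name]

    def warm_up(self, names):
        """Preload functions before the first conversion request arrives."""
        for name in names:
            self._load(name)

    def convert(self, name, value, src_ctx, dst_ctx):
        return self._load(name)(value, src_ctx, dst_ctx)


def fake_repository(name):
    """Stand-in for the master repository; knows a single toy function."""
    if name == "Length":
        factors = {("km", "m"): 1000}
        return lambda value, src, dst: value * factors[(src, dst)]
    raise KeyError(name)
```

After `warm_up(["Length"])`, later `convert` calls for `Length` are served from the local cache and trigger no further repository round trips.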


Page 17: Facilitating Data Integration:  A Data Conversion Function Service

Components of the Infrastructure (II)

[Figure: two CLIENT LIBRARY boxes, two FORWARDER boxes and three MASTER REPOSITORY boxes.]

● Server-side
  – Forwarder: transparent connection to servers
    ● Load balancing, no single point of failure
  – Server: warm-up caches, request relay (with/without load)
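One way to picture the forwarder's load balancing and failover is the Python sketch below. The slides do not say which balancing strategy is used, so round-robin with retry on connection errors is an assumption; `Forwarder` and the two toy servers are hypothetical.

```python
import itertools


class Forwarder:
    """Hypothetical forwarder: round-robin over servers, skipping failed ones."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = itertools.cycle(range(len(self._servers)))

    def convert(self, *args):
        last_error = None
        # Try each server at most once per request, starting at the next in turn.
        for _ in range(len(self._servers)):
            server = self._servers[next(self._next)]
            try:
                return server(*args)
            except ConnectionError as exc:
                last_error = exc
        raise ConnectionError("all servers failed") from last_error


def unreachable_server(*args):
    raise ConnectionError("server down")


def healthy_server(name, value, src_ctx, dst_ctx):
    return value * 1000  # toy km -> m conversion only
```

The client only talks to the forwarder, so a failed server is invisible to it as long as one server still answers. Running more than one forwarder, as the figure suggests, would remove the forwarder itself as a single point of failure.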


Page 18: Facilitating Data Integration:  A Data Conversion Function Service

Architecture

[Figure: a CLIENT APPLICATION sends INSERT and CONVERT requests. Access Layer: EJB API, WS API. Functionality Layer: Conversion Manager, Syntax Checker, Code Generator, Conversion Function Repository.]

Page 19: Facilitating Data Integration:  A Data Conversion Function Service

● Conclusions
  – The cost of data integration is minimized
  – Conversion Functions have been unified and formalized
    ● Conversion functions are now defined just once
    ● Conversion-oriented language
  – Infrastructure has been built
    ● Smoothly integrated within the Java platform
● Future Work
  – Administrative interface using the given API
  – Multiple vocabularies support
  – Benchmarks


Page 20: Facilitating Data Integration:  A Data Conversion Function Service
Page 21: Facilitating Data Integration:  A Data Conversion Function Service

Connection File

Web service connection:

  protocol = WS
  service_endpoint = "http://nagoya.apache.org:5049/axis/servlet/AxisServlet"
  operation_name = "doEcho"

EJB connection:

  protocol = EJB
  jndi_factory = "org.jnp.interfaces.NamingContextFactory"
  jndi_provider = "jnp://localhost:1099"
  jndi_factory_pkgs = "org.jboss.naming:org.jnp.interfaces"
  service_endpoint = "ConversionElementAccessBean"
  operation_name = "doEcho"
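A connection file of this shape could be read with a few lines of Python. The sketch assumes one `key = value` pair per line with optional double quotes around the value, which the slide suggests but does not state; `parse_connection_file` is a hypothetical helper and not part of the service's API.

```python
def parse_connection_file(text):
    """Parse 'key = value' lines into a dict; keys are lower-cased, quotes dropped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


EXAMPLE_WS = """
protocol = WS
service_endpoint = "http://nagoya.apache.org:5049/axis/servlet/AxisServlet"
operation_name = "doEcho"
"""
```

A caller could then branch on `settings["protocol"]` to choose between the web service and the EJB access path.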

Page 22: Facilitating Data Integration:  A Data Conversion Function Service

A Complex Conversion Function Invocation

  < FlightOffer, {
      < ClassOfService, "Y", {<ClassOfServiceCode, "OneLetterClassCode">} >,
      < Price, 1430, {<Currency, "USD">, <Scale, 1>} >,
      < FlightSegment, {
          < FlightNumber, 400 >,
          < AirlineIdentifier, "Lufthansa", {<AirlineIdentifierCode, "FullAirlineName">} >,
          < DepartureDate, "Jun 06, 1998", {<DateFormat, "Mon DD, YYYY">} >,
          < DepartureTime, "10:35 AM", {<TimeFormat, "HH:MM AM/PM">} >,
          < DepartureAirport, "FRA", {<AirportIdentifierCode, "ThreeLetterCode">} >,
          < ArrivalAirport, "JFK", {<AirportIdentifierCode, "ThreeLetterCode">} >,
          < ArrivalTime, "01:00 PM", {<TimeFormat, "HH:MM AM/PM">} >,
          < Distance, 3850, {<Unit, "mile">, <Scale, 1>} >
      } >
  } >

  FlightOffer          Complex Object
    ClassOfService     String; DOMAIN = {Y,U,D}
    Price              Real
      Currency         String; DOMAIN = {USD, EUR,...}
      Scale            Integer
    FlightSegment      Complex Object
      FlightNumber     Integer
      ...              ...

  cv(flight_offer, {<Currency, EUR>,<scale,10>})
    == cv(ClassOfService, {<Currency, EUR>,<scale,10>})
     + cv(Price, {<Currency, EUR>,<scale,10>})
     + cv(FlightSegment, {<Currency, EUR>,<scale,10>})
    == ........
       convert(convert(1430, Currency, USD, EUR), Scale, 1, 10)
    == convert(1120, Scale, 1, 10)
    == 112
       ........
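The decomposition of `cv` over a complex object can be sketched as a recursive walk. The Python below is an illustration under several assumptions: tuples stand in for the angle-bracket notation, complex objects carry a list of children, the USD to EUR rate is chosen only so that 1430 maps to 1120 as in the example, and a scale change from 1 to 10 divides the value by 10. `cv`, `convert` and `RATES` mirror names on the slide, but their bodies are hypothetical.

```python
from fractions import Fraction

# Hypothetical rate, picked only to reproduce 1430 USD -> 1120 EUR from the slide.
RATES = {("USD", "EUR"): Fraction(1120, 1430)}


def convert(value, modifier, src, dst):
    """Convert one modifier (Currency or Scale) of a single value."""
    if src == dst:
        return value
    if modifier == "Currency":
        return value * RATES[(src, dst)]
    if modifier == "Scale":
        return value * Fraction(src, dst)
    raise ValueError(f"no conversion function for {modifier}")


def cv(node, target_ctx):
    """Apply target_ctx to every leaf that carries a matching modifier."""
    name, payload, *rest = node
    if isinstance(payload, list):  # complex object: recurse into children
        return (name, [cv(child, target_ctx) for child in payload])
    ctx = dict(rest[0]) if rest else {}
    value = payload
    for modifier, dst in target_ctx.items():
        if modifier in ctx:
            value = convert(value, modifier, ctx[modifier], dst)
            ctx[modifier] = dst
    return (name, value, ctx)


offer = ("FlightOffer", [
    ("ClassOfService", "Y", {"ClassOfServiceCode": "OneLetterClassCode"}),
    ("Price", 1430, {"Currency": "USD", "Scale": 1}),
    ("FlightSegment", [
        ("FlightNumber", 400),
        ("Distance", 3850, {"Unit": "mile", "Scale": 1}),
    ]),
])
```

Calling `cv(offer, {"Currency": "EUR", "Scale": 10})` leaves `ClassOfService` untouched, because it has neither modifier, and applies both nested conversions to `Price`, mirroring the `convert(convert(...))` chain in the example.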