Data Integration with CloverETL
Fariz Darari (FU Bolzano) [email protected]
Outline
1. Information Integration
2. CloverETL
3. Demo
– Global Schema
– Data Sources
– Queries
Information Integration
Information integration (II) aims to provide uniform access to data stored in a number of autonomous and heterogeneous sources.
Challenges
• Different data models (structured, semi-structured, text)
• Different schemata
• Differences in the representation of
– values (km vs. miles, USD vs. EUR)
– entities (addresses, dates, etc.)
• Inconsistencies among the data
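The representation differences above can be illustrated with a minimal Python sketch. It assumes distances arrive tagged with a unit and dates arrive in one of a few known formats; the helper names `to_km` and `to_iso_date` and the list of date formats are hypothetical and not part of the original slides.

```python
from datetime import datetime

KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def to_km(value, unit):
    """Convert a distance to kilometres; unit is 'km' or 'mi'."""
    if unit == "km":
        return float(value)
    if unit == "mi":
        return value * KM_PER_MILE
    raise ValueError(f"unknown unit: {unit}")


def to_iso_date(text, formats=("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")):
    """Try each assumed source format and return the date as YYYY-MM-DD."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognised date: {text}")


# Two sources that report the same facts in different representations.
distance_a = to_km(10, "mi")
distance_b = to_km(16.09344, "km")
date_a = to_iso_date("31/12/2011")
date_b = to_iso_date("2011-12-31")
```

Once both sources are normalised to the same unit and date format, their values can be compared directly. Currency conversion (USD vs. EUR) would follow the same pattern but needs an exchange rate, which is left out here.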
Components
• An information integration system consists of:
1. Global Schema
The unifying schema among the local schemata.
2. Wrappers
Make the sources accessible.
3. Mediators
Translate queries and combine the answers of wrappers and other mediators.
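A minimal Python sketch of how wrappers and a mediator might fit together. The class names `CsvWrapper`, `ListWrapper`, and `Mediator`, the `rows()` method, and the sample records are all hypothetical; the slides do not prescribe an interface.

```python
import csv
import io


class CsvWrapper:
    """Hypothetical wrapper: exposes CSV text as a list of dict rows."""

    def __init__(self, text):
        self.text = text

    def rows(self):
        return list(csv.DictReader(io.StringIO(self.text)))


class ListWrapper:
    """Hypothetical wrapper over rows that are already in memory."""

    def __init__(self, rows):
        self._rows = rows

    def rows(self):
        return [dict(r) for r in self._rows]


class Mediator:
    """Sends the same selection to every wrapper and unions the answers."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, predicate):
        answers = []
        for wrapper in self.wrappers:
            answers.extend(r for r in wrapper.rows() if predicate(r))
        return answers


# Invented sample data, for illustration only.
csv_source = CsvWrapper("name,age\nAnna,24\nBruno,21\n")
mem_source = ListWrapper([{"name": "Carla", "age": "23"}])
mediator = Mediator([csv_source, mem_source])
older = mediator.query(lambda r: int(r["age"]) > 22)
```

Because every wrapper offers the same `rows()` method, the mediator does not need to know whether a source is a file or an in-memory list.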
Information Integration - GAV
• An approach to mapping between the source schemata and the global schema
• GAV (Global-as-View) = relations in the global schema are defined as views over the sources
• Views are virtual relations, so the global schema describes a virtual DB
CloverETL
• An open-source-based platform for information integration.
• Data can be:
– extracted from any number of sources
– validated and modified along the way
– written to one or more destinations.
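The extract, validate/modify, and write steps can be sketched in plain Python. This is not the CloverETL API; the function names, the validation rule (age must be numeric), and the sample records are assumptions chosen only to show the flow.

```python
import csv
import io


def extract(text):
    """Extract: yield one dict per CSV record."""
    yield from csv.DictReader(io.StringIO(text))


def validate_and_modify(records):
    """Transform: drop records with a non-numeric age, cast the rest."""
    for rec in records:
        if rec["age"].strip().isdigit():
            yield {"name": rec["name"].strip(), "age": int(rec["age"])}


def load(records, destinations):
    """Load: append every record to each destination; return the count."""
    count = 0
    for rec in records:
        for dest in destinations:
            dest.append(rec)
        count += 1
    return count


# Invented input with one invalid record ("n/a" is not a number).
raw = "name,age\nAnna,24\nBruno,n/a\nCarla,23\n"
warehouse, audit_log = [], []
written = load(validate_and_modify(extract(raw)), [warehouse, audit_log])
```

The generators pass records along one at a time, which loosely mirrors data flowing along edges from sources to targets.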
CloverETL - Designer
• Transformation graphs are created in CloverETL Designer.
• Transformation graphs are divided into:
– Extract (Green)
– Transformation (Yellow)
– Load (Blue)
• The edges correspond to the data flows from data sources to data targets.
Global Schema - Example
• Student(sid, sname, age, nationality)
• Country(cid, cname, currency)
Data Sources
• Unibz (Bolzano), from Relational DB
– StudentBZ(id, name, sex, age, nationality, address)
• Unitr (Trento), from XML
– StudentTR(id, full_name, age, nationality)
• Unimi (Milan), from CSV
– StudentMI(student_id, name, gender, age, citizenship)
• UN (United Nations), from Excel
– CountryUN(id, country_name, population, capital, currency)
Data Sources - Mapping
• Student(sid, sname, age, nationality) :- StudentBZ(sid, sname, _, age, nationality, _)
• Student(sid, sname, age, nationality) :- StudentTR(sid, sname, age, nationality)
• Student(sid, sname, age, nationality) :- StudentMI(sid, sname, _, age, nationality)
• Country(cid, cname, currency) :- CountryUN(cid, cname, _, _, currency)
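The four mapping rules can be read as a union of projections over the sources. A minimal Python sketch, assuming each source has already been loaded as a list of tuples in the column order given on the Data Sources slide; the sample rows are invented and the function names `student` and `country` are hypothetical.

```python
# Invented sample rows; population is left as None because it is not needed.
student_bz = [(1, "Anna", "F", 24, "Italy", "Via Roma 1")]
student_tr = [(2, "Bruno", 21, "Switzerland")]
student_mi = [(3, "Carla", "F", 23, "Italy")]
country_un = [
    (10, "Italy", None, "Rome", "EUR"),
    (11, "Switzerland", None, "Bern", "CHF"),
]


def student():
    """Global relation Student(sid, sname, age, nationality).

    Each rule contributes one projection; the relation is their union.
    """
    rows = [(sid, sname, age, nat)
            for (sid, sname, _sex, age, nat, _addr) in student_bz]
    rows += [(sid, sname, age, nat)
             for (sid, sname, age, nat) in student_tr]
    rows += [(sid, sname, age, nat)
             for (sid, sname, _gender, age, nat) in student_mi]
    return rows


def country():
    """Global relation Country(cid, cname, currency)."""
    return [(cid, cname, cur)
            for (cid, cname, _pop, _capital, cur) in country_un]
```

The underscore-prefixed names play the role of `_` in the rules: source columns that the global schema does not expose. Nothing is stored for `Student` or `Country`; they are computed from the sources on demand, which is what makes the global database virtual.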
Queries
1. All students with their information.
q(sid, sname, age, nationality) :- Student(sid, sname, age, nationality).
2. All students whose age is more than 22.
q(sid, sname) :- Student(sid, sname, age, nationality), age > 22.
3. All students with their nationality’s currency.
q(sid, sname, age, nationality, currency) :- Student(sid, sname, age, nationality), Country(cid, nationality, currency).
4. The number of students per country.
SELECT nationality, count(sid) FROM Student
GROUP BY nationality
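To try the four queries, the global schema can be materialised in an in-memory SQLite database. This is only a sketch: the rows are invented, queries 1 to 3 are SQL translations of the rules above (an assumption, since the slides give them in rule form), and the `ORDER BY` clauses are added solely to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student(sid INTEGER, sname TEXT, age INTEGER, nationality TEXT);
CREATE TABLE Country(cid INTEGER, cname TEXT, currency TEXT);
INSERT INTO Student VALUES
  (1, 'Anna', 24, 'Italy'),
  (2, 'Bruno', 21, 'Switzerland'),
  (3, 'Carla', 23, 'Italy');
INSERT INTO Country VALUES
  (10, 'Italy', 'EUR'),
  (11, 'Switzerland', 'CHF');
""")

# 1. All students with their information.
q1 = conn.execute(
    "SELECT sid, sname, age, nationality FROM Student ORDER BY sid"
).fetchall()

# 2. All students whose age is more than 22.
q2 = conn.execute(
    "SELECT sid, sname FROM Student WHERE age > 22 ORDER BY sid"
).fetchall()

# 3. All students with their nationality's currency
#    (join on Country.cname = Student.nationality, as in the rule).
q3 = conn.execute(
    "SELECT s.sid, s.sname, s.age, s.nationality, c.currency "
    "FROM Student s JOIN Country c ON c.cname = s.nationality "
    "ORDER BY s.sid"
).fetchall()

# 4. The number of students per country (SQL as given on the slide).
q4 = conn.execute(
    "SELECT nationality, count(sid) FROM Student "
    "GROUP BY nationality ORDER BY nationality"
).fetchall()
```

Query 4 is written in SQL on the slide because counting per group is an aggregation, which the plain rule notation used for queries 1 to 3 does not express.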
Demo
• Query:
q(sid, sname) :- Student(sid, sname, age, nationality), age > 22.
• Logical Plans:
q(sid, sname) :- StudentBZ(sid, sname, _, age, nationality, _), age > 22.
q(sid, sname) :- StudentTR(sid, sname, age, nationality), age > 22.
q(sid, sname) :- StudentMI(sid, sname, _, age, nationality), age > 22.
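The step from the query to its logical plans is an unfolding: the `Student` atom is replaced by the body of each mapping rule. A minimal Python sketch that handles only this single-atom case; the `unfold` function and the way the mappings are encoded are hypothetical, not how CloverETL represents them.

```python
# GAV mapping for Student: head variables and, per source, the body atom.
STUDENT_HEAD = ("sid", "sname", "age", "nationality")
STUDENT_VIEWS = [
    ("StudentBZ", ("sid", "sname", "_", "age", "nationality", "_")),
    ("StudentTR", ("sid", "sname", "age", "nationality")),
    ("StudentMI", ("sid", "sname", "_", "age", "nationality")),
]


def unfold(query_head, query_args, condition):
    """Replace the Student atom by each source atom, renaming variables.

    query_args are the variables the query uses in its Student atom,
    in the same positions as STUDENT_HEAD.
    """
    renaming = dict(zip(STUDENT_HEAD, query_args))
    plans = []
    for source, args in STUDENT_VIEWS:
        # "_" is not in the renaming, so it stays an anonymous variable.
        body_args = ", ".join(renaming.get(a, a) for a in args)
        plans.append(f"{query_head} :- {source}({body_args}), {condition}.")
    return plans


plans = unfold("q(sid, sname)",
               ("sid", "sname", "age", "nationality"),
               "age > 22")
for plan in plans:
    print(plan)
```

Running it prints the three logical plans listed above, one per source. The union of their answers is the answer to the original query over the global schema.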