Database 2 - WordPress.com...We created a database on a server (survey.cs.unicam.it) and uploaded...

Database 2 Diego Cervellini Riccardo Pancotti

Database 2

Diego Cervellini, Riccardo Pancotti

General Index

● Introduction to Data Warehousing
● Initial goals
● Data Warehousing phases
● Obtained reports
● Required indexes
● Conclusions

First of all - What is a Database?

● A database is an organized collection of data
● Data are organized in models so that they can be queried easily
● The most important aspects are accuracy, availability, usability and resilience
● It is not well suited to detailed analysis aimed at planning and decision making
● Possible solution? Data Warehousing

What is Data Warehousing?

● Data Warehousing consists of a set of methods, tools and technologies that assist knowledge workers in carrying out data analysis.

● It can start from:
  ○ an existing corporate database
  ○ the company information systems
  ○ data coming from outside the company

Data Warehouse

A Data Warehouse works as a repository used for reporting and analysis.

It has the following characteristics:
● oriented to the subject of interest
● integrated and consistent
● representative of temporal evolution
● non-volatile

Benefits of Data Warehouse

● Maintain data history
● Integrate data from multiple source systems
● Provide a single data model for all data
● Improve data quality
● Restructure the data so that it delivers excellent query performance

OLAP vs OLTP

OLTP (On-line Transactional Processing)

● Transactions that read/write a small number of tuples from/to many tables connected by simple relations

● The workload core is "frozen", no interactivity.

OLAP (On-line Analytical Processing)

● Dynamic and multidimensional analysis.

● Works better with huge amounts of data, summing up the performance of an enterprise.

● Interactivity is essential
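The contrast between the two workloads can be sketched with a small in-memory example. This is only an illustration: the `exams` table, its columns and its rows are hypothetical and are not taken from ESSE3, and SQLite stands in for whichever database engine is actually used.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exams (student_id INTEGER, course TEXT, year INTEGER, mark INTEGER)"
)
conn.executemany(
    "INSERT INTO exams VALUES (?, ?, ?, ?)",
    [
        (1, "Databases", 2010, 28),
        (2, "Databases", 2010, 24),
        (1, "Algorithms", 2011, 30),
        (3, "Algorithms", 2011, 26),
    ],
)

# OLTP-style access: a short transaction that touches a handful of tuples.
with conn:
    conn.execute(
        "UPDATE exams SET mark = 29 WHERE student_id = 1 AND course = 'Databases'"
    )

# OLAP-style access: an aggregate over the whole table, grouped by dimensions.
olap_rows = conn.execute(
    "SELECT course, year, COUNT(*), AVG(mark) FROM exams "
    "GROUP BY course, year ORDER BY course"
).fetchall()
```

The first statement is typical of operational use (one record is changed), while the second summarizes every row along the course and year dimensions, which is the kind of query a data warehouse is built to serve.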

Initial goals

The initial goals of our course were:

● Creation of a data warehouse from the ESSE3 database

● Data extraction to obtain indexes

● Report creation

Phases in Data Warehousing

Major phases in Data Warehousing:

● Extraction

● Cleaning

● Transformation

● Loading

Tools used

● SquirrelSQL & DBeaver: SQL clients used to analyze ESSE3

● Pentaho Suite: open source BI suite with ETL and reporting capabilities

● MySQLWorkbench: database design and administration tool, used to manage our local repository

Extraction

In this phase, relevant data are extracted from the data sources.

The choice of the data to be extracted is mainly based on their quality.

Our Extraction

What have we done?
We downloaded some useful tables from the ESSE3 database, according to our goals and the suggestions of the ESSE3 developers.

Tools used:
● SquirrelSQL to obtain the SQL structure of the DB
● Pentaho suite to download the tables from ESSE3 to our local database.
● MySQLWorkbench to create and manage our local database.

Cleaning

Cleaning is used to improve the quality of the data sources.
It's about deleting and/or leaving out:
● duplicate data
● missing data
● inconsistencies between logically associated values
● ...
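The three cases listed above can be sketched as a single filtering pass. This is a minimal sketch under stated assumptions: the record fields (`student_id`, `course`, `mark`, `passed`) are hypothetical, and the consistency rule (a passed exam carries a mark of at least 18) is an assumption chosen for the example, not a rule taken from ESSE3.

```python
def clean(records):
    """Drop duplicates, rows with missing values, and inconsistent rows."""
    seen = set()
    cleaned = []
    for rec in records:
        key = (rec["student_id"], rec["course"])
        if key in seen:  # duplicate data
            continue
        if rec["mark"] is None:  # missing data
            continue
        # Inconsistency between logically associated values: in this sketch,
        # the "passed" flag must agree with the mark (assumed threshold: 18).
        if rec["passed"] != (rec["mark"] >= 18):
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Hypothetical sample rows, one for each case.
records = [
    {"student_id": 1, "course": "Databases", "mark": 28, "passed": True},
    {"student_id": 1, "course": "Databases", "mark": 28, "passed": True},    # duplicate
    {"student_id": 2, "course": "Databases", "mark": None, "passed": False}, # missing
    {"student_id": 3, "course": "Databases", "mark": 15, "passed": True},    # inconsistent
    {"student_id": 4, "course": "Databases", "mark": 25, "passed": True},
]
kept_ids = [rec["student_id"] for rec in clean(records)]
```

Only the rows for students 1 and 4 survive; the other three are left out, one for each reason in the list.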

Our Cleaning

What have we done?
We removed all data inserted before 2008, because they are not useful for our purposes.

Tools used:
● MySQLWorkbench to delete all unnecessary data.
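A date-based cut of this kind could look roughly like the following. The `enrolments` table and its `inserted_on` column are hypothetical names, the rows are invented, and SQLite is used only so that the sketch runs on its own; the actual work was done in MySQLWorkbench.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: dates are stored as ISO strings, so they compare correctly as text.
conn.execute("CREATE TABLE enrolments (student_id INTEGER, inserted_on TEXT)")
conn.executemany(
    "INSERT INTO enrolments VALUES (?, ?)",
    [(1, "2006-10-01"), (2, "2007-12-31"), (3, "2008-01-01"), (4, "2011-09-15")],
)

# Remove every row inserted before 2008 in a single transaction.
with conn:
    deleted = conn.execute(
        "DELETE FROM enrolments WHERE inserted_on < ?", ("2008-01-01",)
    ).rowcount

remaining = [
    row[0]
    for row in conn.execute("SELECT student_id FROM enrolments ORDER BY student_id")
]
```

Running the delete inside one transaction means that a failure halfway through leaves the table unchanged.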

Transformation

Transformation converts data from the operational source format to that of the DW. The correspondence with the source level is complicated by the presence of distinct, heterogeneous sources, which require a complex integration phase.

Our Transformation

What have we done?
● We changed the engine of the tables (from the Oracle one to InnoDB).
● We created indexes for each table.
● We linked the tables by creating the foreign keys.

Tools used:
● MySQLWorkbench to manage the table changes
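The index and foreign key steps can be illustrated as follows. This is a sketch only: the `students` and `exams` tables are hypothetical, and SQLite is used as a self-contained stand-in, so the engine change to InnoDB (which is specific to MySQL) is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables linked by a foreign key.
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, nationality TEXT)")
conn.execute(
    "CREATE TABLE exams ("
    " id INTEGER PRIMARY KEY,"
    " student_id INTEGER NOT NULL REFERENCES students(id),"
    " mark INTEGER)"
)
# Index on the column used for joins and lookups.
conn.execute("CREATE INDEX idx_exams_student ON exams (student_id)")

conn.execute("INSERT INTO students VALUES (1, 'IT')")
conn.execute("INSERT INTO exams VALUES (1, 1, 27)")

# An exam that points to a non-existent student is rejected by the foreign key.
try:
    conn.execute("INSERT INTO exams VALUES (2, 99, 30)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
```

Once the foreign key is in place, the database itself refuses rows that would break the link between the two tables.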

Loading

The loading of data into the DW:

● Refresh: DW data are written in full, replacing the previous ones (technique used to originally populate the DW)

● Update: only changes occurring in source data are added to the DW (technique used for the periodic update of the DW)
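The difference between the two loading techniques can be sketched with plain dictionaries keyed by a primary key. The function names and the sample rows are hypothetical, and the sketch deliberately ignores rows deleted from the source, which a real incremental load would also have to handle.

```python
def refresh(warehouse, source):
    """Refresh: rewrite the warehouse in full from the source."""
    warehouse.clear()
    warehouse.update(source)


def update(warehouse, source):
    """Update: apply only the rows that are new or changed in the source."""
    changed = {key: row for key, row in source.items() if warehouse.get(key) != row}
    warehouse.update(changed)
    return changed


# Hypothetical data: primary key -> row content.
source = {1: "a", 2: "b"}
dw = {}

refresh(dw, source)            # initial population of the DW
after_refresh = dict(dw)

source[2] = "c"                # a changed row
source[3] = "d"                # a new row
changed = update(dw, source)   # periodic update touches only these two rows
```

After the refresh the DW mirrors the source in full; the later update moves only the rows with keys 2 and 3.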

Our Loading

What have we done?
We created a database on a server (survey.cs.unicam.it) and uploaded our "clean" and modified tables there.

Tools used:
● MySQLWorkbench to re-create indexes and foreign keys
● Pentaho suite to upload tables to the server

Obtained Reports

We analyzed our cleaned tables to retrieve useful data that can influence the decision-making process.

In this way we could provide useful information about Unicam, making decision planning easier and faster.

Obtained Reports

1. Situation of first-year exams in some faculties

2. Percentage of foreign students out of total students

3. Situation of exams for Italian and foreign students

4. Average marks of Italian and foreign students
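As an illustration of how a report such as the percentage of foreign students could be derived, the query below groups students by enrolment year. The `students` table, its columns, the `'IT'` nationality code and the rows are all hypothetical; the real reports were built on the ESSE3-derived tables, whose structure is not shown in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, invented for illustration only.
conn.execute("CREATE TABLE students (id INTEGER, enrol_year INTEGER, nationality TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        (1, 2008, "IT"), (2, 2008, "IT"), (3, 2008, "FR"), (4, 2008, "IT"),
        (5, 2009, "IT"), (6, 2009, "DE"),
    ],
)

# Share of non-Italian students per enrolment year, as a percentage.
# In SQLite the comparison yields 1 or 0, so SUM counts the foreign students.
report = conn.execute(
    "SELECT enrol_year, 100.0 * SUM(nationality <> 'IT') / COUNT(*) "
    "FROM students GROUP BY enrol_year ORDER BY enrol_year"
).fetchall()
```

Each row of `report` pairs a year with the percentage of foreign students enrolled in that year.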

First year exams Pharmacy

First year exams Computer Science

First year exams Law faculty

Passed exams by Italian students

Passed exams by foreign students

Italian students marks average

Foreign students marks average

Percentage of foreign students on total from 2008

Percentage of foreign students on total by year

Calculating Indexes

One of the goals of our course was to calculate two different indexes for the FFO (Fondo di finanziamento ordinario).

● A1: Atot = RAP * ( KA + KT )
  ○ RAP: active students
  ○ KT: region wealth function (0,98)
  ○ KA: number of teachers / courses (0,85)

● A2: University's weighted CFU / National weighted CFU

A1-Index

RAP = 5.092
KT = 0,98
KA = 0,85
National Atot = ?

Atot = RAP*(KA+KT) = 9318,36

A1 = Local Atot/National Atot = ?
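The arithmetic above can be checked with a few lines of Python. The function names are hypothetical, and "5.092" is read as 5092 (a thousands separator), which is the reading consistent with the Atot value on the slide. Since the national Atot is unknown here, A1 itself is defined but not evaluated.

```python
from decimal import Decimal

# Values from the slide; "5.092" is read as 5092 (thousands separator).
RAP = Decimal("5092")  # active students
KT = Decimal("0.98")   # region wealth function
KA = Decimal("0.85")   # number of teachers / courses


def a_tot(rap, ka, kt):
    """Atot = RAP * (KA + KT)."""
    return rap * (ka + kt)


def a1(local_atot, national_atot):
    """A1 = local Atot / national Atot (the national value is not in the slides)."""
    return local_atot / national_atot


local_atot = a_tot(RAP, KA, KT)
```

`Decimal` is used so that the result matches the slide's two decimal places exactly, without binary floating point rounding.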

A2-Index

Acquired CFU = 171.058
Expected CFU = 294.178
MNG = 0,43
National Weighted CFU = ?

PCFU = Expected CFU/Acquired CFU = 1,719755872
Weight = PCFU/MNG = 3,999432261
Weighted CFU = Weight*Acquired CFU = 684134,88372093

A2 = Local Weighted CFU/National Weighted CFU = ?
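The chain of calculations for the local weighted CFU can be reproduced as follows. The variable names are hypothetical, and "171.058" and "294.178" are read as 171058 and 294178 (thousands separators), which is the reading consistent with the results on the slide. The national weighted CFU is unknown, so A2 itself is not computed.

```python
# Values from the slide, with the thousands separators removed.
acquired_cfu = 171058
expected_cfu = 294178
mng = 0.43

pcfu = expected_cfu / acquired_cfu    # PCFU = Expected CFU / Acquired CFU
weight = pcfu / mng                   # Weight = PCFU / MNG
weighted_cfu = weight * acquired_cfu  # Weighted CFU = Weight * Acquired CFU

# Algebraically the acquired CFU cancels out: Weighted CFU = Expected CFU / MNG.
shortcut = expected_cfu / mng
```

With these definitions the acquired CFU affects only the intermediate values PCFU and Weight, not the final weighted CFU.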

Conclusions

● We did not manage to build a proper data warehouse, only a collection of data marts and some reports based on them.

● We faced many problems due to the inconsistency of the ESSE3 database and its documentation, which was not always clear or helpful.

● On the other hand, we obtained useful reports and learned how to work as a team on such a "problematic" task.

THANKS FOR YOUR ATTENTION!