Implementation of DB Ex -1


RWTH Exercise 1 Solutions


Implementation of Database

Exercise 1

Tanmaya Mahapatra

Matriculation Number : 340959

[email protected]

Bharath Rangaraj

Matriculation Number : 340909

[email protected]

Manasi Jayapal

Matriculation Number : 340892

[email protected]

October 28, 2013

1 Exercise 1.1 : Database Architecture

1.1 Name each of the five layers in the database architecture specified in the lecture, explain the concepts handled in each layer, and the interfaces between layers.

The five layers in the Database Architecture are :

• Logical Data Structures

• Logical Access Paths/Structures

• Storage Structures

• Propagation Control/Page Assignment

• File Services/Memory Assignment Structures

The important concepts handled in each layer, together with the interfaces between the layers, are shown in Table 1.


Table 1: Concepts & Interfaces in Different Layers of Database.

Layer Name: Logical Data Structures

Concepts Handled (Roles Played)

• Prepares Query Execution Plan: Translates & Optimizes Queries.

– Addressing Units

∗ Tables

∗ Views

∗ Tuples

– Auxiliary Structures

∗ External Schema Description

– Addressing Units

∗ External Records

∗ Sets

∗ Keys

∗ Access Paths

Interface Between Layers

• Users interact using a Set-oriented Interface (SQL)


Layer Name: Logical Access Paths/Structures

Concepts Handled (Roles Played)

• Manages Cursors

• Sorts Components

• Manages Dictionary

Interface Between Layers

• The Higher Layer interacts using a Record-Oriented DB Interface

Layer Name: Storage Structures

Concepts Handled (Roles Played)

• Manages Records & Indexes

• Consists of Auxiliary Structures like Page Indexes.

• Consists of Addressing Units like Pages & Segments

Interface Between Layers

• Internal Record Interface (Stores Records in B* Trees)


Layer Name: Propagation Control/Page Assignment

Concepts Handled (Roles Played)

• Manages Buffer & Segments

• Consists of Auxiliary Structures like Page Tables, Block Tables etc.

• Consists of Addressing Units like Blocks & Files.

Interface Between Layers

• DB Buffer Interface


Layer Name: File Services/Memory Assignment Structures

Concepts Handled (Roles Played)

• Manages Files & External Memory.

• Consists of Addressing Units like tracks, cylinders & channels.

• Consists of Auxiliary Structures like File Catalogues, Free-Placement etc.

Interface Between Layers

• Memory Assignment Structures

• File Interface (Manages read/write Block)


1.2 The following tasks belong to different layers, sort them so that they match the architecture top-down.

1. Buffering

2. Logical Relation and Cursor Management

3. Media Access

4. Access Path Management

5. View Formulation and Management

Solution

1. Buffering =⇒ Propagation Control/Page Assignment

2. Logical Relation and Cursor Management =⇒ Logical Access Paths/Structures

3. Media Access =⇒ File Services/Memory Assignment Structures

4. Access Path Management =⇒ Storage Structures

5. View Formulation and Management =⇒ Logical Data Structures

Tasks sorted so that they match the architecture (top-down)

1. View Formulation and Management

2. Logical Relation and Cursor Management

3. Access Path Management

4. Buffering

5. Media Access
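The sorting step above can be sketched in a few lines of Python. This is only an illustration: the layer numbers (L5 at the top, L1 at the bottom) follow the numbering used in the scenario of Section 1.3, and the names `TASK_LAYER` and `sort_top_down` are hypothetical helpers, not part of the exercise.

```python
# Task -> (layer number, layer name). Numbering is an assumption taken from
# the scenario in Section 1.3: L5 is the topmost layer, L1 the lowest.
TASK_LAYER = {
    "Buffering": (2, "Propagation Control/Page Assignment"),
    "Logical Relation and Cursor Management": (4, "Logical Access Paths/Structures"),
    "Media Access": (1, "File Services/Memory Assignment Structures"),
    "Access Path Management": (3, "Storage Structures"),
    "View Formulation and Management": (5, "Logical Data Structures"),
}


def sort_top_down(task_layer):
    """Return the task names ordered from the highest layer to the lowest."""
    return sorted(task_layer, key=lambda task: task_layer[task][0], reverse=True)


ordered = sort_top_down(TASK_LAYER)
for position, task in enumerate(ordered, start=1):
    number, layer = TASK_LAYER[task]
    print(f"{position}. {task} (L{number}: {layer})")
```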

1.3 What does data independence mean? Why is it an important feature of database systems? Give examples for how data independence is achieved in the five-layered architecture!

Data Independence

Data Independence means application programs are insulated from changes in the way the data is structured and stored. Data independence is achieved through use of the three levels of data abstraction; in particular, the conceptual schema and the external schema provide distinct benefits in this area. There are 2 types of Data Independence:


Logical Data Independence: The ability to change the logical schema without changing the external schema is called Logical Data Independence.

Physical Data Independence: The ability to change the physical schema without changing the logical schema is called Physical Data Independence.

Important Feature

Data Independence is regarded as an important feature of Database Systems because:

• It facilitates improvements and changes to individual layers without affecting other layers.

• Applications are insulated from how data are structured and stored.

• It provides protection from changes in logical structure of Data.

• It provides protection from changes in physical structure of Data.

Examples

1. Data independence is achieved by making each layer an abstraction of the layer below it, that is, layer i+1 is an abstraction of layer i. The data representation is unique to each layer, and the data of each layer is hidden from the other layers.

2. The layers communicate by invoking abstract methods. During communication the data must be converted to the layer-specific form, since each layer has its own data representation that is hidden from the other layers.

Layers: Abstraction Provided

Logical Data Structures: Position indicator and explicit relations in the table.

Logical Access Paths: Number and kind of physical access paths and internal representation of tables.

Storage Structures: Management of Buffers and Logging.

Page Assignment Structures: File Mapping and Indirect Page Assignment.

Memory Assignment Structures: Technical features and technical details of external media.


Scenario: To understand how Data Independence is achieved in the 5-layer model, we consider a scenario in which a user/program wants some specific information about a particular Book and Author.

1. The user interacts with the topmost layer of the DB (L5) using SQL. The user has no idea where or how the data is actually stored.

SELECT B.Title, A.Author FROM Books B, Author A WHERE B.aid = A.aid;

2. L5 prepares a Query Execution Plan, whose operations are invoked directly at the L4 interface.

3. The L4 layer plans a sort/merge join for the query evaluation. Sorted objects have to be created explicitly in L4 before the join can be processed.

4. The index scans, which deliver the records to be sorted, fetch them via physical access paths and storage structures managed by L3.

5. L3 offers a variety of storage structures which physically embody the indexes or other types of access paths.

6. The functionality provided by L3 needs to refer to the physical data of the DB. The DB Buffer, managed by L2, acts as an interface to the DB on external devices and provides access to pages based on logical page references.

7. L1 encapsulates the number, type and location of external devices. Together with the OS file management, it actually fetches the data stored on the storage disk.

The different levels perform different operations without knowing what the other layers are actually doing. The user submitting the query does not know that the simple SQL statement is broken down into a Query Execution Plan (QEP) and further lower-level operations, finally leading to very low-level data access mechanisms. This demonstrates "Data Independence" in the 5-layered Database Model.

• The addition or removal of entities, attributes or relationships in the conceptual schema should be possible without having to change existing external schemas or having to rewrite existing application programs.

• A change to the internal schema, such as using a different file organization, storage structure, storage device or indexing strategy, should be possible without having to change the conceptual or external schema.
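The two bullets above can be tried out with a small sketch using Python's built-in sqlite3 module. The table contents, the index name `idx_books_aid` and the added `year` column are made up for illustration; only the Books/Author query comes from the scenario. The point is that the application's query text and result stay the same after a physical change (a new index) and after a logical change (a new attribute).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Books  (bid INTEGER PRIMARY KEY, Title TEXT, aid INTEGER);
    CREATE TABLE Author (aid INTEGER PRIMARY KEY, Author TEXT);
    -- Hypothetical sample rows, not taken from the exercise.
    INSERT INTO Author VALUES (1, 'Author A'), (2, 'Author B');
    INSERT INTO Books  VALUES (10, 'Book X', 1), (11, 'Book Y', 2);
""")

# The application's query; it never changes below.
QUERY = """
    SELECT B.Title, A.Author
    FROM Books B, Author A
    WHERE B.aid = A.aid
    ORDER BY B.Title
"""

before = conn.execute(QUERY).fetchall()

# Physical change: add an access path (index) on the join column.
conn.execute("CREATE INDEX idx_books_aid ON Books(aid)")
after_index = conn.execute(QUERY).fetchall()

# Logical change: add a new attribute to the Books relation.
conn.execute("ALTER TABLE Books ADD COLUMN year INTEGER")
after_column = conn.execute(QUERY).fetchall()

print(before == after_index == after_column)
```

Because the query names its columns explicitly instead of using `SELECT *`, the added attribute does not leak into the application's result.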


2 Exercise 1.2 : Query Languages

The following relations are given:

• lives(pname,city,street), which contains for every person the location(s) where he or she lives,

• works(pname,cname,salary), which contains for every person the name of the company that the person works for, as well as his or her salary,

• located(cname,city), which contains the locations for every company (i.e. a company can be located in more than one city),

• boss(pname,mname), which contains the persons that are supervised by a manager.

Formulate the following queries as expressions in relational algebra, tuple relational calculus, domain relational calculus and SQL:

2.1 Find the names of all persons who are working in the same company as their boss and get in this company a higher salary than their boss.

2.1.1 Relational Algebra

2.1.2 Tuple Relational Calculus

{w.pname | works(w) ∧ ∃b ∃m (boss(b) ∧ works(m) ∧ b.pname = w.pname ∧ b.mname = m.pname ∧ m.cname = w.cname ∧ w.salary > m.salary)}

2.1.3 Domain Relational Calculus

{pname | (∃cname)(∃salary)(∃mname)(∃msalary)(<pname, cname, salary> ∈ works ∧ <pname, mname> ∈ boss ∧ <mname, cname, msalary> ∈ works ∧ salary > msalary)}


2.1.4 SQL

SELECT J.PNAME
FROM BOSS B JOIN WORKS J ON B.PNAME = J.PNAME
WHERE J.SALARY > (SELECT MAX(W.SALARY) FROM WORKS W
                  WHERE W.PNAME = B.MNAME AND W.CNAME = J.CNAME);

/* ANOTHER METHOD */

SELECT E.EPNAME
FROM WORKS W INNER JOIN
     (SELECT PNAME AS EPNAME, CNAME AS ECNAME, SALARY AS ESALARY, MNAME AS EMNAME
      FROM WORKS NATURAL JOIN BOSS) E
ON W.PNAME = E.EMNAME AND W.CNAME = E.ECNAME AND W.SALARY < E.ESALARY;

Listing 1: SQL query for finding the names of persons who work in the same company as their boss and earn a higher salary than their boss
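To check a query of this kind, it can be run against a tiny in-memory database with Python's sqlite3 module. The rows below are invented for illustration, and the query is a self-join sketch of the same idea (employee, boss link, and the boss's own works row in the same company), not necessarily the exact formulation intended by the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE works (pname TEXT, cname TEXT, salary INTEGER);
    CREATE TABLE boss  (pname TEXT, mname TEXT);
    -- Hypothetical sample data.
    INSERT INTO works VALUES
        ('alice', 'IBM', 5000), ('bob',  'IBM', 4000),
        ('carol', 'SAP', 3000), ('dave', 'SAP', 3500),
        ('erin',  'IBM', 6000);
    INSERT INTO boss VALUES
        ('alice', 'bob'),    -- same company, alice earns more
        ('carol', 'dave'),   -- same company, carol earns less
        ('erin',  'dave');   -- erin earns more, but different company
""")

EARNS_MORE_THAN_BOSS = """
    SELECT DISTINCT w.pname
    FROM works w
    JOIN boss  b ON b.pname = w.pname
    JOIN works m ON m.pname = b.mname AND m.cname = w.cname
    WHERE w.salary > m.salary
    ORDER BY w.pname
"""

result = [row[0] for row in conn.execute(EARNS_MORE_THAN_BOSS)]
print(result)
```

Only the first boss pair satisfies both conditions, so the other two rows act as negative test cases for the salary and the company condition respectively.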

2.2 Find the names of all persons who work for at least two different companies (≥2)

2.2.1 Relational Algebra

1. Take the Cartesian product of the “works” table with itself, store the result in a table called “companies”, and rename the fields to avoid duplicate names:

ρ(companies(1→pname1, 2→cname1, 3→salary1, 4→pname2, 5→cname2, 6→salary2), works × works)

2. Persons working for at least 2 different companies:

π_{pname1}(σ_{(pname1 = pname2) ∧ (cname1 ≠ cname2)}(companies))

2.2.2 Tuple Relational Calculus

{w.pname | works(w) ∧ ∃x (works(x) ∧ x.pname = w.pname ∧ x.cname ≠ w.cname)}

2.2.3 Domain Relational Calculus

{<pname> | (∃cname1)(∃salary1)(∃cname2)(∃salary2)(<pname, cname1, salary1> ∈ works ∧ <pname, cname2, salary2> ∈ works ∧ cname1 ≠ cname2)}

2.2.4 SQL


SELECT T.PNAME FROM WORKS T
GROUP BY T.PNAME HAVING COUNT(DISTINCT T.CNAME) >= 2;

Listing 2: SQL query for finding the names of all persons working for at least 2 different companies
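The difference between counting rows and counting distinct companies can be seen on a small example. The following sketch uses Python's sqlite3 module with invented rows; in particular it assumes that a person may have two works rows for the same company, which the exercise text does not rule out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE works (pname TEXT, cname TEXT, salary INTEGER);
    -- Hypothetical sample data: bob has two rows, but both at IBM.
    INSERT INTO works VALUES
        ('alice', 'IBM', 3000), ('alice', 'SAP', 2000),
        ('bob',   'IBM', 4000), ('bob',   'IBM', 1000),
        ('carol', 'SAP', 3500);
""")

# Counts distinct companies per person.
by_companies = [row[0] for row in conn.execute("""
    SELECT pname FROM works
    GROUP BY pname
    HAVING COUNT(DISTINCT cname) >= 2
    ORDER BY pname
""")]

# Counts rows per person, which also admits bob.
by_rows = [row[0] for row in conn.execute("""
    SELECT pname FROM works
    GROUP BY pname
    HAVING COUNT(*) >= 2
    ORDER BY pname
""")]

print(by_companies, by_rows)
```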

2.3 Find the names of the persons with the highest salary (Note: There might be several persons with the same salary).

2.3.1 Relational Algebra

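1. Find all salaries that are smaller than some other salary:

π_{works.salary}(σ_{works.salary < d.salary}(works × ρ_d(works)))

2. Find the largest salary:

π_{salary}(works) − π_{works.salary}(σ_{works.salary < d.salary}(works × ρ_d(works)))

3. Find the names of the persons who earn the largest salary by joining works with the result of step 2:

π_{pname}(works ⋈ (π_{salary}(works) − π_{works.salary}(σ_{works.salary < d.salary}(works × ρ_d(works)))))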

2.3.2 Tuple Relational Calculus

{w.pname | works(w) ∧ ¬∃x (works(x) ∧ x.salary > w.salary)}

2.3.3 Domain Relational Calculus

{pname | (∃cname)(∃salary)(<pname, cname, salary> ∈ works ∧ ¬(∃p)(∃c)(∃s)(<p, c, s> ∈ works ∧ s > salary))}

2.3.4 SQL

SELECT PNAME FROM WORKS W WHERE
(W.SALARY IN (SELECT MAX(SALARY) FROM WORKS));

Listing 3: SQL query for finding the names of the persons with the highest salary.
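Both the aggregate query and the set-difference idea from the relational algebra solution can be checked with Python's sqlite3 module. The rows are invented, with two persons sharing the top salary to cover the note in the question; `EXCEPT` is used here as the SQL counterpart of the algebra's difference operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE works (pname TEXT, cname TEXT, salary INTEGER);
    -- Hypothetical sample data: alice and bob share the highest salary.
    INSERT INTO works VALUES
        ('alice', 'IBM', 6000),
        ('bob',   'SAP', 6000),
        ('carol', 'IBM', 4000);
""")

# Aggregate formulation.
top_earners = [row[0] for row in conn.execute("""
    SELECT DISTINCT pname FROM works
    WHERE salary = (SELECT MAX(salary) FROM works)
    ORDER BY pname
""")]

# Set-difference formulation: all salaries minus those smaller than another.
top_salary = [row[0] for row in conn.execute("""
    SELECT salary FROM works
    EXCEPT
    SELECT w.salary FROM works w, works d WHERE w.salary < d.salary
""")]

print(top_earners, top_salary)
```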


2.4 Find the names of all companies that are located in cities in which 'IBM' is not located.

2.4.1 Relational Algebra

1. Find the cities in which “IBM” is located:

π_{city}(σ_{cname='IBM'}(located))

2. Select the companies located in cities where IBM is not located:

π_{cname}(located ⋈ (π_{city}(located) − π_{city}(σ_{cname='IBM'}(located))))

2.4.2 Tuple Relational Calculus

{l.cname | located(l) ∧ ¬∃i (located(i) ∧ i.cname = 'IBM' ∧ i.city = l.city)}

2.4.3 Domain Relational Calculus

{cname | (∃city)(<cname, city> ∈ located ∧ <'IBM', city> ∉ located)}

2.4.4 SQL

SELECT CNAME FROM LOCATED L WHERE L.CITY NOT IN
(SELECT CITY FROM LOCATED WHERE CNAME = 'IBM');

/* The query below is case-insensitive */

SELECT CNAME FROM LOCATED L
WHERE
(LOWER(TRIM(L.CITY)) NOT IN
(SELECT LOWER(TRIM(CITY)) FROM LOCATED
 WHERE LOWER(TRIM(CNAME)) = LOWER(TRIM('IBM'))));

Listing 4: SQL query for finding the names of all companies that are located in cities in which 'IBM' is not located.
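A quick way to try this query is an in-memory database via Python's sqlite3 module. The city and company rows are invented for illustration. The sketch also shows a `NOT EXISTS` variant, which gives the same result here and is often preferred because `NOT IN` returns no rows at all if the subquery yields a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE located (cname TEXT, city TEXT);
    -- Hypothetical sample data.
    INSERT INTO located VALUES
        ('IBM',   'Aachen'), ('IBM',   'Berlin'),
        ('SAP',   'Aachen'), ('SAP',   'Munich'),
        ('Acme',  'Berlin'), ('Bosch', 'Cologne');
""")

not_in = [row[0] for row in conn.execute("""
    SELECT DISTINCT cname FROM located
    WHERE city NOT IN (SELECT city FROM located WHERE cname = 'IBM')
    ORDER BY cname
""")]

not_exists = [row[0] for row in conn.execute("""
    SELECT DISTINCT l.cname FROM located l
    WHERE NOT EXISTS (SELECT 1 FROM located i
                      WHERE i.cname = 'IBM' AND i.city = l.city)
    ORDER BY l.cname
""")]

print(not_in, not_exists)
```

SAP qualifies through its Munich location even though it also shares Aachen with IBM, which matches the reading of the question used in Listing 4.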
