IS220: Database Fundamentals

LECTURE2: DATABASE ENVIRONMENT

Ref. Main: “Chapter2” + parts from “Chapter 15” from “Database Systems: A Practical Approach to Design, Implementation and Management.” Thomas Connolly, Carolyn Begg.

Chapter Objectives

In this chapter you will learn:

• The purpose of the three-level database architecture.
• The contents of the external, conceptual, and internal levels.
• The purpose of the external/conceptual and the conceptual/internal mappings.
• The meaning of logical and physical data independence.
• The distinction between a Data Definition Language (DDL) and a Data Manipulation Language (DML).
• A classification of DBMS models.
• The purpose and importance of conceptual modeling.

Main Terms

• Data abstraction

• Schemas and Instances

• Three-level Schema Architecture

• Mapping

• Data Independence

• Data Models

• Database system development lifecycle

• Classification of DBMSs.

Data abstraction

• One fundamental characteristic of the database approach is that it provides some level of data abstraction.
• Data abstraction generally refers to the suppression of details of data organization and storage, and the highlighting of the essential features for an improved understanding of data.
• Data abstraction enables different users to perceive data at their preferred level of detail.

THE THREE LEVEL SCHEMA ARCHITECTURE

Three-level Architecture

The goal of the three-schema architecture is to separate user applications from the physical database.

1. The external or view level

• Includes a number of external schemas or user views (the ways users perceive the data).
• Each external schema describes the part of the database that is relevant to a particular user.

2. Conceptual Level

• It has a conceptual schema (the logical structure of the entire database), which describes the structure of the whole database for a community of users.
• Describes what data is stored in the database and the relationships among the data.
• It concentrates on describing entities, data types, relationships, user operations, and constraints.

3. Internal Level

• It has an internal schema (the way the DBMS and OS perceive the data).
• It is the physical representation of the database on the computer.
• Describes how the data is stored in the database. It contains the definitions of stored records, the methods of representation, the data fields, and the indexes and storage structures used.

Schemas and Instances

• In any data model, it is important to distinguish between the description of the database and the database itself.

• Schema (intension)
  • The description of the database. It rarely changes.
  • When we define a new database, we specify its schema: the structure, data types, and constraints that describe the database.
  • A displayed schema is called a schema diagram.
  • Each object in the schema is called a schema construct.

• Instance (database state / extension)
  • The actual data in the database at a particular point in time.
  • Changes frequently.
  • When data is first loaded into the database, the database is said to be in its initial state.
  • Each write operation (insert, delete, modify) changes the current state of the database into a new state.
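The schema/instance distinction can be sketched with Python's standard sqlite3 module standing in for a DBMS. This is only an illustration: the `student` table, its columns, and the sample rows are hypothetical and not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema (intension): defined once, rarely changes.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def schema(conn):
    """Return the stored table definitions (the schema)."""
    return [row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]

def instance(conn):
    """Return the current rows (the database state)."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()

schema_before = schema(conn)
empty_state = instance(conn)                      # no data loaded yet

conn.execute("INSERT INTO student VALUES (1, 'Sara')")    # initial load
initial_state = instance(conn)

conn.execute("INSERT INTO student VALUES (2, 'Omar')")    # each write yields a new state
conn.execute("UPDATE student SET name = 'Sarah' WHERE id = 1")
new_state = instance(conn)

schema_after = schema(conn)                       # unchanged by the writes
```

The writes change the instance three times, while the schema read back from the catalog stays the same throughout.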

[Figure: Example of a schema and an instance]

Mapping

In a DBMS based on the three-schema architecture, the DBMS must transform a request specified on an external schema into a request against the conceptual schema, and then into a request on the internal schema for processing over the stored database.

The processes of transforming requests and results between levels are called mappings.
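A rough way to see an external/conceptual mapping at work is an SQL view: the user queries the view, and the DBMS rewrites that request against the underlying table. The sketch below assumes SQLite through Python's sqlite3 module; the `staff` table, the `staff_directory` view, and all column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the full (hypothetical) staff relation.
    CREATE TABLE staff (
        staff_no TEXT PRIMARY KEY,
        fname    TEXT,
        lname    TEXT,
        salary   REAL
    );
    -- External level: one user's view, with renamed and derived columns.
    CREATE VIEW staff_directory AS
        SELECT staff_no AS sno, fname || ' ' || lname AS full_name
        FROM staff;
    INSERT INTO staff VALUES ('S1', 'Ann', 'Beech', 12000.0);
    INSERT INTO staff VALUES ('S2', 'John', 'White', 30000.0);
""")

# Request expressed on the external schema ...
via_view = conn.execute(
    "SELECT full_name FROM staff_directory WHERE sno = 'S2'").fetchall()

# ... and the equivalent request written directly against the conceptual schema.
via_base = conn.execute(
    "SELECT fname || ' ' || lname FROM staff WHERE staff_no = 'S2'").fetchall()
```

Both requests return the same row; the view definition is what carries the mapping between the two levels. The further conceptual/internal mapping (to pages and indexes) is handled inside the DBMS and is not visible here.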

[Figure: Illustrating Example]

Reasons for Separations?

The objective of the three-level architecture is to separate each user’s view of the database from the way the database is physically represented. There are several reasons why this separation is desirable:

• Each user should be able to access the data, but have a different customized view of the data.
• The DBA should be able to change the database storage structures without affecting the users’ views.
• The internal structure of the database should be unaffected by changes to the physical aspects of storage, such as a change to a new storage device.

DATA INDEPENDENCE

Data Independence

The three-level architecture provides data independence, which means that upper levels are unaffected by changes to lower levels.

Data independence is the ability to modify a schema definition at one level without affecting a schema definition at the next higher level.

There are two kinds of data independence:

• Logical Data Independence
• Physical Data Independence

Logical Data Independence

• Refers to the immunity of external schemas to changes in the conceptual schema.
• Conceptual schema changes (e.g. addition/removal of entities) should not require changes to external schemas or rewrites of application programs.
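As a small sketch of logical data independence, assume an application that reads only through a view. The conceptual schema is then extended with a new column and a new table, and the application's query is rerun unchanged. SQLite via Python's sqlite3 is used as a stand-in DBMS; `staff`, `branch`, `staff_names`, and `APP_QUERY` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL);
    -- External schema used by the application.
    CREATE VIEW staff_names AS SELECT staff_no, name FROM staff;
    INSERT INTO staff VALUES ('S1', 'Ann', 12000.0);
""")

# The application's query is written once, against the view only.
APP_QUERY = "SELECT staff_no, name FROM staff_names ORDER BY staff_no"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual schema change: a new attribute and a new entity are added.
conn.executescript("""
    ALTER TABLE staff ADD COLUMN branch_no TEXT;
    CREATE TABLE branch (branch_no TEXT PRIMARY KEY, city TEXT);
""")

# Same query text, same result: the external schema was not touched.
after = conn.execute(APP_QUERY).fetchall()
```

Removing something the view depends on would of course require redefining the view; the point is that the application sees only the view, so the fix stays in the mapping rather than in the program.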

Physical Data Independence

• Refers to the immunity of the conceptual schema to changes in the internal schema.
• Internal schema changes (e.g. using different file organizations or storage structures/devices) should not require changes to the conceptual or external schemas.
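Physical data independence can be illustrated in the same hedged way: adding an index changes how the data is accessed internally, but not the table definition, the query text, or the result. The sketch assumes SQLite via Python's sqlite3; the `staff` table and the index name `idx_staff_name` are hypothetical, and the exact wording of the query plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL);
    INSERT INTO staff VALUES ('S1', 'Ann', 12000.0);
    INSERT INTO staff VALUES ('S2', 'John', 30000.0);
""")

QUERY = "SELECT staff_no, salary FROM staff WHERE name = 'John'"

def plan(conn, sql):
    """Return SQLite's access plan for sql as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)   # last column is the plan detail

rows_before, plan_before = conn.execute(QUERY).fetchall(), plan(conn, QUERY)

# Internal schema change: a new storage structure (an index) is added.
conn.execute("CREATE INDEX idx_staff_name ON staff (name)")

rows_after, plan_after = conn.execute(QUERY).fetchall(), plan(conn, QUERY)
```

Only the plan differs: after the change it mentions the index, while the query and its rows are identical.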

[Figure: Data Independence and the Three-Level Architecture]

Database Language

A database language consists of two parts: a Data Definition Language (DDL) and a Data Manipulation Language (DML). The DDL is used to specify the database schema, and the DML is used to both read and update the database.
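In SQL, the two roles can be told apart by statement type. The following is a minimal sketch using Python's sqlite3 module; the `course` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: specifies the schema.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")

# DML (update side): inserts, modifies, and deletes data.
conn.execute("INSERT INTO course VALUES ('IS220', 'Database Fundamentals')")
conn.execute("INSERT INTO course VALUES ('XX000', 'Placeholder course')")
conn.execute("UPDATE course SET title = 'Renamed course' WHERE code = 'XX000'")
conn.execute("DELETE FROM course WHERE code = 'XX000'")

# DML (read side): retrieves data.
courses = conn.execute("SELECT code, title FROM course").fetchall()
```

`CREATE TABLE` touches only the description of the database, while `INSERT`, `UPDATE`, `DELETE`, and `SELECT` operate on its contents.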

DATA MODELS

Data Model

• A data model is a collection of concepts that can be used to describe the structure of a database.
• By structure of a database we mean the data types, relationships, and constraints that apply to the data.
• Purpose
  • To represent data in an understandable way.

Database system development lifecycle

As a database system is a fundamental component of the larger organization-wide information system, the database system development lifecycle is inherently associated with the lifecycle of the information system.

The stages of the database system development lifecycle are shown in the following figure:

[Figure: Stages of the database system development lifecycle]

Database Design

Database design has three main phases: conceptual, logical, and physical design.

• Conceptual database design – to build the conceptual representation of the database, which includes identification of the important entities, relationships, and attributes.
• Logical database design – to translate the conceptual representation to the logical structure of the database, which includes designing the relations.
• Physical database design – to decide how the logical structure is to be physically implemented (as base relations) in the target Database Management System (DBMS).

Conceptual Data Model

Conceptual Database Design: The process of constructing a model of the data used in an enterprise, independent of all physical considerations.

The conceptual data model includes an ER diagram and a data dictionary.

To build the conceptual data model:

Step 1.1 Identify entity types

Step 1.2 Identify relationship types

Step 1.3 Identify and associate attributes with entity or relationship types

Step 1.4 Determine attribute domains

Step 1.5 Determine candidate, primary, and alternate key attributes

Step 1.6 Check model for redundancy

Step 1.7 Validate conceptual model against user transactions

Step 1.8 Review conceptual data model with user

Logical Data Model

• Logical Database Design: The process of constructing a model of the data used in an enterprise based on a specific data model (e.g. relational), but independent of a particular DBMS and other physical considerations.
• To build and validate the logical data model (for the relational model):
  • Step 2.1 Derive relations for logical data model
  • Step 2.2 Validate relations using normalization: the process of organizing data to minimize redundancy, for example by dividing large tables into smaller (and less redundant) tables and defining relationships between them
  • Step 2.3 Validate relations against user transactions
  • Step 2.4 Check integrity constraints
  • Step 2.5 Review logical data model with user
  • Step 2.6 Check for future growth
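The idea behind Step 2.2 (dividing a large table into smaller, less redundant ones) can be sketched in plain Python. The data and the names `staff_branch`, `staff`, and `branch` are hypothetical; the sketch assumes each branch number determines exactly one city.

```python
# One wide table: the branch city is repeated for every staff member.
staff_branch = [
    ("S1", "Ann",  "B1", "London"),
    ("S2", "John", "B1", "London"),
    ("S3", "Mary", "B2", "Glasgow"),
]

# Decompose into two smaller tables linked by branch_no.
staff = [(sno, name, bno) for sno, name, bno, _city in staff_branch]
branch = sorted({(bno, city) for _sno, _name, bno, city in staff_branch})

# The relationship between the tables lets us rebuild the original rows.
city_of = dict(branch)
rejoined = [(sno, name, bno, city_of[bno]) for sno, name, bno in staff]
```

After the split, each city is stored once per branch instead of once per staff member, and joining the two tables on `branch_no` loses no information.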

Physical Data Model

• Physical Database Design: The process of producing a description of the implementation of the database on secondary storage.
• The physical database design phase allows the designer to make decisions on how the database is to be implemented.
• Therefore, physical design is tailored to a specific DBMS.
• To build the physical data model:
  • Step 3.1 Translate logical data model for target DBMS
  • Step 3.2 Design file organizations and indexes
  • Step 3.3 Design user views
  • Step 3.4 Design security mechanisms
  • Step 3.5 Denormalization and controlled redundancy: the process of attempting to optimise the read performance of a database, for example by adding attributes to a relation from another relation with which it will be joined.
  • Step 3.6 Monitor and tune the operational system
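Step 3.5 can be illustrated with a small sketch, again assuming SQLite through Python's sqlite3 and hypothetical `staff` and `branch` tables. The branch city is copied into `staff` so that a frequent read no longer needs a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch (branch_no TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT,
                        branch_no TEXT REFERENCES branch(branch_no));
    INSERT INTO branch VALUES ('B1', 'London'), ('B2', 'Glasgow');
    INSERT INTO staff VALUES ('S1', 'Ann', 'B1'), ('S2', 'John', 'B2');
""")

# Normalized read: needs a join to find each staff member's city.
with_join = conn.execute("""
    SELECT s.staff_no, b.city
    FROM staff s JOIN branch b ON b.branch_no = s.branch_no
    ORDER BY s.staff_no
""").fetchall()

# Denormalize: duplicate the city attribute into staff (controlled redundancy).
conn.executescript("""
    ALTER TABLE staff ADD COLUMN branch_city TEXT;
    UPDATE staff SET branch_city =
        (SELECT city FROM branch WHERE branch.branch_no = staff.branch_no);
""")

# Denormalized read: same answer from a single table.
without_join = conn.execute(
    "SELECT staff_no, branch_city FROM staff ORDER BY staff_no").fetchall()
```

The trade-off is that every later change to a branch's city must also update the copies in `staff`, which is why the redundancy has to be controlled.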

CLASSIFICATION OF DBMS’S

Classification or models of DBMSs

1. First generation

• Network, Hierarchical

2. Second generation

• Relational

3. Third generation

• Object-oriented, Object-relational

First Generation

Network Data Model

• A model that allows a record to participate in multiple parent/child relationships.
• Child records can have multiple parents (M:N relationships).

Hierarchical Data Model

• Each parent record can have many children, but each child record has only one parent (1:M relationships).
• Tree-like structure.
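The structural difference between the two first-generation models can be sketched with plain Python dictionaries. The record names below are hypothetical, and real hierarchical and network DBMSs use their own record and set types rather than dictionaries.

```python
# Hierarchical model: every child record has exactly one parent (a tree).
parent_of = {
    "Sales": "Company",
    "IT": "Company",
    "Ann": "Sales",
    "John": "IT",
}

def path_to_root(node):
    """Follow the single parent link from a record up to the root."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

# Network model: a child record may have several parents (a graph).
parents_of = {
    "Ann": {"Sales", "ProjectX"},
    "John": {"IT", "ProjectX"},
}

def children_of(parent):
    """List every record that has the given record among its parents."""
    return sorted(c for c, ps in parents_of.items() if parent in ps)

ann_path = path_to_root("Ann")
project_members = children_of("ProjectX")
```

In the tree, `Ann` is reachable through one path only; in the network structure she belongs to both `Sales` and `ProjectX`, which the hierarchical model cannot express without duplicating her record.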

Disadvantages of hierarchical and network DBMSs:

1. Required complex programs for even simple queries.

2. Minimal data independence.

3. No widely accepted theoretical foundation.

Second Generation

Relational Data Model:

A database in which all data is stored in relations, which are tables with rows and columns.

Each table is composed of records (called tuples), and each record is identified by a field (an attribute containing a unique value).
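A minimal sketch of a relation whose tuples are identified by a unique attribute, assuming SQLite via Python's sqlite3 and a hypothetical `student` table. Declaring `id` as the primary key makes the DBMS reject a second tuple with the same value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation: a table with named columns; id is the identifying attribute.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Sara')")   # one tuple (row)

# A second tuple with the same identifying value is rejected.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Omar')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The unique value is enough to find exactly one tuple.
row = conn.execute("SELECT id, name FROM student WHERE id = 1").fetchone()
```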

Advantages of Relational model

The benefits of a database that has been designed according to the relational model are numerous. Some of them are:

1. Data entry, updates and deletions will be efficient.
2. Data retrieval, summarization and reporting will also be efficient.
3. Since much of the information is stored in the database rather than in the application, the database is somewhat self-documenting.
4. Changes to the database schema are easy to make.

Third Generation

Object-oriented Data Model

• Response to increasing complexity of DB applications
