DBMS



UNIT - I

1.Database system

A database system is nothing more than a computer-based record-keeping system, that is, a system whose overall purpose is to record and maintain information. The information concerned can be anything that is deemed to be of significance to the organization the system serves, in other words anything that may help the decision-making processes involved in managing that organization.

The database system involves four major components: data, hardware, software and users.

Fig: Simplified picture of a database system (users and application programs reach the database through the database management system)

Data

The data stored in the system is partitioned into one or more databases. A database is a repository for stored data; it is both integrated and shared.

Integrated: By integrated we mean that the database can be thought of as a unification of several distinct files, with the redundancy among those files eliminated. Example: combination of the EMPLOYEE and ENROLLMENT data files.

Shared: By shared we mean that individual pieces of data in the database can be shared among different users, that is, many users can have access to the same piece of data. Example: the department information in the EMPLOYEE file would be shared by users in the personnel department, the education department, etc.


Hardware

The hardware consists of the secondary storage devices (disks, drums, etc.) on which the database resides, together with the associated devices, control units, channels and so forth.

Software

Between the physical database and the users of the system is a layer of software, usually called the DBMS. All requests from users for access to the database are handled by the DBMS. One general function provided by the DBMS is thus the shielding of database users from hardware-level detail. The DBMS provides a view of the database that is elevated somewhat above the hardware level, and supports user operations that are expressed in terms of that higher-level view.

Users

We consider three broad categories of database users:
*application programmers
*end-users
*DBA

1.Application programmers

The application programmer is responsible for writing application programs that use the database. These application programs operate on the data in all the usual ways: retrieving information, creating new information, and deleting or changing existing information.

2.End-users

End-users access the database from a terminal. An end-user may employ a query language provided as an integral part of the system, or may invoke a user-written application program that accepts commands from the terminal and in turn issues requests to the DBMS on the end-user's behalf.

3.Database Administrator

A DBMS provides central control of both the data and the programs that access those data. The person who has such control over the system is called the DBA. The main functions of the DBA are
*Schema definition
*Storage structure and access-method definition
*Schema and physical-organization modification
*Granting of authorization for data access
*Integrity-constraint specification

These are the various components of a database system.


2.Operational data

A database is a collection of stored operational data used by the application systems of some particular enterprise, where enterprise is a conventional generic term for any reasonably self-contained commercial, scientific, technical or other organization. Examples: manufacturing company, bank, hospital, university, government department, etc. An enterprise must maintain a lot of data about its operation. The "operational data" for the enterprises quoted above are, respectively, product data, account data, patient data, student data and planning data.

Example for the illustration of operational data

Consider a manufacturing company. The enterprise will wish to retain information about the projects it has on hand, the parts used in those projects, the suppliers who supply the parts, the warehouses in which the parts are stored, the employees who work on the projects, and so on. These are the basic entities about which data is recorded in the database (an entity is any distinguishable object). In general there will be associations or relationships linking the basic entities together.

For example, there is an association between suppliers and parts: each supplier supplies certain parts, and conversely each part is supplied by certain suppliers.

Fig: An example of operational data (entities: suppliers, projects, warehouses, parts, employees, locations, departments)

The figure illustrates the following points.

1.Most of the associations are between two entities, but some involve more than two. Ex., the arrow connecting suppliers, parts and projects: supplier S2 supplies part P4 to project J3.

2.The example also shows one arrow involving only one type of entity (parts). Ex., some parts are components of other parts (a screw is a component of a larger assembly such as a chair).

3.Some entities may be associated in more than one relationship. Ex., projects and employees are linked in two relationships:
a. the employee works on the project
b. the employee is the manager of the project

This example illustrates operational data and the relationships within it.

3.Data Independence

The ability to modify a schema definition at one level without affecting the schema definition at the next higher level is called data independence. Most present-day applications are data-dependent. This means that the way in which the data is organized in secondary storage and the way in which it is accessed are both dictated by the requirements of the application, and moreover that knowledge of the data organization and access technique is built into the application logic. For example, if a file is stored in indexed sequential form, the application must know which indexes are defined in order to modify the file. Here the application is data-dependent, and a change to the storage structure requires the application program to be rewritten. In a database system the applications are independent of the way the data is stored, so a modification made at the physical or conceptual level need not affect them.

The two types of data independence are

1.Physical data independence

Physical data independence is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at the physical level are occasionally necessary to improve performance. Example: modifying the storage structure of the database, such as adding an index or changing storage parameters with an ALTER command.
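Physical data independence can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the part table, its columns and the index name part_color_idx are hypothetical, and adding an index stands in for any physical-level change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (part_no TEXT PRIMARY KEY, color TEXT, weight REAL)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [("P1", "RED", 12.0), ("P2", "GREEN", 17.0), ("P3", "RED", 14.0)],
)

def red_parts(connection):
    """Application query: it says what is wanted, not how the data is stored."""
    rows = connection.execute(
        "SELECT part_no FROM part WHERE color = 'RED' ORDER BY part_no"
    )
    return [row[0] for row in rows]

before = red_parts(conn)

# Physical-level change: add an index on color (hypothetical index name).
conn.execute("CREATE INDEX part_color_idx ON part (color)")

after = red_parts(conn)
print(before, after)
```

The application function is not edited between the two calls; only the storage structure changes.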

2.Logical data independence

Logical data independence is the ability to modify the logical schema without causing the application programs to be rewritten.
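A minimal sketch of logical data independence, again using sqlite3. The employee table, its columns and the added salary column are assumptions made for illustration; the point is only that an application which names the columns it needs is not disturbed when the logical schema grows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no TEXT PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES ('E3', 'Hayes', 'D8')")

def employee_department(connection, emp_no):
    """Application query that names exactly the columns it uses."""
    return connection.execute(
        "SELECT name, dept FROM employee WHERE emp_no = ?", (emp_no,)
    ).fetchone()

before = employee_department(conn, "E3")

# Logical-level change: a new column is added to the schema.
conn.execute("ALTER TABLE employee ADD COLUMN salary INTEGER")

after = employee_department(conn, "E3")
print(before, after)
```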

Example: modifications such as adding new columns or fields to the database.

Most of these modifications are done by the DBA, and the types of change that the DBA may wish to make can be explained with the help of the following definitions:


Stored field: A stored field is the smallest unit of data stored in the database. Ex., a database containing information about parts would probably include a stored field type called part number.

Stored record: A stored record is a named collection of associated stored fields.

Stored file: A stored file is the collection of all occurrences of one type of stored record.

Changes to the stored representation of data, such as changing the data type of a stored field, are also made by the DBA. The stored representation may vary in any of the following respects.

1.Representation of numeric data
Data may be stored in internal arithmetic form or as a character string.

2.Representation of character data
A character field may be stored in any of several character codes (e.g. EBCDIC, ASCII).

3.Units for numeric data
The units in a numeric field may change. Ex., from inches to centimeters.

4.Data coding
In some situations it may be desirable to represent data in storage by coded values. Ex., the part color RED may be stored as the code 1.

5.Structure of stored records
Two existing types of stored record may be combined into one. For ex., the record types (part number, color) and (part number, weight) may be integrated to give (part number, color, weight). Also, a single type of stored record may be split into two. For ex., (part number, color, weight) may be broken down into (part number, color) and (part number, weight).

6.Structure of stored files
A given stored file may be physically implemented in storage in a wide variety of ways. For ex., the file may be stored in a single storage volume or spread across several volumes.

Because the DBA can make such changes without affecting existing applications, the database is able to grow without affecting them.
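Items 4 and 5 above can be sketched in a few lines of Python. The code table (RED stored as 1, and so on) and the sample part records are hypothetical; the sketch only shows that coded values and combined or split stored records carry the same information as the originals.

```python
# Hypothetical code table for the part color field (data coding).
COLOR_CODES = {"RED": 1, "GREEN": 2, "BLUE": 3}
COLOR_NAMES = {code: name for name, code in COLOR_CODES.items()}

def combine(color_records, weight_records):
    """Merge (part number, color) and (part number, weight) into one record type."""
    weights = dict(weight_records)
    return [(part_no, color, weights[part_no]) for part_no, color in color_records]

def split(records):
    """Break (part number, color, weight) into the two narrower record types."""
    colors = [(part_no, color) for part_no, color, _ in records]
    weights = [(part_no, weight) for part_no, _, weight in records]
    return colors, weights

def encode(records):
    """Replace each color name by its stored code."""
    return [(part_no, COLOR_CODES[color], weight) for part_no, color, weight in records]

def decode(stored):
    """Translate stored codes back into color names."""
    return [(part_no, COLOR_NAMES[code], weight) for part_no, code, weight in stored]

color_records = [("P1", "RED"), ("P2", "GREEN")]
weight_records = [("P1", 12), ("P2", 17)]

combined = combine(color_records, weight_records)
stored = encode(combined)
print(combined)
print(stored)
```

Splitting the combined records returns the two original record types, and decoding the stored form returns the original color names.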


4.Architecture for a Database system

The architecture is divided into three general levels: internal, conceptual and external.

External level (individual user views)

Conceptual level (community user view)

Internal level (storage view)

Fig: Three levels of architecture

*Internal level (physical level)

This level is the one closest to physical storage. It is a low-level representation of the entire database; it consists of many occurrences of each of many types of internal record. The storage view is described by means of the internal schema, which not only defines the various stored record types but also specifies what indexes exist, how stored files are represented, what physical sequence the stored records are in, and so on.

*Conceptual level (community logical level)

This level is the representation of the entire information content of the database. It consists of many occurrences of each of many types of conceptual record. It is also a level of indirection between the other two levels.

*External level (user logical level)

This level is closest to the users and is concerned with the way the data is seen by the individual users. The users may be application programmers, end-users, the DBA, etc. Each user has a language at his or her disposal to interact with the database. For the application programmer the language will be a conventional programming language such as C++ or Java. For end users the language will be either a query language or some special-purpose language. In each case the language includes a data sub language (DSL), that is, the subset of the total language that is concerned with database objects and operations. The DSL is embedded within the corresponding host language. A given system might support any number of host languages and any number of data sub languages; however, one particular data sub language that is supported by almost all current systems is SQL.


Any given data sub language is a combination of at least two subordinate languages: a data definition language (DDL) and a data manipulation language (DML). The DDL portion consists of declarative constructs and the DML portion consists of executable statements.

The individual user will generally be interested only in some portion of the total database; moreover, that user's view of that portion will generally be somewhat abstract when compared with the way the data is physically stored. The term for an individual user's view is an external view. An external view is thus the content of the database as seen by some particular user.

For example, a user from the Personnel Department might view the details of employees and departments and nothing else.
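The DDL/DML split and the idea of an external view can be sketched with sqlite3, using SQL as the data sub language and Python as the host language. All table, column and view names here are assumptions, as is the choice to hide the salary column from the Personnel view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: declarative definitions of the tables and of one external view.
conn.executescript("""
CREATE TABLE employee (emp_no TEXT PRIMARY KEY, name TEXT, salary INTEGER, dept_no TEXT);
CREATE TABLE department (dept_no TEXT PRIMARY KEY, dept_name TEXT);
CREATE VIEW personnel_view AS
    SELECT e.emp_no, e.name, d.dept_name
    FROM employee AS e JOIN department AS d ON e.dept_no = d.dept_no;
""")

# DML: executable statements that store and retrieve data.
conn.execute("INSERT INTO department VALUES ('D8', 'Personnel')")
conn.execute("INSERT INTO employee VALUES ('E3', 'Hayes', 4000, 'D8')")

cursor = conn.execute("SELECT * FROM personnel_view")
view_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(view_columns, rows)
```

A user who works only through personnel_view sees employee and department details but never the salary column, even though it is stored.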

Detailed System architecture

Fig: Database system architecture. Users A1 and A2 work through external view A, and users B1 and B2 through external view B, each using a host language plus DSL. External schemas A and B describe the external views. External/conceptual mappings A and B relate them to the conceptual view, which is described by the conceptual schema. The conceptual/internal mapping relates the conceptual view to the stored database (internal level), which is described by the storage structure definition (internal schema). Other labels in the figure: user interface, database management system (DBMS).

Mappings

The mappings involved in the architecture are the conceptual/internal mapping and the external/conceptual mappings.

The conceptual/internal mapping defines the correspondence between the conceptual view and the stored database; it specifies how conceptual records and fields are represented at the internal level. If the structure of the stored database is changed, the conceptual/internal mapping must be changed accordingly, so that the conceptual schema can remain invariant. The effects of such changes must be isolated below the conceptual level in order to preserve physical data independence.

The external/conceptual mapping defines the correspondence between a particular external view and the conceptual view.

Database administrator(DBA)

The Data Administrator (DA) is the person who makes the strategic and policy decisions regarding the data of the enterprise, and the DBA is the person who provides the necessary technical support for implementing those decisions. Thus the DBA is responsible for the overall control of the system at the technical level. The major tasks of the DBA are
*defining the conceptual schema (schema definition)
*storage structure and access-method definition
*schema and physical-organization modification
*granting of authorization for data access
*integrity-constraint specification

DBMS

The DBMS is the software that handles all access to the database. It works as follows.

1.A user issues an access request using some particular data sub language.
2.The DBMS intercepts that request and analyses it.
3.The DBMS, in turn, inspects the external schema for that user, the corresponding external/conceptual mapping, the conceptual schema, the conceptual/internal mapping, and the storage structure definition.
4.The DBMS executes the necessary operations on the stored database.

Fig: Major functions and components of the DBMS (labels: source schemas and mappings, planned DML requests, unplanned DML requests, DDL processors, DML processor, query language processor, compiled requests, optimizer, optimized requests, run time manager, source and object schemas and mappings, metadata (data dictionary), database; the DBMS also enforces security and integrity constraints)

5.Distributed databases

The key objective of a distributed system is that it should look like a centralized system to its users. Distributed processing means that distinct machines can be connected together into a communication network such as the Internet, so that a single data-processing task can span several machines in the network. A distributed database is a database that is not stored in its entirety at a single physical location, but rather is spread across a network of computers that are geographically dispersed and connected through communication links.

For example, consider a banking system in which the customer accounts database is distributed across the bank's branch offices, such that each individual customer account record is stored at the customer's local branch. In other words, the data is stored at the location at which it is most frequently used, but is still available through the communication network to users at other locations, for example, users at the bank's central office.
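The banking example can be sketched as a toy Python model. The classes Site and DistributedBank, the catalog dictionary and the branch names are all hypothetical; a real distributed DBMS involves far more (networking, failures, concurrency), so this only shows local storage combined with location-transparent access.

```python
class Site:
    """One machine in the network, holding the accounts of its own branch."""
    def __init__(self, name):
        self.name = name
        self.accounts = {}

class DistributedBank:
    """Presents the accounts of all sites as if they were one database."""
    def __init__(self, sites):
        self.sites = {site.name: site for site in sites}
        self.catalog = {}  # account number -> name of the site that stores it

    def open_account(self, branch, account_no, balance):
        self.sites[branch].accounts[account_no] = balance
        self.catalog[account_no] = branch

    def balance(self, requesting_site, account_no):
        """Return (balance, needed_network); the caller never names the storing site."""
        owner = self.catalog[account_no]
        needed_network = owner != requesting_site
        return self.sites[owner].accounts[account_no], needed_network

bank = DistributedBank([Site("Harrison"), Site("Rye"), Site("Central office")])
bank.open_account("Harrison", "A-102", 400)

local = bank.balance("Harrison", "A-102")         # served at the local branch
remote = bank.balance("Central office", "A-102")  # fetched across the network
print(local, remote)
```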

Fig: Distributed database (client and server machines connected by a communication network)

Advantages

*Efficiency of local processing
*Data sharing

Disadvantages

*Overhead may be quite high
*Technical difficulties

6.Storage structures and their purposes

Data is maintained for future reference, so it has to be stored, and various techniques, such as sequential access and direct access, exist for storing and accessing it. Once the data is stored at the internal level (physical storage), it is accessed through DML operations expressed in terms of external records. The DBMS converts these into operations on stored records, which must be converted in turn into operations at the actual hardware level, that is, operations on physical records or blocks. The component responsible for this internal/physical conversion is called an access method. The access method consists of a set of routines whose function is to conceal all device-dependent details from the DBMS and to present the DBMS with a stored record interface.

Fig: The stored record interface (the USER and the DBMS meet at the user interface, which deals in external record occurrences; the DBMS and the access method meet at the stored record interface, which deals in stored record occurrences; the access method and physical storage meet at the physical record interface, which deals in physical record occurrences)

The stored record interface thus corresponds to the internal level, just as the user interface corresponds to the external level. The stored record interface also allows the DBMS to view the storage structure as a collection of stored files, each consisting of all occurrences of one type of stored record. The DBMS knows
*what stored files exist
*the structure of the corresponding stored records
*the stored fields on which each file is sequenced
*the stored fields which can be used for direct access, etc.

This information is specified as part of the storage structure definition. The DBMS does not know
a)anything about physical records
b)how sequencing is performed
c)how direct access is performed

This information is specified to the access method, not to the DBMS.

Also, when a new stored record occurrence is first created and entered into the database, the access method is responsible for assigning it a unique stored record address (SRA). This value distinguishes that occurrence from all other stored record occurrences in the database. The SRA for a particular occurrence is returned to the DBMS by the access method when the occurrence is first created, and may be used by the DBMS for subsequent direct access to the occurrence concerned. The SRA for a given occurrence does not change until the occurrence is physically moved as part of a database reorganization.
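The role of the SRA can be sketched with a toy access method in Python. The class name AccessMethod, its methods and the use of a counter for addresses are assumptions for illustration; a real access method would map SRAs to physical records or blocks on a device.

```python
class AccessMethod:
    """Toy access method: conceals placement details and hands out SRAs."""
    def __init__(self):
        self._storage = {}   # stands in for physical records or blocks
        self._next_sra = 0

    def create(self, stored_record):
        """Store a new occurrence and return its unique stored record address."""
        sra = self._next_sra
        self._next_sra += 1
        self._storage[sra] = stored_record
        return sra

    def fetch(self, sra):
        """Direct access to one stored record occurrence by its SRA."""
        return self._storage[sra]

access_method = AccessMethod()

# The DBMS side keeps the SRAs returned at creation time for later direct access.
sra_s1 = access_method.create(("S1", "Smith", 20, "London"))
sra_s2 = access_method.create(("S2", "Jones", 10, "Paris"))
print(sra_s1, access_method.fetch(sra_s1))
```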

7.How is data stored in physical storage?

There are various possible representations of data in storage, and some of them are explained here. Consider the following example.

S#   Sname   Status   City
S1   Smith   20       London
S2   Jones   10       Paris
S3   Blake   30       Paris
S4   Clark   20       London
S5   Adams   30       Athens

The table holds information about five suppliers. For each supplier a supplier number, a supplier name, a status value and a location are recorded. The supplier number is unique for each supplier (it is the primary key), and the records are sequenced on it. This is the simplest form of data representation, containing only five record occurrences, each with a unique supplier number.

If there were 10000 suppliers rather than five, located in only 10 different cities, storage would be wasted repeating the 10 city names across 10000 supplier records. Instead, the city attribute can be separated into a file of its own, with a pointer from each supplier record to the corresponding city record.

The following is another form of data representation.

Supplier file

S#   Sname   Status   City-ptr
S1   Smith   20       -> London
S2   Jones   10       -> Paris
S3   Blake   30       -> Paris
S4   Clark   20       -> London
S5   Adams   30       -> Athens

City file

City
Athens
London
Paris

In the above figure the pointers run from the supplier file to the city file, and they are SRAs (stored record addresses). The advantage of this form of representation over the previous one is that it saves storage space.
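The pointer representation can be sketched in Python, with list positions standing in for SRAs. That substitution is an assumption made to keep the sketch short.

```python
suppliers = [
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]

# City file: each distinct city is stored once.
city_file = sorted({record[3] for record in suppliers})

# Pointer for each city: its position in the city file (stand-in for an SRA).
city_ptr = {city: position for position, city in enumerate(city_file)}

# Supplier file: the city name is replaced by a pointer into the city file.
supplier_file = [
    (s_no, name, status, city_ptr[city]) for s_no, name, status, city in suppliers
]

def city_of(supplier_record):
    """Follow the pointer from a supplier record to its city record."""
    return city_file[supplier_record[3]]

print(city_file)
print(supplier_file[0], city_of(supplier_file[0]))
```

With 10000 suppliers in 10 cities, each city name would be stored once in the city file instead of about a thousand times in the supplier file.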

The third form of data representation is indexing. If a file is indexed on one of its attributes (usually one that is frequently used in queries), access to the file by that attribute becomes much easier. The representation can be

Supplier file

S#   Sname   Status
S1   Smith   20
S2   Jones   10
S3   Blake   30
S4   Clark   20
S5   Adams   30

City index (the supplier file is indexed on city)

City     Supplier ptr
Athens   -> S5
London   -> S1, S4
Paris    -> S2, S3

For example, when the query "Find all suppliers in a given city" is issued, the result is retrieved quite easily if the data is represented as above, that is, in indexed form.

The purpose of indexing is to provide an access path to the file. An index is a file in which each entry (record) consists of a data value together with one or more pointers. The data value is a value for some field of the indexed file, and the pointers identify records in the indexed file having that value for that field. An index can be used in two ways: first, for sequential access to the indexed file, that is, access in the sequence defined by the values of the indexed field; and second, for direct access to individual records in the indexed file on the basis of a given value for that same field.

Another form of data representation is multilist organisation.
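The two uses of an index can be sketched in Python. The function build_index is hypothetical, and the supplier numbers are used as stand-ins for the stored record addresses that a real index would hold as pointers. The stored records here keep the city value so that the index can be built from them.

```python
from collections import defaultdict

# Indexed file: key = stand-in for the SRA, value = the stored record.
supplier_file = {
    "S1": ("Smith", 20, "London"),
    "S2": ("Jones", 10, "Paris"),
    "S3": ("Blake", 30, "Paris"),
    "S4": ("Clark", 20, "London"),
    "S5": ("Adams", 30, "Athens"),
}

def build_index(stored_file, field_position):
    """Each index entry pairs a data value with pointers to records having that value."""
    index = defaultdict(list)
    for sra, record in stored_file.items():
        index[record[field_position]].append(sra)
    return dict(sorted(index.items()))  # entries kept in data-value order

city_index = build_index(supplier_file, 2)

# Direct access: "find all suppliers in a given city".
london_suppliers = city_index["London"]

# Sequential access: visit the suppliers in city order.
in_city_order = [sra for pointers in city_index.values() for sra in pointers]

print(city_index)
print(london_suppliers, in_city_order)
```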

8.DATA STRUCTURES AND CORRESPONDING OPERATORS

The range of data structures supported at the user level is a factor that critically affects many components of the system. It dictates the design of the corresponding data manipulation languages, since each DML operation must be defined in terms of its effect on those data structures. Database systems may be categorized according to the approach they adopt; the best-known approaches are

*Relational approach
*Hierarchical approach
*Network approach

The relational approach

The relational approach uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns and each column has a unique name.

Sample relational database

Bank customer

customer-name   social-security-no.   customer-street   customer-city   account-no.
Johnson         192-83-7465           Alma              Palo Alto       A-101
Smith           019-28-3746           North             Rye             A-215
Hayes           677-28-9011           Main              Harrison        A-102
Turner          182-73-6091           Putnam            Stamford        A-305
Johnson         192-83-7465           Alma              Palo Alto       A-201
Jones           321-12-3123           Main              Harrison        A-217
Lindsay         336-66-9999           Park              Pittsfield      A-222
Smith           019-28-3746           North             Rye             A-201

Accounts

account-no   balance
A-101        500
A-215        700
A-102        400
A-305        350
A-201        900
A-217        750
A-222        700

For example, customer Johnson, whose social-security-no. is 192-83-7465, lives on Alma in Palo Alto and has two accounts: A-101 with a balance of 500 and A-201 with a balance of 900. Also, Smith and Johnson share account A-201.
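The two statements above can be checked against the sample data with sqlite3. Only the rows for Johnson and Smith are loaded, and the underscore column names are an adaptation of the hyphenated names in the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_name TEXT, social_security_no TEXT,
                       customer_street TEXT, customer_city TEXT, account_no TEXT);
CREATE TABLE account (account_no TEXT PRIMARY KEY, balance INTEGER);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?, ?)", [
    ("Johnson", "192-83-7465", "Alma", "Palo Alto", "A-101"),
    ("Smith", "019-28-3746", "North", "Rye", "A-215"),
    ("Johnson", "192-83-7465", "Alma", "Palo Alto", "A-201"),
    ("Smith", "019-28-3746", "North", "Rye", "A-201"),
])
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A-101", 500), ("A-215", 700), ("A-201", 900)])

# Johnson's accounts with balances: the relationship is carried by account_no.
johnson_accounts = conn.execute("""
    SELECT c.account_no, a.balance
    FROM customer AS c JOIN account AS a ON a.account_no = c.account_no
    WHERE c.social_security_no = ?
    ORDER BY c.account_no
""", ("192-83-7465",)).fetchall()

# Customers who share account A-201.
sharers = [row[0] for row in conn.execute(
    "SELECT customer_name FROM customer WHERE account_no = 'A-201' ORDER BY customer_name")]

print(johnson_accounts, sharers)
```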

Network model

Data in the network model are represented by collections of records, and relationships among data are represented by links, which can be viewed as pointers.

Sample network database

Fig: Sample network database (customer records such as "Johnson 192-83-7465 Alma Palo Alto" and "Smith 019-28-3746 North Rye" linked to account records such as "A-101 500" and "A-215 700")

Hierarchical Model

The hierarchical model is similar to the network model in that data and relationships among data are represented by records and links, respectively. It differs from the network model in that the records are organized as collections of trees rather than arbitrary graphs.


9.Advantages of using DBMS

Many enterprises choose to store their operational data in an integrated database because it gives the enterprise centralized control of its operational data, which is one of its most valuable assets.

The DBA has central responsibility for the operational data. The advantages of storing data under centralized control are as follows.

1.Redundancy can be reduced
In a non-database system each application has its own private files, which may cause redundancy in stored data. Integration avoids this.

2.Inconsistency can be avoided (to some extent)
Suppose the fact that employee E3 works in department D8 is represented by two distinct entries in the database, and the system is not aware of this duplication. If only one of the entries is updated, the two will no longer agree and the database is in an inconsistent state. If the redundancy is controlled, the system can guarantee that the database is never inconsistent as seen by the user, by ensuring that any change made to either of the two entries is automatically made to the other. This process is known as propagating updates.

3.The data can be shared
New applications can access the existing stored data.

4.Security restrictions can be applied
Users can access the database only if they have been granted permission. Permissions are granted by the DBA, which keeps the data secure.

5.Integrity can be maintained
Validation checks can be applied to help ensure that the data in the database is accurate.

10.Database Administrator

One of the main reasons for using a DBMS is to have central control of both the data and the programs that access those data. The person who has such central control over the system is called the database administrator (DBA). The functions of the DBA include the following.

Schema definition: The DBA creates the original database schema by writing a set of definitions that is translated by the DDL compiler to a set of tables that is stored permanently in the data dictionary.

Storage structure and access-method definition: The DBA creates appropriate storage structures and access methods by writing a set of definitions, which is translated by the data-storage and data-definition-language compiler.

Schema and physical-organization modification: Programmers accomplish the relatively rare modifications either to the database schema or to the description of the physical storage organization by writing a set of definitions that is used by either the DDL compiler or the data-storage and data-definition-language compiler.

Granting of authorization for data access: Granting different types of authorization allows the DBA to regulate which parts of the database various users can access.

Integrity-constraint specification: The DBA specifies constraints (conditions) that data must satisfy when it is entered into the database. For ex., the balance in an account must be at least 500.
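The minimum-balance rule can be sketched as a CHECK constraint in sqlite3. The table layout and the account number A-900 are hypothetical; the constraint is declared once in the schema and is then enforced by the DBMS for every application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integrity constraint declared in the schema: balance must be at least 500.
conn.execute("""
    CREATE TABLE account (
        account_no TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 500)
    )
""")

conn.execute("INSERT INTO account VALUES ('A-101', 500)")  # satisfies the constraint

try:
    conn.execute("INSERT INTO account VALUES ('A-900', 400)")  # violates it
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(rejected, stored_count)
```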


DATABASE MANAGEMENT SYSTEM UNIT I

Objective questions

1.Database is
a) Computer-based billing system b) Computer-based record keeping system c) Computer-based animation system

2.The software used for access to the database is
a) BASIC b) PASCAL c) DBMS

3.The end-users access the database from the terminal using
a) Query language b) English language c) C language

4.DBA stands for
a) Data Base Administrator b) Data base Access c) Data Batch Administration

5.Which of the following is not operational data
a) Product data b) Account data c) two numbers

6.The database system provides the enterprise with ___________ control of its operational data
a) Centralized b) Single c) Shared

7.The ability to modify the schema definition in one level without affecting the schema in the other level is called
a) Data dependence b) data independence c) data abstraction

8.Which of the following is not a level of database architecture
a) External b) logical c) super d) conceptual

9.Data sub language is a combination of
a) DDL and DML b) DDL and TCL c) C and C++

10.A database that is not stored in a single physical location in its entirety and spread across the network is
a) Centralized database b) Distributed database c) Shared database

11.DBMS is
a) A software that handles all access to the database b) A hardware c) An interface between end-user and computer

12.The component responsible for internal/physical conversion is called
a) Access method b) internal conversion c) a hardware

13.SRA is
a) Stored Record Array b) Stored Record Access c) Stored Record Address

14.Primary key is the key which
a) Avoids duplication of data b) supports duplication of data c) allows null values

15.The data is represented in terms of 1) Relational approach 2) hierarchical approach 3) network approach
a) 1,2 b) 1,2,3 c) none of the above

16.The representation of data in relational approach 1) Tables 2) tuples 3) relations
a) 1 b) 1,2 c) 1,2,3 d) none

17.The data represented in network approach is through
a) Records and links b) tables c) trees

18.The ___________ permits the DBMS to view the storage structure as a collection of stored files.
a) Stored record interface b) Stored record address c) Access method

19.Entity is
a) Any distinguishable real world object b) Not an object c) Incident

20.DBMS stands for
a) Data Base Management System b) Database Multimedia system c) Data Base Management Standards

Short questions

1.What are the basic components of a database system?
2.Explain the components of a database system with the simplified diagram.
3.What is operational data?
4.Explain operational data with an example.
5.Explain data independence.
6.Why are database systems adopted rather than file systems? (Or: write down the advantages of a database system.)
7.Distinguish between input, output, and operational data.
8.Explain the three levels of a database system in brief.
9.What is the role of the DBA?
10.What are the functions of a DBMS?
11.Explain distributed databases in brief.
12.Relate distributed databases to client-server architecture.
13.Explain access method, SRA, SRI.
14.Differentiate the relational, network and hierarchical approaches.
15.Explain any one form of data representation.

Elaborate questions

1.Role of the DBA, with a detailed explanation of any one function.
2.DBMS and its functions, advantages and disadvantages.
3.Database systems are widely used nowadays. Justify.
4.Explain the architecture of a database system.
5.Explain a database system with its simplified structure.
6.Explain storage structures with at least one representation.
7.Explain the various data structures used to represent data in a database system.

Course : B.Com CA

Semester : III

Subject : Data Base Management System

Unit : Two

Unit II

Syllabus

Relational approach: Relational data structure: relation, domain, attributes, keys

Relational algebra: Introduction, traditional set operations, attribute names for derived relations, special relational operations.

Books for Reference:

Database System Concepts - Abraham Silberschatz, Henry F. Korth, S. Sudarshan

An Introduction to Database Systems - C. J. Date

Principles of Database Systems - Jeffrey D. Ullman

An Introduction to Database Systems - Bipin C. Desai

Relational Approach

Introduction:

The relational model has established itself as the primary data model for commercial data-processing applications. The first database systems were based on either the network model or the hierarchical model. The relational model is now being used in numerous applications outside the domain of traditional data processing.

Structure of relational databases.

A relational database consists of a collection of tables, each of which is assigned a unique name. A row in a table represents a relationship among a set of values. The rows are termed as tuples and columns are termed as attributes. Since a table is a collection of such relationships, there is a close correspondence between the concept of table and the mathematical concept relation, from which the relational data model takes its name.

The following account table, or relation, has three column headers: branch-name, account-number and balance. These are the attributes (columns are referred to as attributes). For each attribute there is a set of permitted values, called the domain of that attribute. For the attribute branch-name, for example, the domain is the set of all branch names.

The account relation

Branch-name    Account-number    Balance
Downtown       A-101             500
Mianus         A-215             700
Perryridge     A-102             400
Round Hill     A-305             350
Brighton       A-201             900
Redwood        A-222             700
Brighton       A-217             750

Let D1 denote the set of all branch names, D2 the set of all account numbers, and D3 the set of all balances. Any row of account must consist of a 3-tuple (v1, v2, v3), where v1 is a branch name (v1 ∈ D1), v2 is an account number (v2 ∈ D2) and v3 is a balance (v3 ∈ D3). The account relation will contain only a subset of the set of all possible rows, that is, account is a subset of D1 × D2 × D3. In general, a table of n attributes must be a subset of D1 × D2 × … × Dn-1 × Dn.

The relation is therefore a subset of a Cartesian product of a list of domains. Because tables are relations, the mathematical terms relation and tuple are used in place of the terms table and row respectively. The account relation above has seven tuples. Let the tuple variable t refer to the first tuple of the relation. We use the notation t[branch-name] to denote the value of t on the branch-name attribute. Thus, t[branch-name] = "Downtown" and t[balance] = 500. Since a relation is a set of tuples, we use the mathematical notation t ∈ r to denote that tuple t is in relation r.
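The idea that a relation is a subset of a Cartesian product of domains can be checked with a small Python sketch. The three domains below are deliberately tiny and are assumptions made for illustration; they are not the full domains of the banking example.

```python
from itertools import product

# Tiny illustrative domains (assumed; real domains are far larger).
D1 = {"Downtown", "Mianus", "Perryridge"}   # branch names
D2 = {"A-101", "A-215", "A-102"}            # account numbers
D3 = {400, 500, 700}                        # balances

# The Cartesian product D1 x D2 x D3: every possible 3-tuple.
all_possible_rows = set(product(D1, D2, D3))

# The account relation holds only some of those rows.
account = {
    ("Downtown", "A-101", 500),
    ("Mianus", "A-215", 700),
    ("Perryridge", "A-102", 400),
}

# A relation must be a subset of the Cartesian product of its domains.
is_relation = account <= all_possible_rows
```

With these sample domains the product contains 27 possible rows, of which the relation holds only 3.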

Domain: A domain is a pool of values. A domain is atomic if its elements are considered to be indivisible units. For example, the set of integers is an atomic domain, but the set of all sets of integers is a nonatomic domain. The distinction is that we do not normally consider integers to have subparts, but we consider sets of integers to have subparts, namely the integers comprising the set.

It is possible for several attributes to have the same domain. For example, suppose that we have a relation customer that has the three attributes customer-name, customer-street and customer-city, and a relation employee that includes the attribute employee-name. It is possible that the attributes customer-name and employee-name will have the same domain: the set of all person names. The domains of balance and branch-name are certainly distinct. It is perhaps less clear whether customer-name and branch-name should have the same domain. At the physical level, both customer names and branch names are character strings. However, at the logical level, we may want customer-name and branch-name to have distinct domains.

The customer relation

Customer-name    Customer-street    Customer-city
Jones            Main               Harrison
Smith            North              Rye
Hayes            Main               Harrison
Curry            North              Rye
Lindsay          Park               Pittsfield
Turner           Putnam             Stamford
Williams         Nassau             Princeton
Adams            Spring             Pittsfield
Johnson          Alma               Palo Alto
Glenn            Sand Hill          Woodside
Brooks           Senator            Brooklyn
Green            Walnut             Stamford

Relation:

Definition of relation (mathematical): Given a collection of sets D1, D2, …, Dn (not necessarily distinct), R is a relation on those n sets if it is a set of ordered n-tuples <d1, d2, …, dn> such that d1 belongs to D1, d2 belongs to D2, …, dn belongs to Dn. The sets D1, D2, …, Dn are the domains of R. The value of n is the degree of R.

The concept of a relation corresponds to the programming-language notion of a variable. The concept of a relation schema corresponds to the programming-language notion of a type definition. It is convenient to give a name to a relation schema, just as we give names to type definitions in programming languages. We adopt the convention of using lowercase names for relations, and names beginning with an uppercase letter for relation schemas. For example,

Account-schema=(branch-name, account-number, balance)

Relations can be explained diagrammatically with the help of E-R diagrams. Before discussing E-R diagrams, the common terms used in the diagrams are analysed.

Entity: This is a thing or object in the real world that is distinguishable from all other objects. For example, each person in an enterprise is an entity. An entity has a set of properties, and the values for some set of properties may uniquely identify an entity. For example, the social-security number 677-89-9011 (employee number 1111) uniquely identifies one particular person in the enterprise.

Entity Set: An entity set is a set of entities of the same type that share the same properties or attributes. The set of all persons who are customers at a given bank, for example, can be defined as the entity set customer.

Attributes: An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. Possible attributes of the customer entity set are customer-number, customer-street, and customer-city. The following attribute types, as used in the E-R model, can characterize an attribute.

Simple and Composite attributes : Attributes that cannot be divided into subparts are simple attributes; attributes that can be divided into subparts are composite attributes. For example, name is a composite attribute, which is a combination of first-name, middle-name, and last-name.

Single-valued and Multivalued attributes : The attributes that we have specified in our examples all have a single value for a particular entity. For instance, the loan-number attribute for a specific loan entity refers to only one loan number. Such attributes are said to be single valued. There may be instances where an attribute has a set of values for a specific entity; such attributes are said to be multivalued.

Null attributes : A null value is used when an entity does not have a value for an attribute.

Derived attribute: The value for this type of attribute can be derived from the values of other related attributes or entities. For instance, let us say that the customer entity set has an attribute loans-held, which represents how many loans a customer has from the bank. We can derive the value for this attribute by counting the number of loan entities associated with that customer.

Relationship sets

Consider the relation loan.

Branch-name    Loan-number    Amount
Downtown       L-17           1000
Redwood        L-23           2000
Perryridge     L-15           1500
Downtown       L-14           1500
Mianus         L-93           500
Round Hill     L-11           900
Perryridge     L-16           1300

A relationship is an association among several entities. For example, we can define a relationship that associates customer Hayes with loan number L-15. This relationship specifies that Hayes is a customer with loan number L-15.

A relationship set is a set of relationships of the same type. Formally, it is a mathematical relation on n ≥ 2 (possibly nondistinct) entity sets. If E1, E2, …, En are entity sets, then a relationship set R is a subset of

{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}

where (e1, e2, …, en) is a relationship.

Consider the two entity sets customer and loan. We can define the relationship set borrower to denote the association between customers and the bank loans that the customers have. As another example, consider the two entity sets loan and branch. We can define the relationship set loan-branch to denote the association between a bank loan and the branch in which that loan is maintained.


Each row of the table represents one n-tuple of the relation. The number of tuples in the relation is called the cardinality of the relation. Eg. The cardinality of the relation loan is 7.

The relations may be unary, binary, ternary, n-ary etc.

Unary: Relations of degree one are unary.

For example, the query "Find the branch name that issued the loan with number L-17" produces a unary relation. The output will be

Branch-name
Downtown

Binary: Relations of degree two are binary.

For example, the query "Find the branch-name and amount for loan-number L-17 from the loan relation" produces a binary relation. The output will be

Branch-name    Amount
Downtown       1000

Ternary: Relations of degree three are ternary

N-ary: Relations of degree n are n-ary.

Mapping cardinalities: Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can be associated via a relationship set. Mapping cardinalities are most useful in describing binary relationship sets, although occasionally they contribute to the description of relationship sets that involve more than two entity sets. For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:

One to one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.

One to Many: An entity in A is associated with any number of entities in B. An entity in B, however, can be associated with at most one entity in A.

Many to one: An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number of entities in A.


Many to Many: An entity in A is associated with any number of entities in B, and an entity in B is associated with any number of entities in A.
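As an illustration, the following Python sketch classifies one instance of a binary relationship set, given as a set of (a, b) pairs. The function name mapping_cardinality and the loan_branch sample pairs are assumptions made for this example. A mapping cardinality is really a constraint on all legal instances, so the function can only report the most restrictive category that the given instance happens to satisfy.

```python
from collections import Counter

def mapping_cardinality(pairs):
    """Return the most restrictive category that a set of (a, b) pairs fits."""
    a_counts = Counter(a for a, _ in pairs)  # number of B entities linked to each a
    b_counts = Counter(b for _, b in pairs)  # number of A entities linked to each b
    a_has_many = any(n > 1 for n in a_counts.values())
    b_has_many = any(n > 1 for n in b_counts.values())
    if not a_has_many and not b_has_many:
        return "one to one"
    if a_has_many and not b_has_many:
        return "one to many"
    if not a_has_many and b_has_many:
        return "many to one"
    return "many to many"

# borrower: (customer-name, loan-number) pairs from the borrower relation.
borrower = {
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"), ("Jackson", "L-14"),
    ("Curry", "L-93"), ("Smith", "L-11"), ("Williams", "L-17"), ("Adams", "L-16"),
}

# loan-branch: (loan-number, branch-name) pairs (illustrative subset).
loan_branch = {("L-17", "Downtown"), ("L-14", "Downtown"), ("L-23", "Redwood")}

borrower_kind = mapping_cardinality(borrower)
loan_branch_kind = mapping_cardinality(loan_branch)
```

Smith holds two loans and loan L-17 is shared by two customers, so this borrower instance is many to many, while each loan in loan_branch belongs to exactly one branch.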

Keys:

In a relation there is usually an attribute, or a set of attributes, whose values are unique within the relation and can therefore be used to identify the tuples of that relation.

For example, in the loan relation above, loan-number can be considered a key: its value is unique and distinguishes each tuple from all other tuples in that relation. Before discussing the various keys, let us have a glance at integrity constraints.

Integrity constraints:

An integrity constraint is a mechanism used by Oracle to prevent invalid data entry into a table. It is nothing but a rule enforced on a column of a table. The following are the various types of integrity constraints:

*Domain integrity constraints

Maintain values according to a specification such as the ‘not null’ condition, so that the user has to enter a value for the column on which it is specified. ‘Not null’ and ‘Check’ constraints fall under this category.

*Entity integrity constraint

Maintains uniqueness in a record.

*Referential integrity constraint

Enforces relationship between tables

To establish a ‘parent-child’ or a ‘master-detail’ relationship between two tables having a common column we make use of referential integrity constraints. To implement this we should define the column in the parent table as a primary key and the same column in the child table as a foreign key referring to the corresponding parent entry. A constraint can be defined either at the table level or at the column level. If it is defined at the table level, it can be enforced on any number of columns in the table. On the other hand, if it is defined at the column level, it holds good only for the column for which it is defined.

Various keys related to relational approaches are


Primary Key: Primary key is a set of one or more attributes that, taken collectively allows us to identify uniquely an entity in the entity-set.

Ex. 1) Loan-number in the loan relation 2) Also the combination of branch-name and loan-number

Candidate Key: A candidate key is a minimal set of attributes that uniquely identifies a tuple. Several distinct sets of attributes could serve as candidate keys; one of them is chosen as the primary key.

Referenced key: It is a unique or a primary key, which is defined on a column belonging to the parent table.

Foreign Key: A column or combination of columns included in the definition of referential integrity, which would refer to a referenced key.

Child table: This table depends upon the values present in the referenced key of the parent table, which is referred to by a foreign key.

Parent table: This table determines whether insertion or update of data can be done in the child table. This table would be referred to by the child table’s foreign key.

On delete cascade clause

If a row under the referenced key column in a parent table is deleted, then all rows in the child table whose foreign key depends on that row will also be deleted automatically.
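The three kinds of integrity constraints and the on delete cascade clause can be tried out with a minimal sketch. It uses SQLite through Python's standard sqlite3 module as a stand-in for Oracle, which is an assumption of this example; the table layout and the sample rows are likewise illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: branch_name is the referenced (primary) key.
conn.execute("CREATE TABLE branch (branch_name TEXT PRIMARY KEY, branch_city TEXT)")

# Child table: NOT NULL and CHECK are domain constraints, PRIMARY KEY is the
# entity constraint, REFERENCES ... ON DELETE CASCADE is the referential one.
conn.execute("""
    CREATE TABLE loan (
        loan_number TEXT PRIMARY KEY,
        branch_name TEXT NOT NULL
            REFERENCES branch(branch_name) ON DELETE CASCADE,
        amount INTEGER CHECK (amount > 0)
    )
""")

conn.execute("INSERT INTO branch VALUES ('Downtown', 'Brooklyn')")
conn.execute("INSERT INTO branch VALUES ('Redwood', 'Palo Alto')")
conn.executemany(
    "INSERT INTO loan VALUES (?, ?, ?)",
    [("L-17", "Downtown", 1000), ("L-14", "Downtown", 1500), ("L-23", "Redwood", 2000)],
)

# A child row that refers to a missing parent row is rejected.
try:
    conn.execute("INSERT INTO loan VALUES ('L-99', 'Nowhere', 500)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the parent row also deletes its dependent child rows.
conn.execute("DELETE FROM branch WHERE branch_name = 'Downtown'")
remaining_loans = [
    row[0] for row in conn.execute("SELECT loan_number FROM loan ORDER BY loan_number")
]
```

After the Downtown branch is deleted, only the Redwood loan L-23 remains in the child table.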

Entity-Relationship Diagrams:

An E-R diagram can express the overall logical structure of a database graphically. Such a diagram consists of the following major components:

The symbol used to represent an entity set is a rectangle.

The symbol used to represent an attribute is an ellipse.

The symbol used to represent links (attributes to entity sets, and entity sets to relationship sets) is a line.

The symbol used to represent a relationship set is a diamond.

The symbol used to represent multivalued attributes is a double ellipse.

The symbol used to represent derived attributes is a dashed ellipse.

The symbol used to represent the total participation of an entity set in a relationship set is a double line.

E-R diagram for a Banking-Enterprise

[Figure: E-R diagram for a banking enterprise. Entity sets and their attributes: account (account-number, balance), branch (branch-name, branch-city, assets), customer (customer-name, customer-street, customer-city), loan (loan-number, amount). Relationship sets: account-branch, depositor, borrower, loan-branch.]

Various relations used for the discussion of this chapter are

1.Account relation

Branch-name    Account-number    Balance
Downtown       A-101             500
Mianus         A-215             700
Perryridge     A-102             400
Round Hill     A-305             350
Brighton       A-201             900
Redwood        A-222             700
Brighton       A-217             750

2.Loan relation

Branch-name    Loan-number    Amount
Downtown       L-17           1000
Redwood        L-23           2000
Perryridge     L-15           1500
Downtown       L-14           1500
Mianus         L-93           500
Round Hill     L-11           900
Perryridge     L-16           1300

3.Branch relation

Branch-name    Branch-city    Assets
Downtown       Brooklyn       9000000
Redwood        Palo Alto      2100000
Perryridge     Horseneck      1200000
Mianus         Horseneck      400000
Round Hill     Horseneck      8000000
Pownal         Bennington     300000
North Town     Rye            3700000
Brighton       Brooklyn       7100000

4.Customer relation

Customer-name    Customer-street    Customer-city
Jones            Main               Harrison
Smith            North              Rye
Hayes            Main               Harrison
Curry            North              Rye
Lindsay          Park               Pittsfield
Turner           Putnam             Stamford
Williams         Nassau             Princeton
Adams            Spring             Pittsfield
Johnson          Alma               Palo Alto
Glenn            Sand Hill          Woodside
Brooks           Senator            Brooklyn
Green            Walnut             Stamford

5.Depositor relation

Customer-name    Account-number
Johnson          A-101
Smith            A-215
Hayes            A-102
Turner           A-305
Johnson          A-201
Jones            A-217
Lindsay          A-222

6.Borrower relation

Customer-name    Loan-number
Jones            L-17
Smith            L-23
Hayes            L-15
Jackson          L-14
Curry            L-93
Smith            L-11
Williams         L-17
Adams            L-16

Relational Algebra

Note: Query languages

A query language is a language in which a user requests information from the database. These languages are typically of a level higher than that of a standard programming language. Query languages can be categorized as being either procedural or non-procedural. In a procedural language, the user instructs the system to perform a sequence of operations on the database to compute the desired result. In a non-procedural language, the user describes the information desired without giving a specific procedure for obtaining that information.


Introduction

Relational algebra is a collection of operations on relations. It is a procedural query language: it consists of a set of operations that take one or two relations as input and produce a new relation as their result.

The fundamental operations or traditional set operations available with relational algebra are select, project, union, set difference, Cartesian product, and rename. In addition to the fundamental operations, there are several other operations, namely set intersection, natural join, division, and assignment. These operations are defined in terms of the fundamental operations. The selection, projection, join and division operations can also be described as special relational operators.

Fundamental operations

The select, project and rename operations are called unary operations, because they operate on one relation. The other three operations, union, set difference and Cartesian product, operate on pairs of relations and are therefore called binary operations.

The select operation

The select operation selects tuples that satisfy a given predicate. The lowercase Greek letter sigma (σ) is used to denote selection. The predicate appears as a subscript to σ. The argument relation is given in parentheses following the σ.

Example: 1.Select those tuples of the loan relation where the branch is "Perryridge".

σ branch-name="Perryridge" (loan)

The result of the query is

Branch-name    Loan-number    Amount
Perryridge     L-15           1500
Perryridge     L-16           1300

2.Find all tuples in which the amount lent is more than $1200.

σ amount>1200 (loan)

Comparisons using =, ≠, <, ≤, >, ≥ are allowed in the selection predicate. We can also combine several predicates into a larger predicate using the connectives and (∧) and or (∨).

3.Find those tuples pertaining to loans of more than $1200 made by the Perryridge branch.

σ branch-name="Perryridge" ∧ amount>1200 (loan)
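The three selection examples can be sketched in Python by treating the loan relation as a list of rows and the predicate as a function on a row. The helper name select and the dictionary-based row layout are assumptions made for this illustration.

```python
# The loan relation: one dictionary per tuple.
loan = [
    {"branch_name": "Downtown", "loan_number": "L-17", "amount": 1000},
    {"branch_name": "Redwood", "loan_number": "L-23", "amount": 2000},
    {"branch_name": "Perryridge", "loan_number": "L-15", "amount": 1500},
    {"branch_name": "Downtown", "loan_number": "L-14", "amount": 1500},
    {"branch_name": "Mianus", "loan_number": "L-93", "amount": 500},
    {"branch_name": "Round Hill", "loan_number": "L-11", "amount": 900},
    {"branch_name": "Perryridge", "loan_number": "L-16", "amount": 1300},
]

def select(relation, predicate):
    """Selection: keep exactly the tuples for which predicate(t) is true."""
    return [t for t in relation if predicate(t)]

# 1. Loans made at the Perryridge branch.
perryridge_loans = select(loan, lambda t: t["branch_name"] == "Perryridge")

# 2. Loans of more than $1200.
large_loans = select(loan, lambda t: t["amount"] > 1200)

# 3. Both conditions combined with "and".
large_perryridge_loans = select(
    loan, lambda t: t["branch_name"] == "Perryridge" and t["amount"] > 1200
)
```

Because both Perryridge loans exceed $1200, the first and third queries return the same two tuples.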

The project operation

Suppose we want to list all loan numbers and the amount of the loans, but do not care about the branch name. The project operation allows us to produce this relation. The project operation is a unary operation that returns its argument relation, with certain attributes left out. Since a relation is a set, any duplicate rows are eliminated. Projection is denoted by the Greek letter pi (π). We list those attributes that we wish to appear in the result as a subscript to π. The argument relation follows in parentheses.

Example: 1.List all loan numbers and the amount of the loan. The corresponding query is

π loan-number, amount (loan)

The relation that results from this query is

Loan-number    Amount
L-17           1000
L-23           2000
L-15           1500
L-14           1500
L-93           500
L-11           900
L-16           1300
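A short Python sketch shows how projection both drops attributes and eliminates duplicate rows. The helper name project is an assumption; the result is built as a Python set so that duplicates disappear automatically.

```python
loan = [
    {"branch_name": "Downtown", "loan_number": "L-17", "amount": 1000},
    {"branch_name": "Redwood", "loan_number": "L-23", "amount": 2000},
    {"branch_name": "Perryridge", "loan_number": "L-15", "amount": 1500},
    {"branch_name": "Downtown", "loan_number": "L-14", "amount": 1500},
    {"branch_name": "Mianus", "loan_number": "L-93", "amount": 500},
    {"branch_name": "Round Hill", "loan_number": "L-11", "amount": 900},
    {"branch_name": "Perryridge", "loan_number": "L-16", "amount": 1300},
]

def project(relation, attributes):
    """Projection: keep only the named attributes; the set removes duplicates."""
    return {tuple(t[a] for a in attributes) for t in relation}

loan_numbers_and_amounts = project(loan, ["loan_number", "amount"])
branch_names = project(loan, ["branch_name"])
```

Projecting on loan_number and amount keeps all seven rows, whereas projecting on branch_name alone leaves five rows because Downtown and Perryridge each occur twice in loan.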

The set difference operation

The set-difference operation, denoted by -, allows us to find tuples that are in one relation but are not in another. The expression r – s results in a relation containing those tuples in r but not in s.

Example: 1.Find all customers of the bank who have an account but not a loan.

π customer-name (depositor) - π customer-name (borrower)

The result will be

Customer-name
Johnson
Turner
Lindsay

For a set difference operation r-s to be valid, we require that the relations r and s be of the same arity, and that the domains of the ith attribute of r and the ith attribute of s be the same.
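The example and the arity requirement can be sketched with Python sets of tuples. The helper name difference is an assumption, and the check covers only the arity condition; the domains are not checked here.

```python
def difference(r, s):
    """r - s for relations stored as sets of tuples of equal length."""
    if r and s and len(next(iter(r))) != len(next(iter(s))):
        raise ValueError("r and s must have the same arity")
    return r - s

# Customer names projected from depositor and from borrower (1-tuples).
depositor_names = {("Johnson",), ("Smith",), ("Hayes",), ("Turner",), ("Jones",), ("Lindsay",)}
borrower_names = {
    ("Jones",), ("Smith",), ("Hayes",), ("Jackson",),
    ("Curry",), ("Williams",), ("Adams",),
}

account_but_no_loan = difference(depositor_names, borrower_names)
```

The result holds the three customers Johnson, Turner and Lindsay, matching the table above.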


The Cartesian-product operation

The Cartesian-product operation, denoted by a cross (×), allows us to combine information from any two relations. We write the Cartesian product of relations r1 and r2 as r1 × r2. Since the same attribute name may appear in both r1 and r2, we need to devise a naming scheme to distinguish between these attributes. We do so here by attaching to an attribute the name of the relation from which the attribute originally came. For example, the relation schema for r = borrower × loan is

(borrower.customer-name, borrower.loan-number, loan.branch-name, loan.loan-number, loan.amount)

So now we can distinguish borrower.loan-number from loan.loan-number. For those attributes that appear in only one of the two schemas, we shall usually drop the relation-name prefix. We can then write the relation schema for r as

(customer-name, borrower.loan-number, branch-name, loan.loan-number, amount)

This naming convention requires that the relations that are arguments of the Cartesian-product operation have distinct names.

Assume that we have n1 tuples in borrower and n2 tuples in loan. Then there are n1 * n2 ways of choosing a pair of tuples, one tuple from each relation, so there are n1 * n2 tuples in r. In particular, note that for some tuples t in r, it may be that t[borrower.loan-number] is not equal to t[loan.loan-number]. In general, if we have relations r1(R1) and r2(R2), then r1 × r2 is a relation whose schema is the concatenation of R1 and R2. Relation r contains all tuples t for which there is a tuple t1 in r1 and a tuple t2 in r2 for which t[R1] = t1[R1] and t[R2] = t2[R2].

For example

1.Suppose we want to find the names of all customers who have a loan at the Perryridge branch. We need the information in both the loan relation and the borrower relation to do so. The Cartesian product borrower × loan pairs every borrower tuple with every loan tuple:

Customer-name    Borrower.loan-number    Branch-name    Loan.loan-number    Amount
Jones            L-17                    Downtown       L-17                1000
Jones            L-17                    Redwood        L-23                2000
…                …                       …              …                   …
Adams            L-16                    Round Hill     L-11                900
Adams            L-16                    Perryridge     L-16                1300

Table: Result of borrower × loan

If we write

σ branch-name="Perryridge" (borrower × loan)

the output of the query will be as follows.

Customer-name    Borrower.loan-number    Branch-name    Loan.loan-number    Amount
Jones            L-17                    Perryridge     L-15                1500
Jones            L-17                    Perryridge     L-16                1300
Smith            L-23                    Perryridge     L-15                1500
Smith            L-23                    Perryridge     L-16                1300
Hayes            L-15                    Perryridge     L-15                1500
Hayes            L-15                    Perryridge     L-16                1300
Jackson          L-14                    Perryridge     L-15                1500
Jackson          L-14                    Perryridge     L-16                1300
Curry            L-93                    Perryridge     L-15                1500
Curry            L-93                    Perryridge     L-16                1300
Smith            L-11                    Perryridge     L-15                1500
Smith            L-11                    Perryridge     L-16                1300
Williams         L-17                    Perryridge     L-15                1500
Williams         L-17                    Perryridge     L-16                1300
Adams            L-16                    Perryridge     L-15                1500
Adams            L-16                    Perryridge     L-16                1300

Table: Result of the query σ branch-name="Perryridge" (borrower × loan)

This relation pairs every borrower tuple with the loans made at the Perryridge branch, so it also lists customers who do not have a loan at the Perryridge branch. To keep only the pairs in which the borrower tuple and the loan tuple refer to the same loan, the query can be rewritten as

σ borrower.loan-number = loan.loan-number (σ branch-name="Perryridge" (borrower × loan))

In order to retrieve only the customer-name, we can apply the projection operation:

π customer-name (σ borrower.loan-number = loan.loan-number (σ branch-name="Perryridge" (borrower × loan)))

The result is as shown below

Customer-name
Hayes
Adams

Table: Result of π customer-name (σ borrower.loan-number = loan.loan-number (σ branch-name="Perryridge" (borrower × loan)))
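The whole query can be traced step by step in Python. The sketch below stores each relation as a list of plain tuples and refers to attributes by position; that layout and the variable names are assumptions made for illustration.

```python
from itertools import product

# borrower: (customer-name, loan-number)
borrower = [
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"), ("Jackson", "L-14"),
    ("Curry", "L-93"), ("Smith", "L-11"), ("Williams", "L-17"), ("Adams", "L-16"),
]

# loan: (branch-name, loan-number, amount)
loan = [
    ("Downtown", "L-17", 1000), ("Redwood", "L-23", 2000), ("Perryridge", "L-15", 1500),
    ("Downtown", "L-14", 1500), ("Mianus", "L-93", 500), ("Round Hill", "L-11", 900),
    ("Perryridge", "L-16", 1300),
]

# borrower x loan: every borrower tuple concatenated with every loan tuple.
# Positions: 0 customer-name, 1 borrower.loan-number, 2 branch-name,
#            3 loan.loan-number, 4 amount
cartesian = [b + l for b, l in product(borrower, loan)]

# Selection on branch-name = "Perryridge".
at_perryridge = [t for t in cartesian if t[2] == "Perryridge"]

# Selection on borrower.loan-number = loan.loan-number.
same_loan = [t for t in at_perryridge if t[1] == t[3]]

# Projection on customer-name.
customer_names = {t[0] for t in same_loan}
```

The product has 8 * 7 = 56 tuples, the first selection keeps 16 of them, and after the second selection and the projection only Hayes and Adams remain.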

The rename operation

Unlike relations in the database, the results of relational-algebra expressions do not have a name that we can use to refer to them. It is useful to be able to give them names; the rename operator, denoted by the lowercase Greek letter rho (ρ), lets us perform this task.

Given a relational-algebra expression E, the expression ρ x (E) returns the result of expression E under the name x.

A relation r by itself is considered to be a trivial relational-algebra expression. Thus, we can also apply the rename operation to a relation r to get the same relation under a new name.

A second form of the rename operation is as follows. Assume that a relational-algebra expression E has arity n. Then the expression ρ x(A1, A2, …, An) (E) returns the result of expression E under the name x, and with the attributes renamed to A1, A2, …, An.

For example,

1.Find the largest balance in the bank. The steps involved are

Compute first a temporary relation consisting of those balances that are not the largest.

Then take the set difference between the relation π balance (account) and the temporary relation.

The temporary relation is computed by the query

π account.balance (σ account.balance < d.balance (account × ρ d (account)))

This expression gives those balances in the account relation for which a larger balance appears somewhere in the account relation (renamed as d). The result contains all balances except the largest one. The relation is

Balance
500
700
400
350
750

The query to find the largest account balance in the bank can be written as follows:

π balance (account) - π account.balance (σ account.balance < d.balance (account × ρ d (account)))

The result of this query is

Balance
900

Fig: largest account balance in the bank
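The two steps can be followed in a small Python sketch. The variable d plays the role of the renamed copy of account; the tuple layout and variable names are assumptions made for illustration.

```python
from itertools import product

# account: (branch-name, account-number, balance)
account = [
    ("Downtown", "A-101", 500), ("Mianus", "A-215", 700), ("Perryridge", "A-102", 400),
    ("Round Hill", "A-305", 350), ("Brighton", "A-201", 900), ("Redwood", "A-222", 700),
    ("Brighton", "A-217", 750),
]

# The rename step: a second copy of account, referred to as d.
d = list(account)

# Step 1: balances for which some larger balance exists in d.
not_largest = {a[2] for a, other in product(account, d) if a[2] < other[2]}

# Step 2: all balances minus the balances that are not the largest.
all_balances = {a[2] for a in account}
largest = all_balances - not_largest
```

The temporary set not_largest contains 500, 700, 400, 350 and 750, so the set difference leaves only 900.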

2.Find the names of all customers who live on the same street and in the same city as Smith. The street and city of Smith can be obtained by writing

π customer-street, customer-city (σ customer-name="Smith" (customer))

In order to find other customers with this street and city, we must reference the customer relation a second time. In the following query, we use the rename operation on the preceding expression to give its result the name smith-addr, and to rename its attributes to street and city, instead of customer-street and customer-city:

π customer.customer-name (σ customer.customer-street = smith-addr.street ∧ customer.customer-city = smith-addr.city (customer × ρ smith-addr(street, city) (π customer-street, customer-city (σ customer-name="Smith" (customer)))))

The result of this query is as shown below

Customer-name
Smith
Curry

Additional operations or special relational operations

1.The set-intersection operation

The symbol used to denote set intersection is ∩.

Example: 1.Find all customers who have both a loan and an account. The query is

π customer-name (borrower) ∩ π customer-name (depositor)

The result will be

Customer-name
Hayes
Jones
Smith

Table: customers with both an account and a loan at the bank

The intersection operation can be expressed using the set-difference operation as r ∩ s = r - (r - s).
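The identity r ∩ s = r - (r - s) can be checked on the example data with Python sets. The function name intersection_via_difference is an assumption made for this sketch.

```python
borrower_names = {"Jones", "Smith", "Hayes", "Jackson", "Curry", "Williams", "Adams"}
depositor_names = {"Johnson", "Smith", "Hayes", "Turner", "Jones", "Lindsay"}

def intersection_via_difference(r, s):
    """Compute the intersection of r and s using only set difference."""
    return r - (r - s)

loan_and_account = intersection_via_difference(borrower_names, depositor_names)

# Python's built-in set intersection, used here only for comparison.
same_as_builtin = loan_and_account == (borrower_names & depositor_names)
```

The inner difference removes from r everything that is also in s, and the outer difference then removes exactly what was left over, so only the common elements survive.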

The Union operation


With the help of this operation we can retrieve the tuples that are present in either of two relations.

For example:

1.Find the names of all bank customers who have either an account or a loan or both. The customer relation does not contain this information, since a customer does not need to have either an account or a loan at the bank. To answer this query we need the information in the depositor relation and in the borrower relation.

*To find the names of all customers with a loan at the bank we use

π customer-name (borrower)

*To find the names of all customers with an account in the bank we use

π customer-name (depositor)

To find the customers who hold an account, a loan, or both, we take the union of these two:

π customer-name (borrower) ∪ π customer-name (depositor)

The result of this query is

Customer-name
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Adams

For a union operation r ∪ s to be valid, we require two conditions:

1.The relations r and s must be of the same arity. That is, they must have the same number of attributes.
2.The domain of the ith attribute of r and the ith attribute of s must be the same, for all i.

Here r and s can be, in general, temporary relations that are the result of relational-algebra expressions.
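The two conditions can be turned into a check that runs before the union is taken. In the sketch below a schema is a list of (attribute name, domain) pairs and each domain is modelled as a Python type; both choices, and the function name union, are assumptions made for illustration.

```python
def union(r, r_schema, s, s_schema):
    """Return r U s after checking that the two schemas are compatible."""
    # Condition 1: same arity.
    if len(r_schema) != len(s_schema):
        raise ValueError("r and s must have the same number of attributes")
    # Condition 2: the ith domains must match for every i.
    for (_, r_domain), (_, s_domain) in zip(r_schema, s_schema):
        if r_domain is not s_domain:
            raise ValueError("domains of corresponding attributes differ")
    return r | s

name_schema = [("customer_name", str)]
borrower_names = {
    ("Jones",), ("Smith",), ("Hayes",), ("Jackson",),
    ("Curry",), ("Williams",), ("Adams",),
}
depositor_names = {("Johnson",), ("Smith",), ("Hayes",), ("Turner",), ("Jones",), ("Lindsay",)}

all_customers = union(borrower_names, name_schema, depositor_names, name_schema)
```

The union holds ten distinct names, since Jones, Smith and Hayes appear in both inputs but only once in the result.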

The natural-join operation

It is often desirable to simplify certain queries that require a Cartesian product. Usually, a query that involves a Cartesian product includes a selection operation on the result of the Cartesian product.

Consider the query: Find the names of all customers who have a loan at the bank, and find the amount of the loan.

Steps:
1.Form the Cartesian product of the borrower and loan relations.
2.Select those tuples that pertain to only the same loan-number.
3.Project the customer-name, loan-number and amount.

π customer-name, loan.loan-number, amount (σ borrower.loan-number = loan.loan-number (borrower × loan))

The natural join is a binary operation that allows us to combine certain selections and a Cartesian product into one operation. It is denoted by the "join" symbol ⋈. The natural-join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relation schemas, and finally removes duplicate attributes.

For example: 1.Find the names of all customers who have a loan at the bank, and find the amount of the loan.

π customer-name, loan-number, amount (borrower ⋈ loan)

The result of the query is

Customer-name    Loan-number    Amount
Jones            L-17           1000
Smith            L-23           2000
Hayes            L-15           1500
Jackson          L-14           1500
Curry            L-93           500
Smith            L-11           900
Williams         L-17           1000
Adams            L-16           1300

2.Find the names of all branches with customers who have an account in the bank and who live in Harrison.

π branch-name (σ customer-city="Harrison" (customer ⋈ account ⋈ depositor))

The result of the query is

Branch-name
Brighton
Perryridge
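A natural join can be sketched in Python by pairing rows and keeping only the pairs that agree on every shared attribute name. The function name natural_join and the dictionary-based row layout are assumptions made for illustration.

```python
def natural_join(r, s):
    """Natural join of two relations stored as lists of dictionaries."""
    result = []
    for t1 in r:
        for t2 in s:
            common = t1.keys() & t2.keys()  # attributes present in both schemas
            if all(t1[a] == t2[a] for a in common):
                # Merging the dictionaries keeps each shared attribute once.
                result.append({**t1, **t2})
    return result

borrower = [
    {"customer_name": c, "loan_number": n}
    for c, n in [
        ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"), ("Jackson", "L-14"),
        ("Curry", "L-93"), ("Smith", "L-11"), ("Williams", "L-17"), ("Adams", "L-16"),
    ]
]
loan = [
    {"branch_name": b, "loan_number": n, "amount": a}
    for b, n, a in [
        ("Downtown", "L-17", 1000), ("Redwood", "L-23", 2000), ("Perryridge", "L-15", 1500),
        ("Downtown", "L-14", 1500), ("Mianus", "L-93", 500), ("Round Hill", "L-11", 900),
        ("Perryridge", "L-16", 1300),
    ]
]

joined = natural_join(borrower, loan)

# Projection on customer-name, loan-number, amount.
names_and_amounts = [(t["customer_name"], t["loan_number"], t["amount"]) for t in joined]
```

Each borrower tuple matches exactly one loan tuple on loan_number, so the join has eight rows, the same rows as in the first result table above.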

The division operation

The division operation, denoted by ÷, is suited to queries that include the phrase "for all".

Example: 1.Find all customers who have an account at all the branches located in Brooklyn.

Steps: 1.All branches in Brooklyn can be obtained as

r1 = π branch-name (σ branch-city="Brooklyn" (branch))

The result is

Branch-name
Brighton
Downtown

2.We can find all (customer-name, branch-name) pairs for which the customer has an account at a branch by writing

r2 = π customer-name, branch-name (depositor ⋈ account)

Customer-name    Branch-name
Johnson          Downtown
Smith            Mianus
Hayes            Perryridge
Turner           Round Hill
Williams         Perryridge
Lindsay          Redwood
Johnson          Brighton
Jones            Brighton

Table: Result of π customer-name, branch-name (depositor ⋈ account)

3.We need to find those customers who appear in r2 with every branch name in r1. We formulate the query by writing

π customer-name, branch-name (depositor ⋈ account) ÷ π branch-name (σ branch-city="Brooklyn" (branch))

In the tables above, only Johnson appears with both Brighton and Downtown, so the result contains the single tuple (Johnson).

Extended relational-algebra operations

The basic relational-algebra expressions have been extended in several ways. A simple extension is to allow arithmetic operations as part of projection. An important extension is to allow aggregate operations, such as computing the sum of the elements of a set, or their average. Another important extension is the outer-join operation, which allows relational-algebra expressions to deal with null values, which model missing information.

Generalized Projection

The generalized projection operation extends the projection operation by allowing arithmetic functions to be used in the projection list. The generalized projection has the form

π F1, F2, …, Fn (E)

where E is any relational-algebra expression, and each of F1, F2, …, Fn is an arithmetic expression involving constants and attributes in the schema of E. As a special case, the arithmetic expression may be simply an attribute or a constant. The following example demonstrates the use of the generalized projection operation. Suppose we have a relation credit-info, as shown, which lists the credit limit and the expenses so far. If we want to find how much more each person can spend, we can write the following expression:

π customer-name, limit - credit-balance (credit-info)

Customer-name    Limit    Credit-balance
Jones            6000     700
Smith            2000     400
Hayes            1500     1500
Curry            2000     1750

Table: The credit-info relation

Customer-name    Limit - credit-balance
Jones            5300
Smith            1600
Hayes            0
Curry            250

Table: The result of π customer-name, limit - credit-balance (credit-info)
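Both the generalized projection above and the division example from earlier in this unit can be sketched in a few lines of Python. The tuple layouts and the function name divide are assumptions made for illustration; divide handles only the simple case of a relation of pairs divided by a set of single values.

```python
# credit-info: (customer-name, limit, credit-balance)
credit_info = [
    ("Jones", 6000, 700),
    ("Smith", 2000, 400),
    ("Hayes", 1500, 1500),
    ("Curry", 2000, 1750),
]

# Generalized projection on customer-name and limit - credit-balance.
remaining_credit = [(name, limit - balance) for name, limit, balance in credit_info]

def divide(r, s):
    """Division: the x values that appear in r together with every y in s."""
    candidates = {x for x, _ in r}
    return {x for x in candidates if all((x, y) in r for y in s)}

# r2: (customer-name, branch-name) pairs; r1: the Brooklyn branch names.
r2 = {
    ("Johnson", "Downtown"), ("Smith", "Mianus"), ("Hayes", "Perryridge"),
    ("Turner", "Round Hill"), ("Williams", "Perryridge"), ("Lindsay", "Redwood"),
    ("Johnson", "Brighton"), ("Jones", "Brighton"),
}
r1 = {"Brighton", "Downtown"}

customers_at_all_brooklyn_branches = divide(r2, r1)
```

Jones has an account at Brighton only, so Johnson is the one customer paired with every Brooklyn branch.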

Outer join

The outer-join operation is an extension of the join operation to deal with missing information.

Aggregate functions

Aggregate functions are functions that take a collection of values and return a single value as a result. For example, the aggregate function sum takes a collection of values and returns the sum of the values.

The function sum applied on the collection <1,1,3,4,4,11>returns the value 24.


The function avg returns the average of the values. So average of the above is 4.

The function count returns the number of the elements in the collection and would return 6 on the preceding collection.

The functions min and max return the minimum and maximum values in a collection; on the preceding collection they return 1 and 11 respectively.
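The five aggregate functions can be checked on the sample collection with Python's built-in functions. A list is used rather than a set because the collection keeps duplicate values; the variable names are assumptions made for illustration.

```python
# The sample collection; duplicates are kept, so a list is used.
values = [1, 1, 3, 4, 4, 11]

total = sum(values)             # sum
count = len(values)             # count
average = total / count         # avg
smallest = min(values)          # min
largest = max(values)           # max
```

If duplicates were removed first, the sum would become 19 instead of 24, which is why aggregate functions operate on collections in which a value may occur more than once.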

Examples:

1.Find out the total sum of salaries of all part-time employees in the bank.

The query is

sum salary (pt-works)

The result of this query is a relation with a single attribute, containing a single row with a numerical value corresponding to the sum of all the salaries of all employees working part-time in the bank.

For further details of aggregate functions, refer to the texts

1.Database system concepts - Abraham Silberschatz, Henry F. Korth

2.‘An introduction to database systems’, chapter 4 - Bipin P. Desai, for the relational approach.

Short questions:

1.What is relational approach?
2.What is relational algebra?
3.Write the definition for relational algebra.
4.What are the fundamental operations of relational algebra?
5.What is an entity, relation, entity set, relationship, relationship set, attribute?
6.Briefly explain mapping cardinalities.
7.Draw the entity relationship diagram for a banking enterprise.
8.Explain the selection and projection operations with an example.
9.Explain aggregate functions in brief.
10.Explain set operations.
11.Explain binary, unary, ternary and n-ary relations.
12.What are the various symbols used in an entity relationship diagram?
13.What is a constraint?
14.Write a note on integrity rules.
15.What is a key?

Elaborate questions:

1.Write the definition for key and explain the various keys with examples.
2.Explain the structure of relational databases with an example.
3.Explain the referential integrity constraint or rule, with an example.
4.Explain all fundamental operations of relational algebra or traditional set operations with examples.
5.Write all aggregate functions and explain them in detail with examples.
6.What are extended relational operations? Explain all the available operations.

STUDY MATERIAL

Course :B.Com CA

Semester:III

Subject :Data Base Management System

Unit :Three

_______________________________________________________________________

Unit III Syllabus

Embedded SQL: Introduction - operations not involving cursors, operations involving cursors - Dynamic statements. Query by example - retrieval operations, built-in functions, update operations, QBE Dictionary. Normalization: Functional Dependency, First, Second, Third normal forms, relations with more than one candidate key, good and bad decomposition.

Books for Reference:

An introduction to database systems - C. J. Date

Database system concepts - Abraham Silberschatz, Henry F. Korth, S. Sudarshan

Principles of database systems - Jeffrey D. Ullman

Embedded SQL


SQL provides a powerful declarative query language; writing queries in SQL are typically much easier than is coding the same queries in a general-purpose programming language. To access a database from a general-purpose programming language is for the following two reasons. 1.Not all queries can be expressed in SQL, since SQL does not provide the full expressive power of a general-purpose language. That is, there exists queries that can be expressed in a language such as Pascal, C, COBOL or FORTRAN that cannot be expressed in SQL write queries, we can embed SQL within a more powerful language 2.Nondeclarative actions-such as printing a report, interacting with a user, or sending the results of a query to a graphical user interface-cannot be done from within SQL.

A language in which SQL queries are embedded is referred to as a host language, and the SQL structures permitted in the host language constitute embedded SQL.

SQL operates on whole sets of records at a time. Languages such as PL/I, however, are not well equipped to handle more than one record at a time. It is therefore necessary to provide some form of bridge between the two functional levels, and embedded SQL provides such a bridge by means of a new type of object called a cursor.

Operations not involving cursors

The DML statements that do not need cursors are as follows:

* "Singleton SELECT"
* UPDATE
* INSERT
* DELETE

Singleton SELECT

We use the term "singleton SELECT" to mean a SELECT statement for which the retrieved table contains at most one row. Example: a SELECT statement whose WHERE clause specifies a primary key value.

UPDATE

This statement changes values already stored in the database. Example: the UPDATE statement of SQL.

INSERT

This statement adds a new row of information. Example: the INSERT statement of SQL.

DELETE

This statement removes information from the database. Example: the DELETE statement of SQL.

Operations involving cursors

Consider the case of a SELECT that retrieves a whole set of records, not just one. What is needed is a mechanism for accessing the records in the set one by one, and cursors provide such a mechanism. An explicitly defined cursor is a construct that lets the programmer name an area of memory that holds a specific query, so that its result can be accessed at a later time. The programmer defines an explicit cursor in order to process a multiple-row active set one record at a time. The steps for using explicitly defined cursors within PL/SQL are as follows.

1. Declare the cursor

* Name the cursor.
* Associate a query with the cursor.

Syntax

    CURSOR cursor-name IS select-statement;

Example

    DECLARE
        CURSOR c_names IS
            SELECT branch_name FROM branch WHERE branch_city = 'Brooklyn';

2. Open the cursor

Opening the cursor activates the query and identifies the active set. OPEN also initializes the cursor pointer to just before the first row of the active set.

Syntax

    OPEN cursor-name;

3. Fetch from the cursor

Getting data out of the cursor is accomplished with the FETCH command. FETCH retrieves the rows of the active set one row at a time.

Syntax

    FETCH cursor-name INTO record-list;

4. Close the cursor

The CLOSE statement closes (deactivates) the previously opened cursor and makes the active set undefined. Oracle implicitly closes a cursor when the user's program or session terminates. After a cursor is closed, we cannot fetch from it unless it is opened again.

Syntax

    CLOSE cursor-name;

Attributes involved in cursors

%ISOPEN returns TRUE if the cursor is already open.

%FOUND returns TRUE if the last FETCH returned a row, and FALSE if the last FETCH failed to return a row.

%NOTFOUND is the logical opposite of %FOUND.

%ROWCOUNT yields the number of rows fetched so far.

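Examples to illustrate cursors

1) Testing whether a cursor is open.

    DECLARE
        CURSOR c4 IS SELECT salary, job FROM emp WHERE job = 'CLERK';
    BEGIN
        IF c4%ISOPEN THEN
            -- c4 has only been declared, so this branch is never taken
            dbms_output.put_line('This message will not be displayed');
        ELSE
            OPEN c4;
            dbms_output.put_line('Cursor was not open; it has now been opened');
        END IF;
        CLOSE c4;
    END;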

2) A procedure that updates each student's record with the total, the average and the result.

    DECLARE
        st stu%ROWTYPE;
        CURSOR c1 IS SELECT * FROM stu;
    BEGIN
        OPEN c1;
        LOOP
            FETCH c1 INTO st;
            EXIT WHEN c1%NOTFOUND;   -- leave the loop after the last row
            st.total := st.m1 + st.m2 + st.m3;
            st.average := st.total / 3;
            IF st.m1 >= 50 AND st.m2 >= 50 AND st.m3 >= 50 THEN
                st.result := 'PASS';
            ELSE
                st.result := 'FAIL';
            END IF;
            UPDATE stu
               SET total = st.total, average = st.average, result = st.result
             WHERE regno = st.regno;
        END LOOP;
        CLOSE c1;
        COMMIT;
    END;
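The declare/open/fetch/close cycle is not specific to PL/SQL. As a rough illustration only, the sketch below repeats the second example using Python's standard sqlite3 module in place of Oracle. The in-memory database, the sample marks and the variable names are assumptions made for this sketch; `fetchone()` returning `None` plays the role that %NOTFOUND plays above.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the Oracle table stu.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stu (regno INTEGER PRIMARY KEY, m1 INTEGER, m2 INTEGER, "
    "m3 INTEGER, total INTEGER, average REAL, result TEXT)"
)
conn.executemany(
    "INSERT INTO stu (regno, m1, m2, m3) VALUES (?, ?, ?, ?)",
    [(1, 60, 70, 80), (2, 45, 90, 65)],  # made-up sample marks
)

# "Declare" and "open": executing the query identifies the active set.
cur = conn.cursor()
cur.execute("SELECT regno, m1, m2, m3 FROM stu ORDER BY regno")

pending = []        # updates are collected and applied after the cursor is closed
rows_fetched = 0    # counts fetched rows, like %ROWCOUNT
while True:
    row = cur.fetchone()          # "fetch": one row of the active set at a time
    if row is None:               # no more rows, like %NOTFOUND
        break
    rows_fetched += 1
    regno, m1, m2, m3 = row
    total = m1 + m2 + m3
    result = "PASS" if min(m1, m2, m3) >= 50 else "FAIL"
    pending.append((total, total / 3, result, regno))

cur.close()                       # "close": the active set is no longer available

conn.executemany(
    "UPDATE stu SET total = ?, average = ?, result = ? WHERE regno = ?", pending
)
conn.commit()

results = conn.execute(
    "SELECT regno, total, result FROM stu ORDER BY regno"
).fetchall()
print(results)
```

The updates are deferred until the read cursor is closed; this is a design choice of the sketch, made so that the table is not modified while it is still being scanned.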

Dynamic Statements

Embedded SQL provides certain features to facilitate the writing of on-line application programs, that is, programs that support on-line access to the database from an end user at a terminal. The steps such a program goes through are:

1. Accept a command from the terminal.
2. Analyze the command.
3. Issue the appropriate SQL statements.
4. Return a message and/or results to the terminal.

The precompiler is a compiler for the SQL language. Suppose the application programmer has written a program P that includes some embedded SQL statements. Pre-compilation proceeds as follows.

1. The precompiler scans the source program P and locates the embedded SQL statements.

2. For each statement it finds, the precompiler decides on a strategy for implementing that statement in terms of RSI operations. This process is referred to as optimization.

3. The precompiler replaces each of the original embedded SQL statements by an ordinary PL/I statement.

The dynamic SQL component of SQL-92 allows programs to construct and submit SQL queries at run time. In ordinary embedded SQL, by contrast, each statement must be completely present at compile time, when it is compiled by the embedded SQL preprocessor. Using dynamic SQL, programs can create SQL queries as strings at run time (perhaps based on input from the user) and can either have them executed immediately or have them prepared for subsequent use. The two principal dynamic statements are PREPARE and EXECUTE.

    DCL SQLSOURCE CHAR (256);

    SQLSOURCE = 'DELETE FROM BRANCH WHERE BRANCH_CITY = ''PERRYRIDGE''';

    $PREPARE SQLOBJ FROM SQLSOURCE;
    $EXECUTE SQLOBJ;

The PREPARE statement passes the SQLSOURCE string to the RDS precompiler, which goes through its normal process of parsing, optimization and code generation, and builds a machine-language version of the statement called SQLOBJ. The EXECUTE statement causes this machine-language routine to be executed, and thus causes the actual deletions to occur. Once PREPAREd, a given dynamically generated SQL statement can be EXECUTEd many times. The generated statement can be replaced by another by issuing PREPARE again with the same target and a different source.
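The idea of building a statement as a string at run time and then executing it repeatedly can be sketched in Python with the standard sqlite3 module. This is only an analogy: sqlite3 has no separate PREPARE and EXECUTE statements (the statement text is compiled when it is executed), and the helper `build_delete`, the sample rows and the whitelist of column names are assumptions introduced for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_name TEXT, branch_city TEXT, assets INTEGER)")
conn.executemany(
    "INSERT INTO branch VALUES (?, ?, ?)",
    [  # made-up sample rows
        ("Downtown", "Brooklyn", 900),
        ("Perryridge", "Horseneck", 1700),
        ("Redwood", "Palo Alto", 2100),
    ],
)

def build_delete(table, column):
    """Hypothetical helper: build the text of a DELETE statement at run time.

    Table and column names cannot be passed as ? parameters, so they are
    checked against a fixed whitelist before being placed in the string.
    """
    allowed = {"branch": {"branch_name", "branch_city"}}
    if column not in allowed.get(table, set()):
        raise ValueError("unknown table or column")
    return f"DELETE FROM {table} WHERE {column} = ?"

# The statement text exists only at run time, like SQLSOURCE above.
sql_source = build_delete("branch", "branch_city")

# The same generated statement is executed several times with different values.
deleted = []
for city in ("Brooklyn", "Palo Alto"):
    cur = conn.execute(sql_source, (city,))
    deleted.append(cur.rowcount)   # number of rows removed by this execution

remaining = [row[0] for row in conn.execute("SELECT branch_name FROM branch")]
print(deleted, remaining)
```

Passing the city as a `?` parameter rather than pasting it into the string keeps user input out of the statement text, which matters when the statement is assembled from terminal input.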

QUERY-BY-EXAMPLE

Query-by-Example (QBE) is the name of both a data-manipulation language and the database system that included this language. The QBE database system was developed at IBM's T. J. Watson Research Center in the early 1970s. Today, some database systems for personal computers support variants of the QBE language. QBE has two distinctive features:

1. Unlike most query languages and programming languages, QBE has a two-dimensional syntax: queries look like tables. A query in a one-dimensional language can be written in one (possibly long) line. A two-dimensional language requires two dimensions for its expression.

2. QBE queries are expressed "by example". Instead of giving a procedure for obtaining the desired answer, the user gives an example of what is desired. The system generalizes this example to compute the answer to the query.

We express queries in QBE using skeleton tables. These tables show the relation schema, as shown below.

Example: the skeleton table for the branch relation

    branch | branch-name | branch-city | assets
    -------+-------------+-------------+-------
           |             |             |

Retrieval operations

Queries on one relation

Examples:

1. Find all loan numbers at the Perryridge branch.

    loan | branch-name | loan-number | amount
    -----+-------------+-------------+-------
         | Perryridge  | P._x        |

The preceding query causes the system to look for tuples in loan that have "Perryridge" as the value of the branch-name attribute. For each such tuple, the value of the loan-number attribute is assigned to the variable x. The value of the variable x is "printed" (displayed), because the command P. appears in the loan-number column next to the variable x. QBE assumes that a blank position in a row contains a unique variable. As a result, if a variable does not appear more than once in a query, it may be omitted.

Thus the previous query can be rewritten as

    loan | branch-name | loan-number | amount
    -----+-------------+-------------+-------
         | Perryridge  | P.          |

QBE performs duplicate elimination automatically. To suppress duplicate elimination, we insert the command ALL. after the P. command:

    loan | branch-name | loan-number | amount
    -----+-------------+-------------+-------
         | Perryridge  | P.ALL.      |

To display the entire loan relation, we can create a single row with P. in every field. As a shorthand, a single P. can be placed in the column headed by the relation name:

    loan | branch-name | loan-number | amount
    -----+-------------+-------------+-------
    P.   |             |             |

QBE allows queries that involve arithmetic comparisons.

Examples

1. Find the loan numbers of all loans with a loan amount of more than $700.

    loan | branch-name | loan-no. | amount
    -----+-------------+----------+-------
         |             | P.       | >700

The comparison operators that QBE supports are =, <, ≤, >, ≥ and ¬ (not equal).

2. Find the names of all branches that are not located in Brooklyn.

    branch | branch-name | branch-city | assets
    -------+-------------+-------------+-------
           | P.          | ¬Brooklyn   |

3. Find the loan numbers of all loans made jointly to Smith and Jones.

    borrower | customer-name | loan-no.
    ---------+---------------+---------
             | 'Smith'       | P._x
             | 'Jones'       | _x

4. Find the loan numbers of all loans made to Smith, to Jones, or to both jointly.

    borrower | customer-name | loan-no.
    ---------+---------------+---------
             | 'Smith'       | P._x
             | 'Jones'       | P._y

5. Find all customers who live in the same city as Jones.

    customer | customer-name | customer-street | customer-city
    ---------+---------------+-----------------+--------------
             | P._x          |                 | _y
             | Jones         |                 | _y
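For readers more familiar with SQL, the single-relation QBE queries above correspond roughly to the SELECT statements in the following sketch, which uses Python's standard sqlite3 module. The table contents are made-up sample rows and the variable names are assumptions. Note how a constant in a QBE column becomes a WHERE condition, and how a variable such as _x shared by two rows becomes an equality between two copies of the relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (branch_name TEXT, loan_number TEXT, amount INTEGER)")
conn.execute("CREATE TABLE borrower (customer_name TEXT, loan_number TEXT)")
conn.executemany(
    "INSERT INTO loan VALUES (?, ?, ?)",
    [("Perryridge", "L-15", 1500), ("Perryridge", "L-16", 600), ("Downtown", "L-17", 1000)],
)
conn.executemany(
    "INSERT INTO borrower VALUES (?, ?)",
    [("Smith", "L-15"), ("Jones", "L-15"), ("Smith", "L-17")],
)

def column(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# "Perryridge" in branch-name, P. in loan-number (QBE removes duplicates, hence DISTINCT).
perryridge_loans = column(
    "SELECT DISTINCT loan_number FROM loan "
    "WHERE branch_name = 'Perryridge' ORDER BY loan_number"
)

# P. in loan-no., >700 in amount.
large_loans = column(
    "SELECT DISTINCT loan_number FROM loan WHERE amount > 700 ORDER BY loan_number"
)

# Two rows sharing the variable _x: both rows must agree on loan-no.
joint_loans = column(
    "SELECT DISTINCT b1.loan_number FROM borrower b1 "
    "JOIN borrower b2 ON b1.loan_number = b2.loan_number "
    "WHERE b1.customer_name = 'Smith' AND b2.customer_name = 'Jones'"
)

print(perryridge_loans, large_loans, joint_loans)
```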

Queries on several relations

QBE allows queries that span several different relations. The connections among the various relations are achieved through variables that force certain tuples to have the same value on certain attributes.

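Examples

1. Find the names of all customers who have a loan from the Perryridge branch.

    loan | branch-name | loan-no. | amount
    -----+-------------+----------+-------
         | Perryridge  | _x       |

    borrower | customer-name | loan-no.
    ---------+---------------+---------
             | P._y          | _x

2. Find the names of all customers who have both an account and a loan at the bank.

    depositor | customer-name | account-no.
    ----------+---------------+------------
              | P._x          |

    borrower | customer-name | loan-no.
    ---------+---------------+---------
             | _x            |

3. Find the names of all customers who have an account at the bank, but who do not have a loan from the bank. The ¬ under the relation name negates the whole borrower row.

    depositor | customer-name | account-no.
    ----------+---------------+------------
              | P._x          |

    borrower | customer-name | loan-no.
    ---------+---------------+---------
    ¬        | _x            |

4. Find all customers who have at least two accounts.

    depositor | customer-name | account-no.
    ----------+---------------+------------
              | P._x          | _y
              | _x            | ¬_y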

The condition box

It is not always convenient, or even possible, to express all the constraints on the domain variables within the skeleton tables. To overcome this, QBE includes a condition box feature that allows the expression of general constraints over any of the domain variables.

Examples:

1. Find all customers who are not named 'Jones' and who have at least two accounts.

    depositor | customer-name | account-no.
    ----------+---------------+------------
              | P._x          | _y
              | _x            | ¬_y

    conditions
    -----------
    _x ¬= Jones

2. Find all account numbers with a balance between $1300 and $1500.

    account | branch-name | account-no. | balance
    --------+-------------+-------------+--------
            |             | P.          | _x

    conditions
    ----------
    _x ≥ 1300
    _x ≤ 1500

3. Find all branches that have assets greater than those of at least one branch located in Brooklyn.

    branch | branch-name | branch-city | assets
    -------+-------------+-------------+-------
           | P._x        |             | _y
           |             | Brooklyn    | _z

    conditions
    ----------
    _y > _z

Options available with the condition box

1. QBE allows complex arithmetic expressions to appear in a condition box.

Example: Find all branches that have assets that are at least twice as large as the assets of one of the branches located in Brooklyn.

    branch | branch-name | branch-city | assets
    -------+-------------+-------------+-------
           | P._x        |             | _y
           |             | Brooklyn    | _z

    conditions
    ------------
    _y ≥ 2 * _z

2. QBE allows logical expressions to appear in a condition box. The operators used are and (&) and or (|).

Example: Find all account numbers with a balance between $1300 and $2000 but not exactly $1500.

    account | branch-name | account-no. | balance
    --------+-------------+-------------+--------
            |             | P.          | _x

    conditions
    -----------------------------------
    _x = (≥ 1300 and ≤ 2000 and ¬1500)

The result relation

If the result of a query includes attributes from several relation schemas, we need a mechanism to display the desired result in a single table.

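Example: Find the customer-name, account-no. and balance for all accounts at the Perryridge branch.

In relational algebra we would:

1. Join the depositor and account relations.
2. Project customer-name, account-no. and balance.

In QBE we proceed as follows:

1. Create a skeleton table called result with the attributes customer-name, account-no. and balance.
2. Write the query.

    account | branch-name | account-no. | balance
    --------+-------------+-------------+--------
            | Perryridge  | _y          | _z

    depositor | customer-name | account-no.
    ----------+---------------+------------
              | _x            | _y

    result | customer-name | account-no. | balance
    -------+---------------+-------------+--------
    P.     | _x            | _y          | _z

Ordering of the display of tuples

By using the commands AO. (ascending order) and DO. (descending order) we can control the order in which tuples are displayed.

Example

1. List all customers in descending alphabetical order.

    depositor | customer-name | account-no.
    ----------+---------------+------------
              | P.DO.         |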

Aggregate functions [Built-in functions]

QBE includes the aggregate operators AVG, MAX, MIN, SUM and CNT. We must postfix these operators with ALL. to create a multiset on which the aggregate operation is evaluated.

Examples

1. Find the total balance of all the accounts maintained at the Perryridge branch.

    account | branch-name | account-no. | balance
    --------+-------------+-------------+-----------
            | Perryridge  |             | P.SUM.ALL.

2. Find the total number of customers who have an account at the bank. The UNQ. operator specifies that duplicates are to be eliminated before counting.

    depositor | customer-name  | account-no.
    ----------+----------------+------------
              | P.CNT.UNQ.ALL. |

3. Find the name, street and city of all customers who have more than one account at the bank. The G. operator groups the depositor tuples by customer name.

    customer | cust-name | cust-street | cust-city
    ---------+-----------+-------------+----------
    P.       | _x        |             |

    depositor | cust-name | account-no.
    ----------+-----------+------------
              | G._x      | CNT.ALL._y

    conditions
    --------------
    CNT.ALL._y > 1

Update operations/Modification of the database

This section deals with how to add, remove or change information using QBE.

Deletion

Deletion of tuples from a relation is expressed in much the same way as a query. The major difference is the use of D. in place of P. In QBE we can delete whole tuples, as well as values in selected columns. To delete information in only some of the columns, null values, specified by -, are inserted.

D. operates on only one relation. To delete tuples from several relations, we must use one D. operator for each relation.

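* Delete customer Smith.

    customer | cust-name | cust-street | cust-city
    ---------+-----------+-------------+----------
    D.       | Smith     |             |

* Delete the branch-city value of the branch whose name is "Perryridge".

    branch | branch-name | branch-city | assets
    -------+-------------+-------------+-------
           | Perryridge  | D.          |

* Delete all loans with a loan amount between $1300 and $1500.

    loan | branch-name | loan-no. | amount
    -----+-------------+----------+-------
    D.   |             | _y       | _x

    borrower | cust-name | loan-no.
    ---------+-----------+---------
    D.       |           | _y

    conditions
    -------------------------
    _x = (≥ 1300 and ≤ 1500)

* Delete all accounts at all branches located in Brooklyn.

    account | branch-name | account-no. | balance
    --------+-------------+-------------+--------
    D.      | _x          | _y          |

    depositor | cust-name | account-no.
    ----------+-----------+------------
    D.        |           | _y

    branch | branch-name | branch-city | assets
    -------+-------------+-------------+-------
           | _x          | Brooklyn    |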

Insertion

We perform an insertion by placing the I. operator in the query expression. The attribute values of the inserted tuples must be members of the attributes' domains.

Examples

* To insert into the branch relation information about a new branch with name "Capital" and city "Queens", but with a null asset value, we write

    branch | branch-name | branch-city | assets
    -------+-------------+-------------+-------
    I.     | Capital     | Queens      |

* To insert the fact that account A-9732 at the Perryridge branch has a balance of $700, we write

    account | branch-name | account-no. | balance
    --------+-------------+-------------+--------
    I.      | Perryridge  | A-9732      | 700

Updates

If we want to change one value in a tuple without changing all the values in the tuple, we use the update facility; the operator used is U. QBE does not, however, allow users to update the primary key fields.

Example: Update the asset value of the Perryridge branch to $10,000,000.

    branch | branch-name | branch-city | assets
    -------+-------------+-------------+-----------
           | Perryridge  |             | U.10000000

The query updates the assets of the Perryridge branch to $10,000,000 regardless of the old value. If we want to update a value using the previous value, we must express the request using two rows: one specifying the old tuples that need to be updated, and the other indicating the new, updated tuples to be inserted in the database.

Example: Interest payments are being made, and all balances are to be increased by 5%.

    account | branch-name | account-no. | balance
    --------+-------------+-------------+----------
    U.      |             |             | _x * 1.05
            |             |             | _x

QBE Dictionary

QBE has a built-in dictionary that is presented to the user as a collection of tables. The dictionary includes, for example, a TABLE table and a DOMAIN table, giving details of all tables and all domains currently known to the system. The dictionary tables can be interrogated using the ordinary retrieval operations of the DML.

Retrieval of table-names

Get the names of all tables known to the system.

    P. |     |     |
    ---+-----+-----+
       |     |     |

Instead of having to build a skeleton for the TABLE table and enter "P." in the NAME column of that skeleton, the user can formulate this query simply by entering "P." in the table-name position of a blank table.

Retrieval of column-names for a given table

Get the names of all columns in table S.

    S P. |     |     |
    -----+-----+-----+
         |     |     |

The user enters the table-name (S) followed by "P." against the row of (blank) column-names.

Creation of a new table

1. Create table branch.

    I. branch I. | branch name | branch city | branch street
    -------------+-------------+-------------+--------------
                 |             |             |

The first I. creates a dictionary entry for the table branch; the second I. creates dictionary entries for the columns of the table branch. The information for each column must also be specified. This information includes the name of the underlying domain and, if that domain is not already known to QBE, the data type of the domain.

Dropping a table

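Drop table branch.

A table can be dropped only if it is currently empty, so two steps are needed.

1) Delete all branch details.

    branch | branch name | branch city | branch street
    -------+-------------+-------------+--------------
    D.     |             |             |

2) Drop the table.

    D. branch | branch name | branch city | branch street
    ----------+-------------+-------------+--------------
              |             |             |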

Expanding a table

Add an assets column to the table branch.

QBE does not directly support the dynamic addition of a new column to an existing table unless that table is currently empty.

So the following steps should be followed.

1) Define a new table of the same shape as the existing table, plus the new column.
2) Load the new table from the old one using a multiple-record insert.
3) Delete all data from the old table.
4) Drop the old table.
5) Change the name of the new table to that of the old table.
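The five steps above are not tied to QBE; the same copy-and-rename approach can be written in SQL. The sketch below, using Python's standard sqlite3 module, is only an illustration under stated assumptions: the temporary table name `branch_new`, the column types and the sample rows are invented here, and SQLite itself could add the column directly, which the sketch deliberately does not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_name TEXT, branch_city TEXT, branch_street TEXT)")
conn.executemany(
    "INSERT INTO branch VALUES (?, ?, ?)",
    [("Perryridge", "Horseneck", "Main"), ("Downtown", "Brooklyn", "Park")],  # made-up rows
)

# 1) Define a new table of the same shape as the existing one, plus the new column.
conn.execute(
    "CREATE TABLE branch_new "
    "(branch_name TEXT, branch_city TEXT, branch_street TEXT, assets INTEGER)"
)

# 2) Load the new table from the old one with a multiple-record insert.
conn.execute(
    "INSERT INTO branch_new (branch_name, branch_city, branch_street) "
    "SELECT branch_name, branch_city, branch_street FROM branch"
)

# 3) Delete all data from the old table (QBE requires it to be empty before dropping).
conn.execute("DELETE FROM branch")

# 4) Drop the old table.
conn.execute("DROP TABLE branch")

# 5) Change the name of the new table to that of the old table.
conn.execute("ALTER TABLE branch_new RENAME TO branch")
conn.commit()

columns = [row[1] for row in conn.execute("PRAGMA table_info(branch)")]
rows = conn.execute("SELECT branch_name, assets FROM branch ORDER BY branch_name").fetchall()
print(columns, rows)
```

The existing rows keep their values and receive a null in the new assets column.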

Normalization

Introduction

Normalization theory is built around the concept of normal forms. A relation is said to be in a particular normal form if it satisfies a certain specified set of constraints. For example, a relation is said to be in first normal form if and only if it satisfies the constraint that it contains atomic values only. The various normal forms include First Normal Form, Second Normal Form, Third Normal Form, BCNF and DKNF. The need for normalization arises when we design a relational database that should be free of unnecessary redundancy and should allow easy retrieval of information. If we want to design such a database, we use normalization.

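For the description of normalization, we shall consider the supplier-and-parts database. The relations of the database are as follows:

    S
    S# | Sname | Status | City
    ---+-------+--------+-------
    S1 | Smith | 20     | London
    S2 | Jones | 10     | Paris
    S3 | Blake | 30     | Paris
    S4 | Clark | 20     | London
    S5 | Adams | 30     | Athens

    P
    P# | Pname | Color | Weight | City
    ---+-------+-------+--------+-------
    P1 | Nut   | Red   | 12     | London
    P2 | Bolt  | Green | 17     | Paris
    P3 | Screw | Blue  | 17     | Rome
    P4 | Screw | Red   | 14     | London
    P5 | Cam   | Blue  | 12     | Paris
    P6 | Cog   | Red   | 19     | London

    SP
    S# | P# | QTY
    ---+----+----
    S1 | P1 | 300
    S1 | P2 | 200
    S1 | P3 | 400
    S1 | P4 | 200
    S1 | P5 | 100
    S1 | P6 | 100
    S2 | P1 | 300
    S2 | P2 | 400
    S3 | P2 | 200
    S4 | P2 | 200
    S4 | P4 | 300
    S4 | P5 | 400

FIG:1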

Functional Dependency

Definition:

Given a relation R, attribute Y of R is functionally dependent on attribute X of R if and only if each X-value in R has associated with it precisely one Y-value in R.

In the supplier-and-parts database, the attributes SNAME, STATUS and CITY of relation S are each functionally dependent on attribute S#: for a particular value of S# there exists precisely one corresponding value for each of SNAME, STATUS and CITY.

    S.S# → S.SNAME
    S.S# → S.STATUS
    S.S# → S.CITY

Or we can represent this as

    S.S# → S.(SNAME, STATUS, CITY)

The statement S.S# → S.CITY is read as "attribute S.CITY is functionally dependent on attribute S.S#", or "attribute S.S# functionally determines attribute S.CITY".

Alternate definition for functional dependence

Given a relation R, attribute Y of R is functionally dependent on attribute X of R if and only if, whenever two tuples of R agree on their X-value, they also agree on their Y-value.

    S# | P# | Qty | Status
    ---+----+-----+-------
    S1 | P1 | 300 | 20
    S1 | P2 | 200 | 20
    S1 | P3 | 400 | 20
    S1 | P4 | 100 | 20

Fig: Partial tabulation of relation SP'.

For example, in this relation SP' we have

    SP'.S# → SP'.STATUS
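The alternate definition translates almost directly into a check on a table of data. The Python sketch below is a minimal illustration; the function name `satisfies_fd` and the dictionary representation of tuples are assumptions made here. Keep in mind that a functional dependency is a statement about every legal value of the relation, so a check on one tabulation can show that a dependency is violated, but it can never prove that the dependency holds in general.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if, in this tabulation, tuples that agree on the
    attributes in lhs also agree on the attributes in rhs."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x_value = tuple(row[a] for a in lhs)
        y_value = tuple(row[a] for a in rhs)
        if seen.setdefault(x_value, y_value) != y_value:
            return False  # same X-value, different Y-value
    return True

# The partial tabulation of SP' from the text.
sp_prime = [
    {"S#": "S1", "P#": "P1", "QTY": 300, "STATUS": 20},
    {"S#": "S1", "P#": "P2", "QTY": 200, "STATUS": 20},
    {"S#": "S1", "P#": "P3", "QTY": 400, "STATUS": 20},
    {"S#": "S1", "P#": "P4", "QTY": 100, "STATUS": 20},
]

status_on_s = satisfies_fd(sp_prime, ["S#"], ["STATUS"])   # S# → STATUS
qty_on_s = satisfies_fd(sp_prime, ["S#"], ["QTY"])         # S# → QTY
qty_on_key = satisfies_fd(sp_prime, ["S#", "P#"], ["QTY"]) # (S#, P#) → QTY
print(status_on_s, qty_on_s, qty_on_key)
```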

A functional dependence is a special form of integrity constraint. For example, when we say that relation S satisfies the FD S.S# → S.CITY, we mean that every legal extension (tabulation) of that relation satisfies the constraint. It is convenient to represent the FDs in a given set of relations by means of a functional dependency diagram.

Example:

    S:   S# → SNAME, STATUS, CITY
    P:   P# → PNAME, COLOR, WEIGHT, CITY
    SP:  (S#, P#) → QTY

Fig: Functional dependencies in relations S, P, SP.

Various Normal Forms

Brief description of Normal forms

First Normal Form

* Eliminates repetition of data; that is, it converts each data value to its atomic form.
* No two rows should be identical.
* Each table entry should be single valued.
* Every table has a primary key, which is a unique label or identifier for each row.

Second Normal Form

* Requires taking out data that is dependent on only a part of the key.
* Each non-key attribute is functionally dependent on the entire key.

Third Normal form

* Involves getting rid of anything in the tables that does not depend solely on the primary key.
* 3NF is sometimes characterized as "the key, the whole key, and nothing but the key".

First Normal Form

Definition:

A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only.

A relation that is only in first normal form has a structure that is undesirable for a number of reasons.

For example:

Let us assume that information concerning suppliers and shipments, rather than being split into two separate relations (S and SP), is combined into a single relation called FIRST, with fields (S#, STATUS, CITY, P#, QTY).

Here S# is the supplier number, STATUS is the supplier's status value, CITY is the city in which the supplier is located, P# is the part number, and QTY is the quantity supplied.

We introduce the additional constraint that STATUS is functionally dependent on CITY. The meaning of this constraint is that a supplier's status is determined by the supplier's location: for example, all London suppliers must have a status of 20. We also ignore the attribute SNAME for simplicity. The primary key of FIRST is the combination (S#, P#). The functional dependency diagram for this relation is as follows.

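    (S#, P#) → QTY
    S# → CITY
    S# → STATUS
    CITY → STATUS

Fig: Functional dependencies in the relation FIRST

In the diagram

i) STATUS and CITY are not fully functionally dependent on the primary key: they depend on S# alone, which is only a part of the key (S#, P#).

ii) STATUS and CITY are not mutually independent: STATUS depends on CITY.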

The FIRST relation causes difficulties with each of the three update operations (insert, delete and update). They are explained below.

Insert: We cannot enter the fact that a particular supplier is located in a particular city until that supplier supplies at least one part. The following is the tabulation of FIRST.

    S# | STATUS | CITY   | P# | QTY
    ---+--------+--------+----+----
    S1 | 20     | London | P1 | 300
    S1 | 20     | London | P2 | 200
    S1 | 20     | London | P3 | 400
    S1 | 20     | London | P4 | 200
    S1 | 20     | London | P5 | 100
    S1 | 20     | London | P6 | 100
    S2 | 10     | Paris  | P1 | 300
    S2 | 10     | Paris  | P2 | 400
    S3 | 10     | Paris  | P2 | 200
    S4 | 20     | London | P2 | 200
    S4 | 20     | London | P4 | 300
    S4 | 20     | London | P5 | 400

Table: FIRST

The FIRST relation does not show that supplier S5 is located in Athens, because until S5 supplies some part we have no appropriate primary key value.

Deletion: If we delete the only FIRST tuple for a particular supplier, we destroy not only the shipment connecting that supplier to some part but also the information that the supplier is located in a particular city.

For example, if we delete the FIRST tuple with S# value S3 and P# value P2, we lose the information that S3 is located in Paris.

Updation: The city value for a given supplier appears in FIRST many times. This redundancy causes update problems.

For example, if supplier S1 moves from London to Amsterdam, we are faced with either the problem of searching the FIRST relation to find every tuple connecting S1 and London, or the possibility of producing an inconsistent result (the city for S1 may be given as Amsterdam in one tuple and London in another).

The solution to these problems is to replace the relation FIRST by the two relations SECOND (S#, STATUS, CITY) and SP (S#, P#, QTY). The functional dependency diagrams for these two relations are as shown here.

    SECOND:  S# → STATUS
             S# → CITY
             CITY → STATUS

    SP:      (S#, P#) → QTY

Fig: Functional dependencies in the relations SECOND and SP.

The following tables show the sample tabulations corresponding to the data values of FIG:1, except that information for supplier S5 has been included in SECOND (but not in SP).

    SECOND
    S# | Status | City
    ---+--------+-------
    S1 | 20     | London
    S2 | 10     | Paris
    S3 | 10     | Paris
    S4 | 20     | London
    S5 | 30     | Athens

    SP
    S# | P# | QTY
    ---+----+----
    S1 | P1 | 300
    S1 | P2 | 200
    S1 | P3 | 400
    S1 | P4 | 200
    S1 | P5 | 100
    S1 | P6 | 100
    S2 | P1 | 300
    S2 | P2 | 400
    S3 | P2 | 200
    S4 | P2 | 200
    S4 | P4 | 300
    S4 | P5 | 400

Fig: Sample tabulations of SECOND and SP.

With the tables built as shown, we overcome the difficulties of the FIRST relation, and the update operations on the tables become straightforward. This completes the discussion of first normal form.

SECOND NORMAL FORM:

DEFINITION: A relation R is in second normal form (2NF) if and only if it is in 1NF and every nonkey attribute is fully dependent on the primary key.

Relations SECOND and SP are both in 2NF (the primary keys are S# and the combination (S#, P#), respectively). Relation FIRST is not in 2NF. A relation that is in first normal form and not in second can always be reduced to an equivalent collection of 2NF relations. The reduction consists of replacing the relation by suitable projections. The collection of these projections is equivalent to the original relation, in the sense that the original relation can always be recovered by taking the natural join of these projections, so no information is lost in the process. In other words, the process is reversible.

In our example, SECOND and SP are projections of FIRST, and FIRST is the natural join of SECOND and SP over S#.

The reduction of FIRST to the pair (SECOND, SP) is an example of nonloss decomposition. In general, given a relation R with (possibly composite) attributes A, B, C satisfying the FD R.A → R.B, R can always be "nonloss-decomposed" into its projections R1 (A, B) and R2 (A, C). Since no information is lost in the reduction process, any information that can be derived from the original structure can also be derived from the new structure. The converse is not true, however: the new structure may contain information (such as the fact that S5 is located in Athens) that could not be represented in the original. In this sense the new structure is a slightly more faithful reflection of the real world.
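The claim that FIRST can be recovered as the natural join of its projections can be checked on the sample data. The Python sketch below represents each relation as a set of tuples; the variable names are assumptions made here. For contrast it also tries a different pair of projections, joined over CITY, which is not justified by any FD of the form used above and which produces spurious tuples.

```python
# The tabulation of FIRST from the text: (S#, STATUS, CITY, P#, QTY).
first = {
    ("S1", 20, "London", "P1", 300), ("S1", 20, "London", "P2", 200),
    ("S1", 20, "London", "P3", 400), ("S1", 20, "London", "P4", 200),
    ("S1", 20, "London", "P5", 100), ("S1", 20, "London", "P6", 100),
    ("S2", 10, "Paris", "P1", 300), ("S2", 10, "Paris", "P2", 400),
    ("S3", 10, "Paris", "P2", 200), ("S4", 20, "London", "P2", 200),
    ("S4", 20, "London", "P4", 300), ("S4", 20, "London", "P5", 400),
}

# Projections corresponding to SECOND (S#, STATUS, CITY) and SP (S#, P#, QTY).
second = {(s, status, city) for (s, status, city, p, qty) in first}
sp = {(s, p, qty) for (s, status, city, p, qty) in first}

# Natural join of SECOND and SP over S#.
rejoined = {
    (s, status, city, p, qty)
    for (s, status, city) in second
    for (s2, p, qty) in sp
    if s == s2
}
nonloss = rejoined == first

# A different decomposition, for contrast: (S#, STATUS, CITY) and (CITY, P#, QTY),
# joined over CITY. Every original tuple comes back, but so do spurious ones.
city_part = {(city, p, qty) for (s, status, city, p, qty) in first}
joined_over_city = {
    (s, status, city, p, qty)
    for (s, status, city) in second
    for (city2, p, qty) in city_part
    if city == city2
}
spurious = joined_over_city - first
print(nonloss, len(spurious) > 0)
```

In the second decomposition the join can no longer tell which London supplier shipped which part, so information has in effect been lost even though no tuple has disappeared.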

The SECOND/SP structure still causes problems, however. Relation SP is satisfactory; as a matter of fact, relation SP is now in third normal form, and we shall ignore it for the remainder of this section. Relation SECOND, on the other hand, still suffers from a lack of mutual independence among its nonkey attributes. The dependency diagram for SECOND is still more complex than a 3NF diagram. To be specific, the dependency of STATUS on S#, though it is functional, is transitive (via CITY): each S# value determines a CITY value, and this in turn determines the STATUS value. This transitivity leads, once again, to difficulties over update operations. (We now concentrate on the association between cities and status values, i.e., on the functional dependency of STATUS on CITY.)

INSERTING: We cannot enter the fact that a particular city has a particular status value (for example, we cannot state that any supplier in Rome must have a status of 50) until we have some supplier located in that city. The reason is, again, that until such a supplier exists we have no appropriate primary key value.

DELETING: If we delete the only SECOND tuple for a particular city, we destroy not only the information for the supplier concerned but also the information that the city has that particular status value. For example, if we delete the SECOND tuple for S5, we lose the information that the status for Athens is 30.

UPDATING: The status value for a given city appears in SECOND many times. Thus, if we need to change the status value for London from 20 to 30, we are faced with either the problem of searching the SECOND relation to find every tuple for London or the possibility of producing an inconsistent result.

The solution to these problems is to replace the original relation (SECOND) by the two projections SC (S#, CITY) and CS (CITY, STATUS). The corresponding functional dependency diagram is shown here.

    SC:  S# → CITY
    CS:  CITY → STATUS

The tabulations corresponding to these are

    SC
    S# | City
    ---+-------
    S1 | London
    S2 | Paris
    S3 | Paris
    S4 | London
    S5 | Athens

    CS
    City   | Status
    -------+-------
    Athens | 30
    London | 20
    Paris  | 10

Fig:2 Sample tabulations of SC and CS.

It should be clear that this new structure overcomes all the problems over update operations concerning the CITY-STATUS association.

Third Normal Form

Definition: A relation R is in third normal form (3NF) if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.

Relations SC and CS (shown in Fig:2) are both in 3NF; relation SECOND (tabulated earlier) is not in 3NF. A relation that is in second normal form and not in third can always be reduced to an equivalent collection of 3NF relations.

Relations with more than one candidate key or BCNF (Boyce-Codd normal form)

Definition:

A relation R is in BCNF if and only if every determinant is a candidate key. (A determinant is any attribute, possibly composite, on which some other attribute is fully functionally dependent.)

The objective of BCNF is to handle a relation having two or more composite and overlapping candidate keys. Although BCNF is stronger than 3NF, it is still true that any relation can be decomposed in a nonloss way into an equivalent collection of BCNF relations.

Relation FIRST contains three determinants: S#, CITY and the combination (S#, P#). Among these, (S#, P#) alone is a candidate key; hence FIRST is not in BCNF.

Relation SECOND is also not in BCNF because the determinant CITY is not a candidate key.

Relations SP, SC and CS are in BCNF because in each case the primary key is the only determinant in the relation.
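The BCNF test can be mechanized once the functional dependencies of a relation are written down. The Python sketch below is one way to do so, under stated assumptions: each FD is given with a minimal left-hand side (so the left-hand sides are exactly the determinants), and the helper names `closure` and `is_bcnf` are introduced here for illustration. A determinant is then a candidate key exactly when the attributes it determines, directly or through other FDs, make up the whole relation.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(attributes, fds):
    """True if the left side of every FD determines all attributes.

    Assumes each FD has a minimal left side, so that "determines everything"
    is the same as "is a candidate key".
    """
    return all(closure(lhs, fds) == set(attributes) for lhs, _ in fds)

# FDs of the relations discussed in the text; each FD is (left side, right side).
first_fds = [(("S#", "P#"), ("QTY",)), (("S#",), ("CITY",)), (("CITY",), ("STATUS",))]
second_fds = [(("S#",), ("CITY",)), (("CITY",), ("STATUS",))]

checks = {
    "FIRST": is_bcnf(["S#", "STATUS", "CITY", "P#", "QTY"], first_fds),
    "SECOND": is_bcnf(["S#", "STATUS", "CITY"], second_fds),
    "SP": is_bcnf(["S#", "P#", "QTY"], [(("S#", "P#"), ("QTY",))]),
    "SC": is_bcnf(["S#", "CITY"], [(("S#",), ("CITY",))]),
    "CS": is_bcnf(["CITY", "STATUS"], [(("CITY",), ("STATUS",))]),
}
print(checks)
```

The results agree with the text: FIRST and SECOND fail because S# (in FIRST) and CITY (in both) are determinants that do not determine every attribute, while SP, SC and CS pass.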

Example involving two disjoint (non-overlapping) candidate keys. Let us consider the relation S (S#, SNAME, STATUS, CITY), in which supplier names are unique, so that S# and SNAME are both candidate keys. The relation S is in BCNF. However, it is desirable to specify both keys in the definition of the relation:

a) To inform the DBMS, so that it may enforce the constraints implied by the two-way dependency between the two keys, namely that corresponding to each supplier number there exists a unique supplier name, and conversely.

b) To inform the users, since the uniqueness of the two attributes is an aspect of the semantics of the relation and is therefore of interest to people using it.

Examples where the candidate keys overlap. Two candidate keys overlap if they involve two or more attributes each and have an attribute in common.

1) We suppose again that supplier names are unique, and we consider the relation SSP (S#, SNAME, P#, QTY). The keys are (S#, P#) and (SNAME, P#). This relation is not in BCNF because we have two determinants, S# and SNAME, which are not keys for the relation (S# determines SNAME, and conversely). But the relation is in 3NF under the definition given earlier: a relation R is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key. That definition does not require an attribute to be fully dependent on the primary key if the attribute is itself a component of some other key of the relation, and so the fact that SNAME is not fully dependent on (S#, P#) is ignored. But this fact leads to redundancy and hence to update problems in the relation SSP. For example, changing the name of supplier S1 from Smith to Robinson leads either to search problems or to possibly inconsistent results. The solution to the problems, as usual, is to decompose the relation SSP into two projections, in this case SS (S#, SNAME) and SP (S#, P#, QTY), or alternatively SS (S#, SNAME) and SP (SNAME, P#, QTY). These projections are all in BCNF.

2) Second example;Consider the relation SJT with attributes S(student),J(subject) and T(teacher).The meaning of an SJT tuple is that the specified student is taught the specified subject by the specified teacher. The semantic rules follow:

1.Only one teacher teaches each student of thet subject2.Each teacher teaches only one subject3.Several tachers teach each subject.

The sample tabulation of this relation is as follows

67

Page 68: DBMS

SJT

S J TSmithSmithJonesJones

MathPhysicsMathPhysics

Prof.whiteProf.GreenProf.WhiteProf.Brown

The functional dependencies of SJT are:From the first semantic rule we have functional dependency of T on the composite attributes (S, J).Form the second semantic rule we have a functional dependency of J on T.From the third semantic rule it is understood that there is no functional dependency of T on J.So the diagram is as follows

Fig: Functional dependencies in the relation SJT.

Here again we have two overlapping candidate keys: the combination (S, J) and the combination (S, T). Once again the relation is in 3NF but not in BCNF, and once again it suffers from certain anomalies in connection with update operations. For example, if we wish to delete the information that Jones is studying Physics, we cannot do so without at the same time losing the information that Prof. Brown teaches Physics.

The difficulties are caused by the fact that T is a determinant but not a candidate key. Again we can get over the problem by replacing the original relation by two BCNF projections, in this case ST (S, T) and TJ (T, J).

Finally, the concept of BCNF eliminates certain problem cases that could occur under the old definition of 3NF. Moreover, BCNF is conceptually simpler than 3NF, in that it involves no reference to the concepts of primary key, transitive dependence and full dependence. The reference to candidate keys can also be replaced by a reference to the more fundamental notion of functional dependence.
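The BCNF test described above (every determinant must be a candidate key) can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the helper names closure, candidate_keys and bcnf_violations are made up for illustration. It applies the test to SJT with the two dependencies stated in the text.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            is_superkey = closure(combo, fds) == set(attributes)
            is_minimal = not any(set(k) <= set(combo) for k in keys)
            if is_superkey and is_minimal:
                keys.append(combo)
    return keys

def bcnf_violations(attributes, fds):
    """Nontrivial FDs whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and closure(lhs, fds) != set(attributes)]

# SJT with the dependencies from the text: (S, J) -> T and T -> J.
SJT = {"S", "J", "T"}
SJT_FDS = [(("S", "J"), ("T",)), (("T",), ("J",))]

sjt_keys = candidate_keys(SJT, SJT_FDS)
sjt_bad = bcnf_violations(SJT, SJT_FDS)
tj_bad = bcnf_violations({"T", "J"}, [(("T",), ("J",))])
print(sjt_keys, sjt_bad, tj_bad)
```

Under these assumptions the sketch reports the two overlapping keys (S, J) and (S, T), flags T → J as the offending dependency in SJT, and finds no violation in the projection TJ (T, J).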

Good and Bad decompositions

During the reduction process it is frequently the case that a given relation can be decomposed in a variety of different ways. Consider the relation SECOND (S#, STATUS, CITY) with functional dependencies (FDs).

SECOND.S# → SECOND.CITY
SECOND.CITY → SECOND.STATUS

And therefore, by transitivity,

SECOND.S# → SECOND.STATUS

The representation of SECOND relation is

Fig: Functional dependencies in relations S, P, SP

The diagram above shows that the update problems encountered with SECOND can be overcome by replacing it by its decomposition into the two 3NF projections

SC (S#, CITY) and CS (CITY, STATUS)

Let this decomposition be A.

An alternative decomposition, B, is

SC (S#, CITY) and SS (S#, STATUS)

Decomposition B is also nonloss, and the two projections are again in BCNF. But decomposition B is less satisfactory than decomposition A.

For example, in B it is still not possible to insert the fact that a particular city has a particular status value unless some supplier is located in that city. The explanation is as follows:

In decomposition A the two projections are independent of each other, in the sense that updates can be made to either one without regard for the other; joining them afterwards will not violate the FD constraints on SECOND.

In decomposition B, updates to either of the two projections must be monitored to ensure that the FD SECOND.CITY → SECOND.STATUS is not violated. Thus projections SC and SS are not independent of each other.

A relation that cannot be decomposed into independent components is said to be atomic.
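The lack of independence in decomposition B can be seen on a small data set. The following Python sketch is illustrative only: the supplier rows and the helper names satisfies_fd and natural_join are hypothetical. Each projection satisfies its own dependency, yet the join violates CITY → STATUS.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

def natural_join(left, right):
    """Join two lists of dict rows on their common attribute names."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined

# Decomposition B: SC (S#, CITY) and SS (S#, STATUS), hypothetical rows.
sc = [{"S#": "S1", "CITY": "London"}, {"S#": "S2", "CITY": "London"}]
ss = [{"S#": "S1", "STATUS": 20}, {"S#": "S2", "STATUS": 30}]

ok_sc = satisfies_fd(sc, ["S#"], ["CITY"])       # S# -> CITY holds in SC
ok_ss = satisfies_fd(ss, ["S#"], ["STATUS"])     # S# -> STATUS holds in SS
second = natural_join(sc, ss)                    # rebuild SECOND
ok_joined = satisfies_fd(second, ["CITY"], ["STATUS"])
print(ok_sc, ok_ss, ok_joined)
```

Both updates look legal when each projection is checked in isolation; only the joined relation reveals that London has been given two different status values, which is why updates to SC and SS would have to be monitored together.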

Questions:

1. What is embedded SQL?
2. Define QBE.
3. Explain operations involving cursors and not involving cursors.
4. What do you mean by dynamic statements?
5. Explain retrieval operations of QBE.
6. Explain update operations of QBE.
7. Explain built-in functions of QBE.
8. Define normalization.
9. What are the various forms of normalization?
10. What do you mean by the QBE dictionary?
11. Explain first, second and third normal forms.
12. Explain relations with more than one candidate key [BCNF].
13. What do you mean by good and bad decomposition?
14. What are QBE aggregate functions?
15. What is functional dependency?

STUDY MATERIAL

Course: B.Com CA
Subject: Database management system
Semester: III

Unit: Four

Unit IV Syllabus

Hierarchical approach: IMS data structure. Physical database, database description, hierarchical sequence. External level of IMS: logical databases, the program communication block. IMS data manipulation: defining the program communication block; DL/I examples.

Books for Reference:

An Introduction to Database Systems - C. J. Date

Database System Concepts - Abraham Silberschatz, Henry F. Korth, S. Sudarshan

Principles of Database Systems - Jeffrey D. Ullman

IMS data structure(Information Management System)

A physical database (PDB) is an ordered set, the elements of which consist of all occurrences of one type of physical database record (PDBR). A PDBR occurrence in turn consists of a hierarchical arrangement of fixed-length segment occurrences, and a segment occurrence consists of a set of associated fixed-length field occurrences.

As an example, we consider a PDB that contains information about the internal education system of a large industrial company. The hierarchical structure of this PDB, that is, the PDBR type, is shown here:

COURSE (Course#, Title, Description)
    PREREQ (Course#, Title)
    OFFERING (Date, Location, Format)
        TEACHER (Emp#, Name)
        STUDENT (Emp#, Name, Grade)

Fig: PDBR type for the education database.

In this example we are assuming that the company maintains an education department whose function is to run a number of training courses. Each course is offered at a number of different locations within the company. The PDB contains details both of offerings already given and of offerings scheduled to be given in the future. The details are as follows:

For each course: course number (unique), course title, course description, details of prerequisite courses if any, and details of all offerings.

For each prerequisite course for a given course: course number and title.

For each offering of a given course: date, location, format, details of all teachers and details of all students.

For each teacher of a given offering: employee number and name.

For each student of a given offering: employee number, name and grade.

In the PDBR structure shown, we have five types of segments:

COURSE, PREREQ, OFFERING, TEACHER and STUDENT, each one consisting of the field types indicated.

COURSE is the root segment type and the others are dependent segment types. Each dependent has a parent; for example, the parent of TEACHER is OFFERING. Similarly, each parent has at least one child; for example, COURSE has two children. For one occurrence of any given segment type there may be any number of occurrences of each of its child segment types.

COURSE    M23  Dynamics  …
    PREREQ    M19  Calculus
    PREREQ    M16  Trigonometry
    OFFERING  750106  Oslo    F2
    OFFERING  751104  Dublin  F3
    OFFERING  730813  Madrid  F3
        TEACHER  421633  Sharp, R.
        STUDENT  761620  Tallis, T.   B
        STUDENT  183009  Gibbons, O.  A
        STUDENT  102141  Byrd, W.     B

Fig: Sample PDBR occurrence for the education database.

The database Description

Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler language macro statements. Once written, the DBD is assembled and the object form is stored in a system library, from which it may be extracted when required by the IMS control program. The following is the DBD for the education database.

1  DBD    NAME=EDUCPDBD
2  SEGM   NAME=COURSE,BYTES=256
3  FIELD  NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD  NAME=TITLE,BYTES=33,START=4
5  FIELD  NAME=DESCRIPN,BYTES=220,START=37
6  SEGM   NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD  NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD  NAME=TITLE,BYTES=33,START=4
9  SEGM   NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD  NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD  NAME=LOCATION,BYTES=12,START=7
12 FIELD  NAME=FORMAT,BYTES=2,START=19
13 SEGM   NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD  NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD  NAME=NAME,BYTES=18,START=7
16 SEGM   NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD  NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD  NAME=NAME,BYTES=18,START=7
19 FIELD  NAME=GRADE,BYTES=1,START=25

FIG: DBD for the education PDB.

Explanation

Statement 1: Assigns the name EDUCPDBD ("education physical database description") to the DBD. All names in IMS are limited to a maximum length of eight characters.

Statement 2: Defines the root segment type, with the name COURSE and a total length of 256 bytes.

Statements 3-5: Define the field types that make up COURSE. Each is given a name, a length in bytes, and a start position within the segment. The first field, COURSE#, is defined to be the sequence field for the segment, so the PDBR occurrences will be sequenced in ascending course number order.

Statement 6: Defines PREREQ as a 36-byte segment that is dependent on COURSE.

Statements 7-8: Define the fields of PREREQ.

Statement 9: Defines OFFERING as a child of COURSE.

Statements 10-12: Define the fields of OFFERING. DATE is defined as the sequence field for OFFERING. The specification M (multiple) means that twin OFFERING occurrences may contain the same date value.

Statements 13-15: Define the TEACHER segment and its fields.

Statements 16-19: Define the STUDENT segment and its fields.

The sequence of statements in the DBD is significant. Specifically, SEGM statements must appear in the sequence that reflects the hierarchical structure, and each SEGM statement must be immediately followed by the appropriate FIELD statements.
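To make the BYTES and START values concrete, the following Python sketch treats the COURSE segment as a fixed-length string and slices it using the values from statements 3-5. It is only an illustration of fixed-length field layout, not of how IMS itself stores data; the helper names check_layout and unpack_segment and the sample segment contents are assumptions.

```python
# (start, length) pairs copied from DBD statements 3-5; START is 1-based.
COURSE_FIELDS = {
    "COURSE#": (1, 3),
    "TITLE": (4, 33),
    "DESCRIPN": (37, 220),
}
COURSE_BYTES = 256  # from statement 2

def check_layout(fields, segment_bytes):
    """Return True if the fields do not overlap and fit in the segment."""
    next_free = 1
    for start, length in sorted(fields.values()):
        if start < next_free or start + length - 1 > segment_bytes:
            return False
        next_free = start + length
    return True

def unpack_segment(segment, fields):
    """Slice a fixed-length segment string into named fields."""
    return {name: segment[start - 1:start - 1 + length].rstrip()
            for name, (start, length) in fields.items()}

# A hypothetical COURSE occurrence, padded with blanks to 256 bytes.
raw = "M23" + "Dynamics".ljust(33) + "".ljust(220)
layout_ok = check_layout(COURSE_FIELDS, COURSE_BYTES)
course = unpack_segment(raw, COURSE_FIELDS)
print(layout_ok, course["COURSE#"], course["TITLE"])
```

The same check applied to another segment's FIELD statements is a quick way to confirm that its start positions and lengths add up to the BYTES value given on its SEGM statement.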

Hierarchical Sequence

The concept of hierarchical sequence within a database is a very important one in IMS. It is defined as follows:

For each segment occurrence, we define the “hierarchical sequence key value” to consist of the sequence field value for that segment, prefixed with the type code for that segment, prefixed with the hierarchical sequence key value of its parent, if any. For example, the hierarchical sequence key value for the STUDENT occurrence for “Byrd,W.” is

1M2337308135102141

Here 1 is the type code for COURSE, M23 the course#, 3 is the type code of OFFERING, 730813 is the DATE of OFFERING, 5 is the type code of STUDENT, 102141 is the EMP# of STUDENT.

The hierarchical sequence for an IMS database is then the sequence of segment occurrences defined by ascending values of the hierarchical sequence key. This notion is important because IMS databases are stored in hierarchical sequence.
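The construction of the hierarchical sequence key can be sketched in a few lines of Python. The type codes 1, 3 and 5 are the ones quoted in the text; the codes 2 (PREREQ) and 4 (TEACHER) are assumed here from the order of the SEGM statements in the DBD, and the function name hierarchical_key is made up for illustration.

```python
# Type codes: 1, 3 and 5 come from the text; 2 and 4 are assumptions
# based on the order of the SEGM statements in the DBD.
TYPE_CODES = {"COURSE": "1", "PREREQ": "2", "OFFERING": "3",
              "TEACHER": "4", "STUDENT": "5"}

def hierarchical_key(path):
    """path: list of (segment type, sequence field value), root first."""
    return "".join(TYPE_CODES[seg] + value for seg, value in path)

course = [("COURSE", "M23")]
prereq = course + [("PREREQ", "M19")]
offering = course + [("OFFERING", "730813")]
byrd = offering + [("STUDENT", "102141")]

byrd_key = hierarchical_key(byrd)

# Sorting by the key yields the hierarchical sequence: a parent comes
# before its dependents, and PREREQ occurrences before OFFERING ones.
ordered = sorted([byrd, offering, prereq, course], key=hierarchical_key)
print(byrd_key)
print([p[-1][0] for p in ordered])
```

Because every sequence field has a fixed length, plain string comparison of the keys is enough to reproduce the ordering in this sketch.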

External level of IMS

Logical databases: In the IMS architecture the user's external view is defined as a subset of the corresponding physical database. An LDB (logical database) is an ordered set, the elements of which consist of all occurrences of one type of LDBR (logical database record). An LDBR type is a hierarchical arrangement of segment types, and is derived from the corresponding PDBR hierarchy in accordance with the following rules.

Any segment type of the PDBR hierarchy, together with all its dependents, can be omitted from the LDBR hierarchy.

The fields of an LDBR segment type can be a subset of those of the corresponding PDBR segment type, and can be rearranged within that LDBR segment type.

Example:

COURSE (Course#, Title, Description)
    OFFERING (Date, Location, Format)
        STUDENT (Emp#, Name, Grade)

Fig: Sample LDBR type for the education database.

Sensitive Segments:

The segments of the PDB that are included in the LDB are said to be sensitive segments. In the above example COURSE, OFFERING and STUDENT are sensitive segments. The user of this LDB will not be aware of the existence of any other segments.

For example, the DL/I "get next" operation, which in general is used for sequential retrieval, will simply skip over any segments that are not sensitive for the user. If the user deletes a sensitive segment, however, all children of that segment are deleted too, whether or not they are sensitive. So a user should not be given the authority to delete a segment if doing so would also delete hidden segments.

The sensitive-segment concept also protects the user from certain kinds of growth in the PDB: a new segment type can be added without affecting the user, provided that the addition does not affect any existing parent-child relationship.

The sensitive-segment concept also provides a degree of control over data security, inasmuch as users can be prevented from accessing particular segment types by the omission of those segments from the LDB.

Sensitive fields

Sensitive fields are those fields of the PDB that are included in the LDB. Every sensitive field must be contained within a sensitive segment. In general, a given LDB may include or exclude any combination of fields from the PDB, except that if the program intends to insert new occurrences of a given segment type, then it must be "sensitive to" the sequence field for that segment type.

Field sensitivity, like segment sensitivity, protects the user from certain types of growth in the database and provides a simple level of data security.

The program communication block (PCB)

Each LDB is defined by a PCB. The PCB includes the specification of the mapping between the LDB and the corresponding PDB. Like a DBD (database description), a PCB is written using special System/370 Assembler language macro statements. These statements constitute the "external DDL" for IMS. The set of all PCBs for a given user forms that user's program specification block (PSB); the object form of the PSB is stored in a system library, from which it may be extracted when required by the IMS control program.

Example:

1  PCB     TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2  SENSEG  NAME=COURSE,PROCOPT=G
3  SENSEG  NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4  SENSEG  NAME=STUDENT,PARENT=OFFERING,PROCOPT=G

Fig: PCB for the LDB

Explanation

Statement 1: Specifies that this is a database PCB, that the underlying DBD is named EDUCPDBD, and that the key feedback area is 15 bytes long.

Key feedback: When the user accesses an LDB, the corresponding PCB is held in storage and acts as a communication area between the user's program and IMS. One of the fields in the PCB is the key feedback area. When the user retrieves a segment from the LDB, IMS not only fetches the requested segment but also places a "fully concatenated key" into the key feedback area. The fully concatenated key consists of the concatenation of the sequence field values of all segments in the hierarchical path from the root down to the retrieved segment.

For example, if the user retrieves the STUDENT occurrence for Byrd, W., IMS will place the value M23730813102141 in the key feedback area. The fully concatenated key of a segment is not quite the same as the hierarchical sequence key, because it does not include segment type code information.

Statement 2: Specifies the first sensitive segment in the LDB. The name of the sensitive segment must be the same as the name assigned to the segment in the DBD.

The PROCOPT ("processing options") entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other options are I ("insert"), R ("replace") and D ("delete").

Statement 3: Defines the next sensitive segment in the LDB.

Statement 4: Defines the last sensitive segment. In our example statements 3 and 4 are very similar to statement 2. The PROCOPT entry is the same for each of the three sensitive segments; in such a situation we may specify PROCOPT in the PCB statement instead of in each SENSEG statement.

If PROCOPT=K is specified in the SENSEG statement for OFFERING, the user may largely ignore the presence of OFFERINGs in the hierarchy. The structure seen by the user is then as follows.

COURSE (Course#, Title, Description)
    STUDENT (Emp#, Name, Grade)

Fig: Effect of specifying PROCOPT=K for OFFERING

The main difference is that when a STUDENT occurrence is retrieved, the fully concatenated key in the key feedback area will include the date value from the parent OFFERING.

The LDB defined by the PCB above is sensitive to all fields in segments COURSE, OFFERING and STUDENT of the underlying PDB. Suppose we wish to exclude the LOCATION field of the OFFERING segment from the LDB while remaining sensitive to all other fields. The SENSEG statement for OFFERING is then followed by SENFLD statements as shown here:

SENFLD  NAME=FORMAT,START=1
SENFLD  NAME=DATE,START=3

These statements specify the fields to be included in the LDB segment and their start position within that segment. If no SENFLD statement is given for a particular SENSEG statement, then by default that segment is taken to be identical to the underlying PDB segment.
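The two derivation rules for an LDB (omit a segment together with all its dependents; keep a subset of fields, possibly rearranged) can be sketched in Python as follows. This is an illustration of the rules only, not of any IMS interface; the names ldbr_segments and project_fields and the sample OFFERING values are assumptions.

```python
# Parent of each segment type in the education PDBR hierarchy.
PDBR_PARENTS = {"COURSE": None, "PREREQ": "COURSE", "OFFERING": "COURSE",
                "TEACHER": "OFFERING", "STUDENT": "OFFERING"}

def ldbr_segments(sensitive):
    """Keep a segment only if it and all of its ancestors are sensitive."""
    visible = []
    for seg in PDBR_PARENTS:
        node = seg
        while node is not None and node in sensitive:
            node = PDBR_PARENTS[node]
        if node is None:          # walked up to the root without a gap
            visible.append(seg)
    return visible

def project_fields(segment, sensitive_fields):
    """Return only the sensitive fields, in the order the LDB lists them."""
    return {name: segment[name] for name in sensitive_fields}

ldb = ldbr_segments({"COURSE", "OFFERING", "STUDENT"})
broken = ldbr_segments({"COURSE", "STUDENT"})   # OFFERING omitted

offering = {"DATE": "730813", "LOCATION": "Madrid", "FORMAT": "F3"}
ldb_offering = project_fields(offering, ["FORMAT", "DATE"])
print(ldb, broken, ldb_offering)
```

The second call shows why STUDENT cannot be kept when OFFERING is left out entirely: omitting a segment type removes all of its dependents from the LDBR hierarchy as well.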

IMS Data Manipulation

Defining the Program Communication Block (PCB)

The IMS data manipulation language (DL/I) is invoked from the host language (PL/I) by means of ordinary subroutine calls. When an application program is operating on a particular logical database (LDB), the PCB for that LDB is kept in storage to serve as a communication area between the program and IMS; in fact, when the program calls DL/I, it has to quote the storage address of the appropriate PCB to identify to DL/I which LDB it is to operate on.

The PCB address is supplied to the program by IMS when the program is first entered. What actually happens is this. When a database application is to be run, IMS is given control first. IMS determines which PSB and DBD(s) are required, fetches them from their respective libraries and loads them into storage. IMS then fetches the application program and gives it control, passing it the PCB addresses as parameters.

In order for the application program to be able to access the information in the PCB for a particular LDB, it must contain a definition of that PCB.

DLITPLI: PROCEDURE (COSPCB_ADDR) OPTIONS (MAIN);
   ...
   DECLARE 1 COSPCB BASED (COSPCB_ADDR),
             2 DBDNAME   CHARACTER(8),
             2 SEGLEVEL  CHARACTER(2),
             2 STATUS    CHARACTER(2),
             2 PROCOPT   CHARACTER(4),
             2 RESERVED  FIXED BINARY(31),
             2 SEGNAME   CHARACTER(8),
             2 KEYFBLEN  FIXED BINARY(31),
             2 #SENSEGS  FIXED BINARY(31),
             2 KEYFBAREA CHARACTER(15);

Fig A: Example of program entry and PCB definition (PL/I).

Explanation:

The PROCEDURE statement (labeled DLITPLI) is the program entry point. The expression in parentheses following the keyword PROCEDURE represents the parameters to be passed to the program by IMS; here it consists of the pointer giving the address of the PCB. The rest of Fig A consists of a DECLARE statement that defines a structure to represent the single PCB used in the application.

The field DBDNAME contains the name of the underlying DBD throughout the execution of the program.

The SEGLEVEL field is set after each DL/I operation to contain the segment level number of the segment just accessed.

The STATUS field is the most important field in the PCB. After each DL/I call, a two-character value is placed in this field to indicate the success or otherwise of the requested operation. A blank value indicates that the operation was completed satisfactorily; any other value represents an exceptional or error condition.

The PROCOPT field contains the PROCOPT value as specified in the PCB statement when the PCB was originally defined.

The SEGNAME field contains the name of the segment last accessed.

The KEYFBLEN field contains the length of the fully concatenated key.

The #SENSEGS field contains a count of the number of sensitive segments.

The field KEYFBAREA is the key feedback area, which contains the fully concatenated key.

DL/I Examples

Get unique (GU)                   Direct retrieval
Get next (GN)                     Sequential retrieval
Get next within parent (GNP)      Sequential retrieval under current parent
Get hold (GHU, GHN, GHNP)         Allows subsequent DLET/REPL
Insert (ISRT)                     Add new segment occurrence
Delete (DLET)                     Delete existing segment occurrence
Replace (REPL)                    Replace existing segment occurrence

Tab: DL/I Operations

Direct retrieval: Get the first OFFERING occurrence where the location is Stockholm.

      GU    COURSE
            OFFERING (LOCATION='STOCKHOLM')

Sequential retrieval with an SSA: Get all STUDENT occurrences in the LDB, starting with the first student for the first offering in Stockholm.

      GU    COURSE
            OFFERING (LOCATION='STOCKHOLM')
            STUDENT
NS    GN    STUDENT
      GOTO  NS

Sequential retrieval with an SSA within a parent: Get all students for the offering on 13 August 1973 of course M23.

      GU    COURSE (COURSE#='M23')
            OFFERING (DATE='730813')
NP    GNP   STUDENT
      GOTO  NP

Segment occurrence insertion: Add a new STUDENT occurrence for the offering on 13 August 1973 of course M23.

      ISRT  COURSE (COURSE#='M23')
            OFFERING (DATE='730813')
            STUDENT

Segment deletion: Delete the offering of course M23 on 13 August 1973.

      GHU   COURSE (COURSE#='M23')
            OFFERING (DATE='730813')
      DLET

Segment replacement: Change the location of the 13 August 1973 offering of course M23 to Helsinki.

      GHU   COURSE (COURSE#='M23')
            OFFERING (DATE='730813')
      REPL
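The behaviour of the GU call followed by a GNP loop can be imitated with a small in-memory tree. The Python sketch below is a simplified model, not DL/I itself: it ignores status codes and the PCB, and it restricts GNP to the direct children of the current parent. The class Segment and the functions get_unique and get_next_within_parent are hypothetical names; the data comes from the sample PDBR occurrence.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    type: str
    fields: dict
    children: list = field(default_factory=list)

def get_unique(roots, ssas):
    """GU: follow one (type, condition) pair per level; first match wins."""
    def search(candidates, level):
        seg_type, cond = ssas[level]
        for seg in candidates:
            if seg.type != seg_type:
                continue
            if any(seg.fields.get(k) != v for k, v in cond.items()):
                continue
            if level == len(ssas) - 1:
                return seg
            found = search(seg.children, level + 1)
            if found is not None:
                return found
        return None
    return search(roots, 0)

def get_next_within_parent(parent, seg_type):
    """GNP loop: yield each child of the given type under the parent."""
    for child in parent.children:
        if child.type == seg_type:
            yield child

# Students are stored in ascending EMP# order (the sequence field).
madrid = Segment("OFFERING", {"DATE": "730813", "LOCATION": "MADRID"}, [
    Segment("TEACHER", {"EMP#": "421633", "NAME": "Sharp"}),
    Segment("STUDENT", {"EMP#": "102141", "NAME": "Byrd"}),
    Segment("STUDENT", {"EMP#": "183009", "NAME": "Gibbons"}),
    Segment("STUDENT", {"EMP#": "761620", "NAME": "Tallis"}),
])
m23 = Segment("COURSE", {"COURSE#": "M23"}, [madrid])

parent = get_unique([m23], [("COURSE", {"COURSE#": "M23"}),
                            ("OFFERING", {"DATE": "730813"})])
names = [s.fields["NAME"] for s in get_next_within_parent(parent, "STUDENT")]
print(names)
```

In the real calls the loop ends when the STATUS field of the PCB becomes non-blank; in this sketch the Python iterator simply runs out of children.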

Questions:

1. Explain physical and logical database of hierarchical approach with example.
2. Explain database description (DBD) with example.
3. Explain hierarchical sequence key value.
4. Explain program communication block (PCB).
5. Discuss DL/I operations with some examples.

STUDY MATERIAL

Course: B.Com CA
Subject: Database management system
Semester: III

Unit : Five

UNIT-V

Syllabus

Network approach: Architecture of DBTG system. DBTG data structure: The set construct, singular sets, sample schema, and the external level of DBTG-DBTG Data manipulation

Books for reference:

1: Database System Concepts - Abraham Silberschatz and Henry F. Korth

2: An Introduction to Database Systems - C. J. Date

Basic concepts:

A network database consists of a collection of records, which are connected to one another through links. A record is in many respects similar to an entity in the entity-relationship model. Each record is a collection of fields (attributes), each of which contains only one value. A link can be viewed as a restricted (binary) form of relationship in the sense of the E-R model.

To illustrate, consider a database representing a customer-account relationship in a banking system. There are two record types, customer and account. As we saw earlier, the customer record type can be defined, using Pascal-like notation, as follows:

type customer = record
    name: string;
    street: string;
    city: string;
end

The account record type can be defined as follows:

type account = record
    number: integer;
    balance: integer;
end

The sample database in Fig: 1 shows that Lowman has account 305, Camp has accounts 226 and 177, and Kahn has account 155.

customer (name, street, city)      account (number, balance)
Lowman  Square     Dallas          305  500
Camp    Downridge  Garland         226  336
                                   177  205
Kahn                               155  62

Fig: 1 Sample database

Data-structure diagrams: [Architecture of network model]

A data-structure diagram is the scheme representing the design of a network database. Such a diagram consists of two basic components:

* Boxes, which correspond to record types.
* Lines, which correspond to links.

A data-structure diagram serves the same purpose as an entity-relationship diagram; namely, it specifies the overall logical structure of the database. We shall consider the representation of binary, ternary etc. relationships of entity-relationship diagrams.

Binary relationship

The entity-relationship diagram for banking example is shown as follows:

(a) E-R diagram: entity sets customer (name, street, city) and account (number, balance), connected by the relationship CustAcct.

(b) Data-structure diagram: record types customer (name, street, city) and account (number, balance), connected by the link CustAcct.

FIG: 2

Diagram (a) above is the entity-relationship diagram. It consists of two entity sets, customer and account, which are related through a binary many-to-many relationship CustAcct with no descriptive attributes.

The diagram shows that a customer may have several accounts and that an account may belong to several different customers. The corresponding data-structure diagram is shown in figure (b). Here the record type customer corresponds to the entity set customer. It includes three fields: name, street and city.

Similarly, account is the record type corresponding to the entity set account, and it includes the fields number and balance. Since the CustAcct relationship in the E-R diagram above is many-to-many, we draw no arrows on the link CustAcct in the data-structure diagram. If the relationship CustAcct were one-to-many from customer to account, then the link CustAcct would have an arrow pointing to the customer record type. The representation is shown as follows:

[Figure: two data-structure diagrams, (a) and (b), each linking the record type customer (name, street, city) to the record type account (number, balance)]

FIG: 3

A sample database corresponding to the data-structure diagram is shown here. Since the relationship is many-to-many, we show that Katz has accounts 256 and 347 and that account 347 is owned by both Katz and Doner.

customer (name, street, city)       account (number, balance)
Beck    Maple     San Francisco     200  55
Katz    North     San Jose          256  100 000
                                    347  667
Doner   Sidehill  Palo Alto         301  10 533

Fig: 4 Sample database corresponding to the diagram of FIG: 3a

If the relationship is one-to-many from customer to account, a customer may have more than one account, as is the case with Camp, who owns both 226 and 177. An account, however, cannot belong to more than one customer, as is indeed observed in the sample database. Finally, a sample database corresponding to the data-structure diagram of FIG: 3b is shown in FIG: 1.

How is the E-R diagram of FIG: 2a transformed if the relationship has a descriptive attribute?

The transformation is more complicated, because a link cannot contain any data value. A new record type has to be created and links have to be established as follows.

Suppose, for example, that we take the E-R diagram of FIG: 2a and add the descriptive attribute date to the CustAcct relationship, to denote the last time the customer accessed the account. To transform the resulting E-R diagram to a data-structure diagram we need to:

1: Replace the entity sets customer and account with record types customer and account.
2: Create a new record type date with a single field to represent the date.
3: Create the following many-to-one links:
   * custdate, from the date record type to the customer record type
   * acctdate, from the date record type to the account record type

The DBTG CODASYL Model

The Database Task Group wrote the first database standard specification, called the CODASYL DBTG 1971 report, in the late 1960s. Since then a number of changes have been suggested to that report, the last official one in 1978. The rules laid down by the DBTG are discussed under the following headings:

Link restriction
DBTG sets
Repeating groups

Link Restriction

In the DBTG model, only many-to-one links can be used. Many-to-many links are disallowed in order to simplify the implementation. One-to-one links are represented using a many-to-one link. Let us illustrate this with the help of an example:

Consider a binary relationship that is either one-to-many or one-to-one. For our customer-account database, the data-structure diagrams for a one-to-many CustAcct relationship, without and with a descriptive attribute, are shown in the following figure:

[Figure: (a) customer (name, street, city) linked to account (number, balance); (b) customer (name, street, city) and account (number, balance), each linked to a date record type (date)]

Fig: Two data-structure diagrams

If the CustAcct relationship is many-to-many, then our transformation algorithm must be refined as follows. If the relationship has no descriptive attributes, then the following algorithm must be employed:

1: Replace the entity sets customer and account with record types customer and account.
2: Create a new dummy record type Rlink that may either have no fields or have a single field containing an externally defined unique identifier.
3: Create the following two many-to-one links:
   * custrlink, from the Rlink record type to the customer record type
   * acctlink, from the Rlink record type to the account record type
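The effect of the dummy record type can be sketched in Python. The function and variable names below are hypothetical, and the identifiers R1, R2, ... stand in for the "externally defined unique identifier" mentioned in step 2; the customer-account pairs are the ones stated in the text for Katz and Doner.

```python
def to_many_to_one(pairs):
    """Replace many-to-many (customer, account) pairs by Rlink records.

    Each Rlink record is owned by exactly one customer (custrlink) and
    exactly one account (acctlink), so both links are many-to-one.
    """
    custrlink, acctlink = {}, {}
    for i, (customer, account) in enumerate(pairs, start=1):
        rid = f"R{i}"              # assumed form of the unique identifier
        custrlink[rid] = customer
        acctlink[rid] = account
    return custrlink, acctlink

def accounts_of(customer, custrlink, acctlink):
    """Follow custrlink down to the Rlink records, then acctlink up."""
    return sorted(acctlink[r] for r, c in custrlink.items() if c == customer)

def owners_of(account, custrlink, acctlink):
    """Follow acctlink down to the Rlink records, then custrlink up."""
    return sorted(custrlink[r] for r, a in acctlink.items() if a == account)

pairs = [("Katz", 256), ("Katz", 347), ("Doner", 347)]
custrlink, acctlink = to_many_to_one(pairs)
katz_accounts = accounts_of("Katz", custrlink, acctlink)
owners_347 = owners_of(347, custrlink, acctlink)
print(katz_accounts, owners_347)
```

Every Rlink record has a single owner on each side, which is exactly the many-to-one restriction, while the many-to-many information is still recoverable by traversing both links.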

DBTG sets

Given that only many-to-one links can be used in the DBTG model, a data-structure diagram consisting of two record types that are linked together has the general form of the following figure:

[Figure: record type A linked to record type B]

Fig: A

The structure shown above is referred to in the DBTG model as a DBTG-set. The name of the set is usually chosen to be the same as the name of the link connecting the two record types.

In each such DBTG-set, the record type A is called the owner (or parent) of the set, and the record type B is called the member (or child) of the set. Each DBTG-set can have any number of set occurrences, that is, actual instances of linked records.

For example, the sample database of Fig: 1 contains three set occurrences corresponding to the DBTG-set of figure A.

Since many-to-many links are disallowed, each set occurrence has precisely one owner and zero or more member records. In addition, no member record of a set can participate in more than one occurrence of that set. A member record can, however, participate simultaneously in several set occurrences of different DBTG-sets.

To illustrate, consider a data-structure diagram with the record types customer, account and branch. There are two DBTG-sets:

* custacct, having customer as the owner of the DBTG-set and account as the member of the DBTG-set.
* brncacct, having branch as the owner of the DBTG-set and account as the member of the DBTG-set.

The set custacct may be defined as follows:

set name is custacct
owner is customer
member is account

The set brncacct may be defined similarly as

set name is brncacct
owner is branch
member is account

An instance of the database is shown here:

Five set occurrences are shown: three of set custacct and two of set brncacct.

1: Owner is customer record Lowman, with a single member account record 305.
2: Owner is customer record Camp, with two member account records 177 and 226.
3: Owner is customer record Kahn, with three member account records 155, 402 and 408.
4: Owner is branch record Hillside, with three member account records 305, 226 and 155.
5: Owner is branch record Valleyview, with three member account records 177, 402 and 408.

Note that an account record cannot appear in more than one set occurrence of one individual set type. This is because an account can belong to exactly one customer and can be associated with only one bank branch. An account can, however, appear in two set occurrences of different set types. For example, account 305 is a member of set occurrence 1 of type custacct and is also a member of set occurrence 4 of type brncacct.

The member records of a set occurrence may be ordered in a variety of ways.
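The rule that a member record belongs to at most one occurrence of a given set type, while it may belong to occurrences of several different set types, can be modelled in Python as follows. The class DBTGSet is a hypothetical illustration, not a DBTG interface; the owners and members are the five set occurrences listed above.

```python
class DBTGSet:
    """One DBTG set type; each member record has at most one owner."""

    def __init__(self, name, owner_type, member_type):
        self.name = name
        self.owner_type = owner_type
        self.member_type = member_type
        self.owner_of = {}            # member record -> owner record

    def connect(self, owner, member):
        if member in self.owner_of:
            raise ValueError(
                f"{member} is already in an occurrence of {self.name}")
        self.owner_of[member] = owner

    def members(self, owner):
        """Members of the set occurrence owned by the given record."""
        return sorted(m for m, o in self.owner_of.items() if o == owner)

custacct = DBTGSet("custacct", "customer", "account")
brncacct = DBTGSet("brncacct", "branch", "account")

for owner, accounts in [("Lowman", [305]), ("Camp", [177, 226]),
                        ("Kahn", [155, 402, 408])]:
    for number in accounts:
        custacct.connect(owner, number)

for owner, accounts in [("Hillside", [305, 226, 155]),
                        ("Valleyview", [177, 402, 408])]:
    for number in accounts:
        brncacct.connect(owner, number)

print(custacct.members("Camp"), brncacct.owner_of[305])
```

Account 305 appears in both dictionaries, once per set type, but an attempt to connect it to a second customer within custacct is rejected.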

Repeating Groups:

The DBTG model provides a mechanism for a field to have a set of values, rather than one single value.

For example, suppose that a customer has several addresses. In this case, the (street, city) pair of fields in the customer record type is defined as a repeating group, so the customer record for Kahn holds one (street, city) pair for each of Kahn's addresses.

The repeating-group construct is another way of representing the notion of weak entities in the E-R model. To illustrate, we split the entity set customer into two sets:

* customer, with descriptive attribute name
* address, with descriptive attributes street and city

The address entity set is a weak entity set, since it depends on the strong entity set customer.

DBTG data retrieval facility

The data manipulation language of the DBTG proposal consists of a number of commands that are embedded in a host language. The commands are explained as follows.

The find and get commands

The two most frequently used DBTG commands are:

* find, which locates a record in the database and sets the appropriate currency pointers
* get, which copies the record to which the current of run-unit points from the database to the appropriate program work-area template

Access of individual records:

The find command has a number of forms. There are two different find commands for locating individual records in the database. The simplest command has the form:

find any <record type> using <record-field>

Purpose: Locates a record of type <record type> whose <record-field> value is the same as the value of <record-field> in the <record type> template in the program work area. The following currency pointers are set to point to that record:

* the current of run-unit pointer
* the record-type currency pointer for <record type>
* for each set to which that record belongs, the appropriate set currency pointer

For example: Construct the DBTG query that prints the street address of Lowman.

Customer.name := "Lowman";
Find any customer using name;
Get customer;
Print (customer.street);

To locate further records with the same field value, the command is

Find duplicate <record type> using <record-field>

which locates the next record that matches the <record-field> value.

Example: Construct the DBTG-query that prints the names of all the customers who live in Dallas:

Customer.city := "Dallas";
Find any customer using city;
While DB-status = 0 do
Begin
    Get customer;
    Print (customer.name);
    Find duplicate customer using city;
End;
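The find any / find duplicate loop can be imitated in Python with a small in-memory stand-in. This is a rough sketch under stated assumptions: the class MiniDB, its method names, and the sample customers are hypothetical. The only convention taken from the text is that DB-status is 0 after a successful find.

```python
class MiniDB:
    """Toy stand-in for a DBTG run-unit: records are dicts held in a list."""

    def __init__(self, records):
        self.records = records
        self.current = None   # index playing the role of the current of run-unit
        self.db_status = 0    # 0 = last find succeeded, nonzero = no record found

    def _search(self, start, field, value):
        for i in range(start, len(self.records)):
            if self.records[i][field] == value:
                self.current, self.db_status = i, 0
                return
        self.db_status = 1

    def find_any(self, field, value):
        self._search(0, field, value)

    def find_duplicate(self, field, value):
        # Continue the search just after the current record.
        self._search(self.current + 1, field, value)

    def get(self):
        # Copy the current record into the caller's work area.
        return dict(self.records[self.current])


def names_in_city(db, city):
    names = []
    db.find_any("city", city)
    while db.db_status == 0:
        names.append(db.get()["name"])
        db.find_duplicate("city", city)
    return names


# Hypothetical sample data.
customers = [
    {"name": "Lowman", "city": "Dallas"},
    {"name": "Kahn", "city": "Austin"},
    {"name": "Camp", "city": "Dallas"},
]
dallas = names_in_city(MiniDB(customers), "Dallas")
```

The loop ends when a find fails and the status becomes nonzero, which is the same termination test as the While DB-status = 0 loop.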

Access of records within a set

Purpose: Locate records in a particular DBTG-set.

There are three different types of commands.

The basic find command is

Find first <record type> within <set-type>

This locates the first database record of type <record type> belonging to the current <set-type> occurrence.

To locate the other members of a set the command is

Find next <record-type> within <set-type>

This command finds the next element in the set <set-type>.

Example: Construct the DBTG query that prints the total balance of all accounts belonging to Lowman.

Sum := 0;
Customer.name := "Lowman";
Find any customer using name;
Find first account within custacct;
While DB-status = 0 do
Begin
    Get account;
    Sum := sum + account.balance;
    Find next account within custacct;
End;
Print (sum);
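The traversal of one set occurrence with find first and find next can be sketched in Python as below. The class SetOccurrence, its attributes, and the account balances are assumptions made for illustration; they are not part of the DBTG proposal.

```python
class SetOccurrence:
    """One occurrence of a DBTG-set: an owner plus an ordered list of members."""

    def __init__(self, owner, members):
        self.owner = owner
        self.members = members
        self.position = None  # set currency: index of the current member
        self.db_status = 0    # 0 = last find succeeded

    def _move_to(self, index):
        if index < len(self.members):
            self.position, self.db_status = index, 0
        else:
            self.db_status = 1  # end of the set occurrence reached

    def find_first(self):
        self._move_to(0)

    def find_next(self):
        self._move_to(self.position + 1)

    def get(self):
        return self.members[self.position]


def total_balance(custacct):
    total = 0
    custacct.find_first()
    while custacct.db_status == 0:
        total += custacct.get()["balance"]
        custacct.find_next()
    return total


# Hypothetical accounts owned by Lowman.
lowman_accounts = SetOccurrence(
    owner={"name": "Lowman"},
    members=[{"number": 1, "balance": 100}, {"number": 2, "balance": 250}],
)
```

An empty set occurrence makes find_first fail immediately, so the loop body never runs and the total stays 0.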

To find the owner of a particular DBTG-set, the command used is

Find owner within <set-type>

Example: Construct the DBTG-query that prints all the customers of the Hillside branch:

Branch.name := "Hillside";
Find any branch using name;
Find first account within brncacct;
While DB-status = 0 do
Begin
    Find owner within custacct;
    Get customer;
    Print (customer.name);
    Find next account within brncacct;
End;
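The idea behind find owner, that a member record can reach its owner in each set it belongs to, can be sketched in Python as follows. The data layout (an owners mapping stored on each account) and the account numbers are hypothetical simplifications chosen for this example.

```python
# Each account records its owner in every set type it belongs to.
# Hypothetical sample data.
accounts = [
    {"number": 1, "owners": {"custacct": "Lowman", "brncacct": "Hillside"}},
    {"number": 2, "owners": {"custacct": "Camp", "brncacct": "Valley View"}},
    {"number": 3, "owners": {"custacct": "Kahn", "brncacct": "Hillside"}},
]


def find_owner(account, set_type):
    """Analogue of 'find owner within <set-type>' for the given account."""
    return account["owners"][set_type]


def customers_of_branch(branch_name):
    names = []
    # Scanning all accounts and keeping those owned by the branch stands in
    # for 'find first' / 'find next' within that branch's brncacct occurrence.
    for account in accounts:
        if find_owner(account, "brncacct") == branch_name:
            names.append(find_owner(account, "custacct"))
    return names
```

For example, customers_of_branch("Hillside") walks the Hillside accounts and, for each one, follows the custacct link back to the owning customer.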

DBTG update facility

Creating new records

To create a new record of type <record type>, we insert the appropriate values in the corresponding <record type> template and then execute the command

Store <record type>

Example: Construct the DBTG query to add a new customer Jackson to the database.

Customer.name := "Jackson";
Customer.street := "Old road";
Customer.city := "Richardson";
Store customer;

Modifying an existing record

In order to modify an existing record of type <record type> we must find the record in the database, get that record into the memory, and then change the desired fields in the template of <record type>. Once this is accomplished, we reflect the changes to the record to which the currency pointer of <record type> points by executing the command:

Modify <record type>

The DBTG model requires that the find command executed prior to modifying a record have the additional clause "for update", so that the system is aware that the record is to be modified.

Example: Construct the DBTG program to change the street address of Kahn to North Loop.

Customer.name := "Kahn";
Find for update any customer using name;
Get customer;
Customer.street := "North Loop";
Modify customer;
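The find, get, change, modify sequence can be sketched in Python to show why the change reaches the database only when modify is executed. The function names and Kahn's initial street and city are hypothetical and serve only to illustrate the flow.

```python
# Hypothetical initial contents of the database.
database = [{"name": "Kahn", "street": "Main", "city": "Austin"}]


def find_for_update(name):
    """Return the index of the matching record (the currency pointer), or None."""
    for i, record in enumerate(database):
        if record["name"] == name:
            return i
    return None


def get(index):
    # Copy the record into the work-area template; editing the copy
    # does not touch the stored record.
    return dict(database[index])


def modify(index, template):
    # Write the template back to the record the currency pointer designates.
    database[index] = dict(template)


current = find_for_update("Kahn")
customer = get(current)
customer["street"] = "North Loop"
street_before_modify = database[current]["street"]  # stored record not yet changed
modify(current, customer)
```

Only the street field is assigned in the template, so the other fields are written back unchanged.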

Deleting a record

To delete an existing record of type <record type> we use the command:

Erase <record type>

Example: Construct the DBTG program to delete account 402 belonging to Kahn:

Finish := false;
Customer.name := "Kahn";
Find any customer using name;
Find for update first account within custacct;
While DB-status = 0 and not finish do
Begin
    Get account;
    If account.number = 402 then
    Begin
        Erase account;
        Finish := true;
    End
    Else
        Find for update next account within custacct;
End;

It is possible to delete an entire set occurrence by finding the owner of the set (say, a record of type <record type>) and executing

Erase all <record type>

This deletes the owner of the set as well as all of its members. If a member of the set is itself the owner of another set, the members of that set are also deleted. Thus, the erase all operation is recursive.

Example: Consider the DBTG program to delete customer "Camp" and all of her accounts.

Customer.name := "Camp";
Find for update any customer using name;
Erase all customer;
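The recursive behaviour of erase all can be sketched in Python as below. The record layout (an owns mapping from set type to member list) and the sample records are assumptions made for this illustration.

```python
def erase_all(record, database):
    """Delete `record` and, recursively, every member of every set it owns."""
    for members in record["owns"].values():
        for member in list(members):   # iterate over a copy of the member list
            erase_all(member, database)
    database.remove(record)


# Hypothetical sample records: Camp owns two accounts through custacct.
account_a = {"id": "A-1", "owns": {}}
account_b = {"id": "A-2", "owns": {}}
camp = {"id": "Camp", "owns": {"custacct": [account_a, account_b]}}
kahn = {"id": "Kahn", "owns": {"custacct": []}}
database = [camp, kahn, account_a, account_b]

erase_all(camp, database)
remaining = [r["id"] for r in database]
```

Members are erased before their owner, so the recursion naturally reaches sets owned by members as well.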

DBTG set-processing facility

This facility is mainly concerned with the mechanisms for inserting records into and removing records from a particular set occurrence.

The connect statement

To insert a new record of type <record type> into a particular occurrence of <set-type> we must first insert the record into the database, then set the currency pointers of <record type> and <set type> to point to the appropriate record and set occurrence.

The command used is

Connect <record type> to <set-type>

A new record can be inserted as follows:

1. Create a new record of type <record type>.
2. Find the appropriate owner of the set <set-type>.
3. Insert the new record into the set by executing the connect statement.

Example:

Create the DBTG query for creating new account 267, which belongs to Jackson:

Account.number := 267;
Account.balance := 0;
Store account;
Customer.name := "Jackson";
Find any customer using name;
Connect account to custacct;

The Disconnect statement

In order to remove a record of type <record type> from a set occurrence of <set-type>, we need to set the currency pointer of <record type> and <set-type> to point to the appropriate record and set occurrence. Once this is accomplished, the record can be removed from the set by executing

Disconnect <record-type> from <set-type>

Example: To remove account 177 from the set occurrence of type custacct:

Account.number := 177;
Find for update any account using number;
Get account;
Find owner within custacct;
Disconnect account from custacct;

The reconnect statement

In order to move a record of type <record-type> from one set occurrence to another set occurrence of type <set-type>, we need to find the appropriate record and the owner of the set occurrence to which the record is to be moved. Once this is done, we can move the record by executing:

Reconnect <record-type> to <set-type>

Example: Consider the DBTG program to move all accounts of Lowman that are currently at the Hillside branch to the Valley View branch.

Customer.name := "Lowman";
Find any customer using name;
Find first account within custacct;
While DB-status = 0 do
Begin
    Find owner within brncacct;
    Get branch;
    If branch.name = "Hillside" then
    Begin
        Branch.name := "Valley View";
        Find any branch using name;
        Reconnect account to brncacct;
    End;
    Find next account within custacct;
End;
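The effect of reconnect, moving a member from one occurrence of a set type to another occurrence of the same type, can be sketched in Python. The dictionaries, account numbers, and function names below are hypothetical stand-ins for the brncacct and custacct sets.

```python
# brncacct occurrences keyed by owner branch; values are member account numbers.
brncacct = {"Hillside": [101, 102], "Valley View": [103]}
# Owning customer of each account (stands in for the custacct set).
account_customer = {101: "Lowman", 102: "Kahn", 103: "Lowman"}


def reconnect(account, old_owner, new_owner, set_occurrences):
    """Move `account` from one occurrence of a set type to another."""
    set_occurrences[old_owner].remove(account)
    set_occurrences[new_owner].append(account)


def move_accounts(customer, source, target):
    # Iterate over a copy, because reconnect changes the source list.
    for account in list(brncacct[source]):
        if account_customer[account] == customer:
            reconnect(account, source, target, brncacct)


move_accounts("Lowman", "Hillside", "Valley View")
```

The account stays in the database throughout; only its membership in a brncacct occurrence changes.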

Set Insertion and Retention

When a new set is defined, we must specify how member records are to be inserted. In addition, we must specify the conditions under which a record must be retained in the set occurrence in which it was initially inserted.

Set Insertion

A newly created record of type <record type> of a set type <set-type> can be added to a set occurrence either explicitly (manually) or implicitly (automatically). This distinction is specified at set definition time via

Insertion is <insert mode>

where <insert mode> can take one of two forms.

Manual: The new record can be inserted into the set manually (explicitly) by executing

Connect <record type> to <set-type>

Automatic: The new record is inserted into the set automatically (implicitly) when it is created, that is, when we execute

Store <record type>

In either case, just prior to insertion, the <set-type> currency pointer must point to the set occurrence into which the insertion is to be made.
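The difference between the two insert modes can be sketched in Python. The functions store and connect below are hypothetical analogues of the DBTG commands, and plain lists stand in for the database and for the current set occurrence.

```python
def store(record, database, set_occurrence, insertion):
    """Analogue of 'store': always adds the record to the database; with
    automatic insertion it is also linked into the current set occurrence."""
    if insertion not in ("manual", "automatic"):
        raise ValueError(f"unknown insert mode: {insertion}")
    database.append(record)
    if insertion == "automatic":
        set_occurrence.append(record)


def connect(record, set_occurrence):
    """Analogue of 'connect': the explicit step needed when insertion is manual."""
    set_occurrence.append(record)


database, custacct = [], []
store({"number": 267}, database, custacct, "manual")
members_after_store = list(custacct)   # still empty under manual insertion
connect(database[-1], custacct)
```

With manual insertion a stored record exists in the database without belonging to any set occurrence until connect is executed.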

Set Retention

There are various restrictions on how and when a member record can be removed from a set occurrence into which it has been inserted previously. These restrictions are specified at set definition time via

Retention is <retention-mode>

where <retention-mode> can take one of three forms.

Fixed: Once a member record has been inserted into a particular set occurrence, it cannot be removed from that set. If retention is fixed, then to reconnect a record to another set, we must first erase that record, re-create it, and then insert it into the new set occurrence.

Mandatory: Once a member record has been inserted into a particular set occurrence, it can be reconnected only to another set occurrence of type <set-type>. It can neither be disconnected nor be reconnected to a set of another type.

Optional: No restrictions are placed on how and when a member record can be removed from a set occurrence. A member record can be reconnected, disconnected, and connected at will.

The decision as to which option to choose depends on the application.
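The three retention modes can be read as rules about which operations are permitted. The Python sketch below encodes those rules; the exception class and function names are hypothetical, and lists stand in for set occurrences of one set type.

```python
class RetentionError(Exception):
    """Raised when an operation violates the retention mode of the set."""


def disconnect(member, occurrence, retention):
    # Only optional retention lets a member leave the set type altogether.
    if retention != "optional":
        raise RetentionError(f"retention is {retention}: cannot disconnect")
    occurrence.remove(member)


def reconnect(member, old, new, retention):
    # Mandatory and optional retention allow a move to another occurrence
    # of the same set type; fixed retention does not.
    if retention not in ("mandatory", "optional"):
        raise RetentionError(f"retention is {retention}: cannot reconnect")
    old.remove(member)
    new.append(member)
```

Under fixed retention both functions refuse, which matches the rule that such a record must be erased and re-created to end up in another set occurrence.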

Deletion

When a record is deleted (erased) and that record is the owner of a set occurrence of type <set-type>, the best way of handling this deletion depends on the specification of the set retention of <set-type>.

If the retention status is optional, then the record will be deleted and every member of the set it owns will be disconnected. These records, however, are kept in the database.

If the retention status is fixed, then the record and all of its owned members will be deleted. This follows from the fact that the fixed status indicates that a member record cannot be removed from the set occurrence without being deleted.

If the retention status is mandatory, then the record cannot be erased. This is because the mandatory status indicates that a member record must belong to a set occurrence; it cannot be disconnected from that set.
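The three deletion cases can be sketched as one Python function. The function name, the use of plain lists for the database and the member list, and the string record names in the usage note are hypothetical choices for this illustration.

```python
def erase_owner(owner, members, retention, database):
    """Erase `owner` from `database` (a list) according to the retention mode
    of the set it owns; `members` is the list of its member records."""
    if retention == "optional":
        database.remove(owner)
        members.clear()              # members are disconnected but stay in the database
    elif retention == "fixed":
        for member in list(members):
            database.remove(member)  # members are deleted together with the owner
        members.clear()
        database.remove(owner)
    elif retention == "mandatory":
        raise ValueError("owner cannot be erased while retention is mandatory")
    else:
        raise ValueError(f"unknown retention mode: {retention}")
```

For example, with database ["cust", "a1", "a2"] and members ["a1", "a2"], optional retention leaves both accounts in the database, while fixed retention empties it.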

Set Ordering

The members of a set occurrence of <set-type> may be ordered in a variety of ways. A programmer specifies these orders when the set is defined, via

Order is <order-mode>

where <order-mode> can be one of the following.

First: When a new record is added to a set, it is inserted in the first position. Thus, the set is in reverse chronological order.

Last: When a new record is added to a set, it is inserted in the last position. Thus, the set is in chronological order.

Next: Suppose that the currency pointer of <set-type> points to record X. If X is a member type, then when a new record is added to the set, it is inserted in the position following X. If X is an owner type, then when a new record is added, it is inserted in the last position.

Prior: Suppose that the currency pointer of <set-type> points to record X. If X is a member type, then when a new record is added to the set, it is inserted in the position just prior to X. If X is an owner type, then when a new record is added, it is inserted in the last position.

System default: When a new record is added to a set, it is inserted in an arbitrary position determined by the system.

Sorted: When a new record is added to a set, it is inserted in a position that ensures that the set will remain sorted. The sorting order is specified by a particular key value when a programmer defines the set. The programmer must specify whether members are ordered in ascending or descending order relative to that key.
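The insertion position implied by each order mode can be sketched in Python. The function below is a hypothetical illustration: a list stands in for the members of one set occurrence, current is the index of the member the set currency points to (None when it points to the owner), only ascending order is handled for the sorted mode, and the system default mode is left out because its position is system dependent.

```python
import bisect


def insert_member(members, new, mode, current=None, key=None):
    """Insert `new` into the list `members` according to an order mode."""
    if mode == "first":
        members.insert(0, new)
    elif mode == "last":
        members.append(new)
    elif mode == "next":
        # After the current member, or at the end if currency is on the owner.
        members.insert(len(members) if current is None else current + 1, new)
    elif mode == "prior":
        # Just before the current member, or at the end if currency is on the owner.
        members.insert(len(members) if current is None else current, new)
    elif mode == "sorted":
        keys = [key(m) for m in members]   # assumes members are already in ascending key order
        members.insert(bisect.bisect_right(keys, key(new)), new)
    else:
        raise ValueError(f"unsupported order mode: {mode}")
    return members
```

Repeated insertion with mode "first" yields reverse chronological order, and with mode "last" chronological order, as described above.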

REFER THE TEXT BOOK FOR FURTHER REFERENCE

Questions:

1. Explain the architecture of network model.
2. Write short notes on
a) Link restriction
b) DBTG Sets
c) Repeating Groups
3. Explain DBTG data retrieval facility.
4. Explain DBTG set-processing facility.
5. Explain DBTG update facility.
6. What is set insertion and retention?
