MANAGEMENT INFORMATION SYSTEMS A AND RELATIONAL...
Transcript of MANAGEMENT INFORMATION SYSTEMS A AND RELATIONAL...
MANAGEMENT INFORMATION SYSTEMS -
A COMPARISON OF THE NETWORK
AND RELATIONAL MODELS OF DATA
by
Charles Arthur Ziering, Jr.
B.S., Massachusetts Institute of Technology
(1973)
Submitted in Partial Fulfillment of the
Requirements for the Degree of
Master of Science
at the
Massachusetts Institute of Technology
June, 1975
/2
Signature of
Certified by
Accepted by
AUtnor -
Alfred P. Sloan School of ManiageenV May 9, 1975
Professor Stuart E. Madnick, Thesis Supervisor
-1-
Chairman, Departmental Committee on Graduate Students
MANAGEMENT INFORMATION SYSTEMS -
A COMPARISON OF THE NETWORK
AND RELATIONAL MODELS OF DATA
by
Charles Arthur Ziering, Jr.
Submitted to the Sloanpartial fulfillment ofMaster of Science.
School of Management Maythe requirements for the
9, 1975 inDegree of
ABSTRACT
This paper compares the twounderlying database management sproblems each is hoping to solvebackground. Five points of viewmanagement system are given, andin terms of the needs of each.calculus are introduced to allowmodels on the same level. Finalto handle distributed databases.
THESIS SUPERVISOR:
predominant models of dataystems. After explaining the, each model is discussed aswith respect to a databasethe two models are compared
A network algebra and networkfor comparison of the two
ly, a hybrid view is presented
Stuart E. Madnick
TITLE: Assistant Professor of Management
-2-
ACKNOWLEDGEMENTS
Many of the ideasof the relative meritsGrant N. Smith. Thisarguments.
As usual, the incitefulProfessor Stuart E. Madnickthat much of the credit fortation is due him.
presented here grewof the two models, p
forum was invaluable
comments of and diwere of immense helany success found i
out of "discussions"rimarily within clarifying the
scuss ip. In the
ons withsuspectpresen-
My knowledge and understanding of thecomes largely from my experience at MITROL,discussions there with Jeffrey P. Stamen.
When the crunch came totime, the help of my typist,appreciated.
network modelInc., and
get the manuscript prepared inNorma Robinson, was greatly
-3-
Table of Contents
Page
1. IntroductionThe Use of Computers . . . . . . . . . 7Current Problem Areas . . . . . . . . 10The Reason for the Problems . . . . . 13A Logical View of Data . . . . . . . . 15History . . . . . . . . . . . . . . . 17The Current Conflict . . . . . . . . . 20Presentation Strategy . . . . . . . . 23
2. The Network Model of DataIntroduction . . . . . . . . . . . . . 25Data versus Information . . . . . . . 25Entities, Attributes, and Keys . . . . 26Entity Classes . . . . . . . . . . . . 27Associations Between Entities . . . . 28Standard Terminology . . . . . . . . . 36Data Structure Diagrams . . . . . . . 36Trees and Networks . . . . . . . . . . 37Relation Naming . . . . . . . . . . . 40
3. The Relational Model of DataIntroduction . . . . . . . . . . . . . 43Goals . . . . . . . . . . . . . . . . 43The Model . . . . . . . . . . . . . . 46First Normal Form . . . . . . . . . . 47Operations of Relations . . . . . . . 50Second and Third Normal Form ..... 54Algebra and Calculus . . . . . . . . . 57
4. Viewpoints for ComparisonIntroduction . . . . . ........ 59Database Administrator/Designer . . . 60A Framework for Language Interface
Comparison . . . . . ........ 62Experienced User . . . ........ 66Casual User . . . . . ........ 68The System . . . . . . ........ 69Application Programs . . . . . . . . . 72
-4-
Table of Contents (cont'd)
Page
5. A Network AlgebraIntroduction . . . . . . . . . . . .Access Path Navigation . . . . . . .Extension to the Relational AlgebraExamples . . . . . . . . . . . . . .
6. A Network CalculusIntroduction . . . . . . . .The Extension . . . . . . .The Network Calculus . . . .Examples . . . . . . . . . .Reduction . . . . . . . . .Optimization and Efficiency
7. SummaryConclusions . . . . . . . .A Hybrid View . . . . . . .Further Research . . . . . .
FootnotesBibliography
-5-
. . . . . 80. . . 80
. .80. . . . . 84. . . . . 86
87
. . . . . 89
. . . . . 92
. . . . . 93
96100
List of Figures
1. company/employee data structure diagram . . . 362. teacher/student/pupil data structure diagram 383. Part/product structure data structure diagram 394. husband/wife data structure diagram . . . . . 395. part/product structure data structure diagram
with up and down names . . . . ... .. . . 426. Department/employee information in the
network and relational frameworks . . . . . 497. Company/employee data structure diagram . . . 558. Comparison framework . . . . . . . . . . . . 609. Language Comparison Hierarchary . . . . . . . 63
10. Current language comparison status . . . . . 6511. Part/Vendor/PO/Line-item data structure
diagram . . . . . . . . . . . . . . . .. . 6712. Part/Product structure data structure diagram 7013. Supplier/Project/supply data structure
diagarm . . . . . . . . . . . . . . . . . . 7414. Supplier/Part/Project/Supply/Worker Data
structure diagram . . . . . . . . . . . . . 7715. Supplier/Part/Project/Supply/Worker Data
structure diagram . . . . . . . . . . . . . 84
-6-
Chapter 1 Introduction
The Use of Comouters
During the last quarter
has increased dramatically.
had a strong impact in many
pation of even greater futur
with this growth, the nature
since its inception, and we
near future.
The first computer user
who needed greater numerical
was available at the time.
these first machines, not th
attracted these communities,
1950's, they remained the pr
used higher level language,
these computational needs; d
primitive by comparison with
In fact the name itself, FOR
TRANslator, further indicati
With the reduction in c
of a century, the use of computers
Few will deny that computers have
areas of our society, and antici-
e impact is widespread. Along
of computer usage has changed
can predict further change in the
s were scienti
mani pul ati on
The computatio
eir data handl
and, for the
imary users.
FORTRAN, is cl
ata handling i
currently avai'
TRAN, is short
ng the emphasis
ost of memory,
sts and engineers
capabilities than
nal ability of
ing capacity,
decade of the
The first widely
early oriented to
n FORTRAN is
able te
or FORm
hniques
la
and the develop-
ment of secondary storage devices, the data management
capabilities of computers came to be realized. With thi
the business community became interested in the possibil
-7-
s,
i ty
of using computers to handle the clerical a
operations. Many simple, redundant operati
formed cheaply and accurately on a computer
shifted from complex computations on little
arithmetic operations on vast amounts of da
(COmmon Business Oriented Language), the
level language to be introduced, reflected
computation to data structure needs.
Up to this point, computers were consi
computational tools. People began to reali
certain structured problem areas, computers
spects of
ons could
, and the
data to s
ta. COBOL
next major
this shift
their
be per-
emphasis
impl e
higher
from
dered only
ze that, in
could make
decisions.
could be *g
programmed
decision m
example of
Once
the natura
they could
Before thi
to occur i
were thoug
with them
Any problem having a strict, pro
iven to a computer, once that solu
, and the burden could be removed
aker. The area of inventory contr
such a structured decision being
computers were given decision maki
1 extension was to expand the scop
make into the area of unstructure
s could happen, a subtle but impor
n the view of computers. Previous
ht of as systems unto themselves;
was of secondary importance. Then
cedur
ti on
from
ol is
compu
ng re
e of
d pro
tant
ly, C
man' S
al sol uti on
had been
the human
a prime
ter ized.
sponsi bi 1
decisions
bl ems.
shift had
omputers
interact
it was realized
that the man
system which
and machine
combined the
together could
effectiveness
ewed as a
e man
-8-
i ties,
i on
(intuition and abil
and the efficiency
system or decision
handled. The compu
lation capabilities
natives in his sear
to extract as much
program that, leavi
of structure. The
types of questions
area of "decision s
blossom.
. Just as there
are used for, there
using them. As mer
were the first user
capability to fulfi
users to understanc
formidable beasts a
background. Along
business world was
puter interface.
arena, the user be
user, the interfac
ity to work with insufficient information
of the computer. Through this man/machin
making unit, unstructured problems could b
ter could provide data access and manipu-
to allow the man to explore more alter-
ch for a solution. The central idea was
structure as possible from a problem and
ng the man to handle the problem of lack
program lets the man ask "What if ..
and provides simulated outcomes. The
upport systems" is currently beginning to
has been an evolu
has been a paral
dti one
s T
ll t
and
~nd,
the
in p
As de
came
e had
h
t
s
a
c
h
tion in what computers
lel change in who is
above, scientists and engineers
his was due not only to the computer's
eir needs, but to the ability of these
use them. The first computers were
hus, to use them required a technical
ame lines, the transition into the
rt due to a "softening" of the com-
ision support systems entered the
igher level managers. To support this
to shift more toward the human side.
Much effort has been spent on natural language support,
-9-
graphics, and othe
interface more pal
user of computers
advances as well a
computer's side to
that the major use
casual user at hom
must be pushed eve
r techniqu
atable to
has evolve
s effort t
the human
r of the f
e. 1 For
n further
es which would make the computer
the human user. The use and
d as a result of technological
o move the interface from the
user's side. Codd has predicted
uture (1990's) will be the
this shift to occur, the interface
towards the human side.
Current Problem Areas
Though the impact of
examples can be found of 1
to be careful not to be bi
disaster stories, for it i
that these dismal results
the computer industry woul
has. How many businesses
failure were the likely ou
ters are easier to report
successful applications.
intent to avoid the dramat
that are most frequently e
computers
ess than
ased
s eas:
are t
d not
woul d
tcome
(and
With
ic, w
has been
desirable
y the prolif
to be led t
e norm. If
have enjoyed
continue to
The proble
o many more
his in mind,
great, many
results. One has
ration of
the conclusion
his were true,
the success it
e burned if
i is that disas-
ewsworthy) than
coupled with an
e shall explore the problems
xperienced and seek to find their
causes. An understanding of causes shou
for a solution.
ld help in the search
-10-
user s point of view, the
is of primary
finds is not that software
rather that
systems. G
much better
many cases,
systems, th
region for
ware develo
risk stock,
investment.
software de
there is
iven the
than ano
especial
e softwar
his estim
pment a r
only a p
Those c
vel opmen
likely to be hit th
can go a long way t
success, but an eve
The cost of so
the customer, but i
a close second. Th
from one of two cas
task cheaper than i
provide the ability
done. In the first
t
e
n
f
n
e
e
t
great va
same task
ther, and
ly in the
e custome
ate of li
isky busi
otentiall
ustomers
as a por
hardest.
reduce t
greater
tw
t
b
s;
c
to
ca
basis for comparison,
importance.
products
riabi 1 i
, one i
some m
curren
r must
kely su
ness, a
y high
who do
tfolio
The r
are
ty
mpl
ay
t a
The problem one
universally bad, but
in the success of
ementation may be
never work at all. In
rea of decision support
s
e
0
eputati
his variabilit
reduction coul
are is usually the
he current economic
enefit of using a c
either the compute
ould be done other
do things that cou
se, the cost/benefi
in the second the i
a wide confiden
This makes sof
t as with a hig
will warrant th
w substantial
n are the ones
on of a vendor
y in expected
d be hoped for.
second conce
environment
omputer can
r can perfor
ways, or it
ld not other
t analysis h
ssues are us
ce
t -
h
e
of
it
stem
m a
can
wise
as a
ua 1l
iS
be
y
too nebulous. In either case a cost reduction
-11-
is desirable.
software product
From an end qual ity of a
accept
ccess.
nd, ju
payoff
not vi
decisi
There are actually three costs associated with a software
product, d
Each of th
turns out,
magnitude
necessary
the increa
ment and o
accounted
way down. 3
a result o
a reductio
e v
e s
t
ov
to
se
pe
fo
f
n
elopment cost, maintenance cost
e has two sources, people and e
he cost of hardware has dropped
er the last two decades, while
utilize that hardware has cont
.2 According to Madnick only 3
ration cost of new application
r by the hardware, and that rat
Thus the typically high cost of
personnel costs than hardware c
of personnel requirements could
, and cost of use.
quipment. As it
two orders of
the software cost
inually been on
0% of the develop-
software is
io is still on the
software is more
osts, and therefore
have the greater
impact in terms of cost savings.
Businesses are typically dynamic, growing, evolving
entities, and thus their software needs change over time.
This being the case, software should be designed to allow for
easy refinement. In many cases this is not done, and cus-
tomers often find that starting from scratch is easier than
modifying an existing package, even for seemingly simple
changes. The requirement for flexibility is not only a
function of time varying needs, but also one of initial
specification inadequacies. Frequently a customer will not
fully understand his own needs until he has had a chance to
work with a system. If the system is built without
flexibility for change, then by the time he
-12-
real izes the
inadequacy of the initial specifications, it is too late.
One of the primary needs for software today is in the
area of information storage and retrieval. Many businesses
are coming to need database management systems to handle
vast amounts of data, and the sheer size of these databases
makes the problems discussed above even more pronounced.
Because many consider database management the major bottle-
neck in software development, this will be the subject of
this paper.
The Reason for the Problems
The problems presented above derive from the fact that,
given the tools (higher level languages) most commonly used
to build software packages, it is a substantial jump to
generate a finished system. Put another way, the difference
between the finished product and the basic materials is
quite large.
An analogy to housebuilding may cl
the above assertion. In the early days
grammer had available only the primitiv
machine with which to build programs.
the good old days of the pioneers when
represented the basic materials of a ho
there was a huge gap between the basic
finished product. The next step for th
-13-
arify the meaning of
of computers, a pro-
e instructions of the
This is comparable to
a stack of felled trees
use. In each case
materials and the
e programmer came
with the availability of
COBOL, PL/
to the use
each case,
was reduce
to the use
remain the
community.
argument i
"The Archi
One c
one point
of alterna
distance f
large,
A singl
etc.).
f precut
he gap f
d. While t
of prefabr
primary to
The reade
s highly re
tecture of
an think of
to another.
te routes i
rom the poi
here a
route
re
wi
many
11 1
These languages
boa
rom
rd
th
he ho
icate
ols o
r des
comme
Compl
this
If
s sma
nt of
al te
i kely
s in the build
e materials to
usebuilding in
d units, highe
f the majority
iring further
nded to refer
exity." 4
gap in terms
the distance i
11. If, on th
departure to
rnate routes f
consist of ma
afforded a base comparable
ing of a
the fin
dustry h
r level
of the
support
to Simon
house. In
ished product
as advanced
languages
software
of this
's article,
of driving a car from
s short, the number
e other hand, the
the destination is
rom which to choose.
ny legs or sub-routes.
A large
leads to the
ways.
gap from
problems
basic materials to finished system
of the last section in the following
* Obviously, the personnel costs depend on the time required
to build a system. The more primitive the tools, the
more time required. This can also lead to substantial
lead times necessary to complete a system.
-14-
h igher level languages (FORTRAN,
* The resulting design tends to be complex (many pieces
with many interconnections). The complexity frequently
eliminates the possibility of understanding the entire
design at once. One must concentrate on only a portion
of the design at a time, making it easy to overlook
ramifications of decisions concerning that portion.
Thus bugs are likely to creep into the design.
* The pro
design
choice
system
liferation o
the personal
is made will
(the variabi
f potential paths makes any
choice of the implementer.
greatly affect the quality
lity of success issue).
resulting
How this
of the
* The fact that the path chos
ususally limits understandi
implementor. If he leaves,
be all but impossible. Thi
a frequent phenomenon.
en
ng
a
Si
personal in nature
the design to the
ange to the system can
why starting over is
A Logical View of Data
The arguments presented above lead directly to the con-
clusion that the development of higher level tools is the
solution. If the gap from basic materials to finished pro-
duct is made small enough, the implementor could produce
straightforward, easy to modify, working systems much faster
-15-
and at lower cost than ever before possible. To this end, we
require an understanding of the common functions of all
systems. Having restricted ourselves here to the area of
database management systems, the framework necessary is a
logical view of data or a theory of information. If we can
develop a model of real world information, then that model
would provide the basis or platform from which we could build
information systems.
Over the years, numerous models of data have appeared.
Some differ only in level of sophistication, one being a
subset of another, while others represent different approaches.
The next section shall highlight a few of the major models of
the past and present. The subject of this paper will be a
comparison of two of these views. It is important to emphasize
here that comparisons of this nature can rarely be definitive,
for the issues involved are highly subjective. One can merely
make arguments for or against a model with respect to assump-
tions of what it is that makes a model good or bad. In this
regard, one should always take care to explicitly state the
assumptions underlying the argument, for it is likely that
the assumptions will be the actual basis for agreement or
disagreement.
-16-
The approach taken here to motivate the need for a
logical theory of data is not the traditional one. We
discussed the framework in terms of providing tools to
user. The more typical strategy is to present a model
interface wh
In most case
functions fr
new storage
efficiency o
tampering wi
programs).
tant, but if
any robust m
despite the
tends to ind
ich
s t
om
tec
f a
th
Thi
it
ode
exi
i ca
can clearly
he stated de
the physical
hniques are
system, the
the logical
s benefit of
were the so
1 would suff
stence of se
te that more
separate development
sire is to isolate the
storage functions. T
developed which could
changes can be made w
functions (i.e., user
an interface is extre
le motivation, then ju
ice. The current acti
veral good canditate i
is at stake. The mod
efforts.
logical
hen, as
improve the
i thout
application
mely impor-
st about
ve research,
nterfaces,
el chosen
should not only
structuring, but
which to develop
provide this protecti
should represent the
information systems.
on from the physical
best platform from
History
The first big step on the road to a logical view of data
came with the concept of a simple sequential file. System
designers recognized that a typical data storage pattern
involved storing many items of the same type. Thus the term
record came into being to represent a single item, and a file
-17-
have
the
as an
was a collection of these
poi n
si ze
of t
of a
and
comm
were
t of v
with
h
U
a
i ew,
no me
e user. B
file of re
se a secon
nds. Comma
provided,
the
ani n
eing
cord
dary
nds
and
s
to
th
cord w
the i
ble to
was qu
torage
read
e gap
product was significantly
Soon demands became t
simple sequential file, an
developed. It was recogni
access records by some mea
record number, so indexing
to handle the need effici
tication, the underlying
of like records, remained
simple, has enjoyed great
that it is still the most
as simply
nterpreta
a s
ti on
tring of
was the
ual ly
spons
think about data storage
ite an advance over having
device with only its basi
and write logical records
from basic materials to fi
reduced.
oo complex to be
d random accessi
zed that a user
ningful name as
and hash-coding
ently. Even wi
model of data,
the same. Thi
success as att
commonly used
fixed
i bi l i ty
in terms
to format
c I/0
from a file
nished
handled by a
ng techniques were
would like to
opposed to a logical
techniques arrived
the added sophis-
single collection
model , though fairly
buted by the fact
del of the software
commun i ty.
To this point, the contents of
to the computer system. Soon ideas
standard strategies for interpreting
and relationships between them. The
information which would support the
-18-
records were meaningless
developed concerning
the contents of records
need for a theory of
ability to interpret the
From the computer'slike records.
contents of records was soon felt. Over the
models have appeared in the literature. The
most of these models fall into one of two ge
network or relational. One finds that this
delineates two camps, and there is an active
literature between them.
The CODASYL Data Base Task Group has be
advocating the network model. In 1962 they
developing, along with a standard model, a p
standard Data Definition Language (DDL) and
years quite a
undercutrrents
neral catergor
dichotomy clea
debate in the
few
of
i es
rly
en the leader in
began the task of
roposal for a
Data Manipulation
uage (DML)
Along th
. They are still actively working
e way, many successful systems have
toward that
been built
on the network model.
In 1969, E.F. Codd of IBM Research in San Jose began
advocating the relational model.5 Its underlying goals are
similar to those of the network model, but the approach is
different.
We have not explicitly mentioned the hierarchical view
of data, despite its importance as a basis for some widely
used data management facilities (such as IBM's IMS). The
hierarchical model can be viewed as a subset of the network
model and has been found to be inadequate in representing
many real world information structures. For these reasons,
we will not consider the hierarchical model, even though it
does represent substantial current usage.
-19-
Lang
end.
The Current Conflict
As mentioned above, the debate between networks and
relational advocates is currently quite active. This paper
will attempt to provide a framework for the comparison of the
two models, hopefully putting many of the current arguments
into perspective.
Because any comparison of this type is highly subjective,
there are many potential pitfalls to be avoided. A description
of several of these follows:
* Because there i
of each model,
one assumes to
clear resolutio
begins with are
of the results
paper attempts
First, we begin
bases of the tw
accuracy of thi
we do not penal
either model is
finds a way to
le universally accepted definition
a high degree of latitute in what
s n
the
b e the components
n
th
to
0
s
iz
f
r E
of each. There is
to this problem, and the assumptions one
ikely to have more impact on the accepta
an the process of reaching them. This
deal with this problem in two ways.
ith the author's reading of the accepted
models. The determination of the
reading is left to the reader. Second,
e either model with this reading. If
ound to be deficient, and the author
medy the problem without altering
nce
the
underlying structure of the
-20-
model, then a change is not
ruled out. Essentially we are giving
benefit of the doubt, and using the 1
above, hopefully, to make the compari
each model the
atitude mentioned
son more meaningful.
* Due to the greater
have been built on
that
mode
one
the
is
odel
greater numbe
an indication
inherently 1
tation, that m
enough time ha
other side of
targets for hi
easy to equate
with problems
This should be
implementation
accompanied by
age of the network model
it. A network advocate
r of
of i
eads
ight be a val
s not passed
the coin, the
s attacks on
problems in
in the model
avoided at all
as a focal poi
a clear analys
systems buil
ts greater s
to a more ef
id point of
to determine
relational
the network
a particular
underlying t
t on
ui tab
fi cie
conte
that
advoc
model
impl
he im
st, and any
for compari
separating
, more systems
could argue
the network
ility. If
nt implemen-
ntion, but
. From the
ate has more
. It is
ementation
plementation.
use of an
son must be
the problems
due to the model from those due to the implementation.
By and large, this paper avoids specific implementations
as bases for comparison of the two models.
-21-
* The re
mati ca
ati onal
theory,
model revolves
whereas it is
around an
difficult
abstract mathe-
to find precisely
stated mathematical
This lea
issues h
and the
basis fo
meaningl
to some
terms of
abstract
ds th
ave n
rel at
r the
ess.
exten
the
e netwo
ot been
ional a
networ
The re
t, by a
rel atio
mathemati cal
descriptions
rk advocate
considered
dvocate to a
k model exis
lational arg
proposal fo
nal model.
background,
tion will be non-rigorous. It is
a stronger mathematical bent will
ther. It might be suggested, as
for the absence of a mathematical
work theory may be due to a lack
possible result of successful imp
of the network model.
) argue that
i the relatio
gue that no m
s. These arg
ment will be
a network th
he author cla
and hence th
hoped that s
pursue this
an aside, tha
formulation
of felt need,
lementations.
* The network mode
implementation s
the model actual
relational model
1 of data lead
trategy. It i
ly grew out of
, on the other
directly to a
quite possible
this strategy.
hand, the most
simple
that
For the
straight-
forward impl
ineffi cient.
the network
ementat
This
model,
ion
coul
but
strategy woul
d be offered
it is not an
d be horribly
as an advantage of
important point
-22-
cal
del
ti cal
are
red,
n
nimal
en ta -
practi
nal mo
athema
uments
counte
eory i
ims mi
e pres
omeone
course
t the
of the
the
with
fur-
reason
net-
because the implementation is done, hopefully, only once.
It is interesting to note that this same fact has been
used by relational advocates to argue that the network
model is not a logical model of information but rather
is a technique for organizing the storage of physical
records. This paper will present the network model as a
logical model of real world information, and the fact
that it leads to straightforward implementation will not
be counted against it.
Presentation Strategy
The reader may detect a slant in this paper in favor of
the network view. Though the author does not deny a certain
tendency in that direction, the purpose of the paper is to
present a more objective framework for comparison. In each
point of comparison, an attempt was made to present the
arguments as fairly and with as little bias as possible. The
goal has been to find the better of the two models, if possible,
and not to merely defend one model on an emotional basis. The
slant is intended, in part, to overcome any subjective endoc-
trination the reader may have absorbed from the relational
literature.
The paper can be divided into four parts.
-23-
1. Background (Chapter 2 and 3)
The network and relational models will be
to the extent necessary for our purposes.
general reading and the author's modifica
be discussed.
2. Comparison (Chapter 4)
Five points of view with respect to a dat
management system will be presented with
of the two models in terms of each.
3. Mathematical development of the Network T
(Chapter 5 and 6)
To put the network model into a more theo
framework than it has had in the past, a n
algebra and a network calculus will be de
(a la Codd).
4. "Summary (Chapter 7)
A recap of the major results will be pres
supportability of each view by the other
be discussed. The potential benefits of
view will also be given. Finally, topics
further research which have been generat
be itemized.
presented
Both the
tions will
abase
a comparison
heory
reti cal
etwork
vel oped
en ted.
model
a hybr
for
ed here
The
will
id
will
-24-
Chapter 2 The Network Model of Data
Introduction
As mentioned before, many database management systems
have been built on the network model of data. Looking,
however, at some of these individual systems, it is not clear
that there is a single model underlying them all. Each
designer seems to have his own interpretation or variation of
the central theme of the network model, and this makes difficult
the task of explaininq "the" network model, The model set
forth here represents the author's understanding of the central
concepts of the network model, and thus may be at odds with
other expositions. The bulk of the ideas and terms are those
presented by Bachman and his articles should be referenced
for further clarifications.6 ,7 ,8
Data versus Information
Before plunging into a discussion of the terminology and
concepts of the network model, it may be profitable to clarify
the difference between the terms data and information. A
data element is simply a value; for example, the number 27.
By itself it has no meaning. Information is interpreted data,
or an association. If 27 is the number of people in a par-
ticular class, then the data element 27 has taken on meaning,
and hence informational content, by virtue of its interpretation.
-25-
The function of a database
storage of vast amounts of
is rather the storage of i
systems revolve around the
the data they store.
management system is not just the
data as the name might imply; it
nformation. Database management
concept of an interpretation of
Entities, Attributes, and Keys
All information exists with respect to objects
which shall be termed entities. Any concrete or ab
object can be an entity. For example, a person, a
color, and an idea are all entities. In a database
system, entities are the things about which we wish
information. We shall call the information we wish
or things
stract
state, a
management
to store
to store
about an entity its
attribute will repr
data. For instance
Tom Smith is 32. T
entity Tom Smith.
the name attribute
The name attri
special role. We u
the entity itself.
attributes, and the name
esent the interpretation
, we may wish to remember
hen the value 32 is the a
In fact, "Tom Smith" itse
of a particular person en
bute, as we have been usi
sed the name Tom Smith tc
Whenever an attribute o
s we gi
of the
that t
ge attr
lf is a
tity.
ng it,
actual
r a grou
attributes uniquely identifies an en
group of attributes is called a key.
in a database management system must
tity, that attrib
Every entity re
have a key, even
ute or
presented
if it is
-26-
ted
of
f
bute,
ve each
associ a
he age
ibute o
n attri
plays a
ly mean
p of
composed of all the attributes of the entity. Likewise, any
entity may have many keys. If this is the case, one is
arbitrarily chosen and deemed the primary key for use by
the system.
Entity Classes
As mentioned above, any concrete or abstract object can
be an entity. Two examples are the person Tom Smith and the
color red. Even though these are both entities, it is useful
to distinguish between them because the sets of attributes
describing them are different. Tom Smith, for example, would
not have the attribute wavelength, just as the color red would
not have a social security number. We thus define an entity
class to consist of all entities described by the same set of
attributes. There might be a person entity class with the
attributes name, age, and social security number, and a color
entity class with attributes color name and wavelength. We
thus have a scheme to classify all entities according to the
attributes that describe them. In some cases, the classification
of entities may be a matter of j
stances. For example, men and w
the same attributes. In one sys
to have one entity class for bot
sex. In another, it may be pref
class for each. A school
udgement, depend
omen are largely
tem it may be mo
h, along with an
erable to define
database would most
-27-
i ng
de
re
at
an
on circum-
cribed by
rofi table
ri bute
entity
likely combine men
and women into a single class whereas a medical database may
benefit by splitting them. The trade-offs of the application
will determine the appropriate dichotomy.
Associations Between Entities
To this point we have described a database to consist of
a group of unrelated entity classes. Each entity class is
separate and maintains the values of the various attributes
associated with the entities in a class. The one additional
type of information we might want to record is the association
between entities. Some typical examples are the associations
between:
* husband and wife
* teacher and student
* father and son
* company and employee
* assembly part and component part
Thus, instead of associations between an entity and an inter-
preted value, each side of the association is an entity. It
is the handling of this type of association that clearly
differentiates the network from the relational model. Any
other characteristics of one model, if beneficial, could easily
be incorporated into the other. It is, therefore, in this
-28-
area only that valid comp
Before describing th
associations between enti
istics of such associatio
number of entities which
arisons can be made.
e network approach to managing
ties,-let us explore the character-
ns. As a first step, what is the
may participate in an association?
In each o
but this
wish to a
associati
f the
may no
ssoci a
on may
examples given the
t always be the ca
te a mother, fathe
be of any degree,
re were only tw
se. For exampl
r, and child.
or there may b
entities,
we may
us an
any number
taking part in it.
In some cases, a
the role of an entity
understood based on i
the father/son associ
association will come
an entity is then not
associated in the fat
is the father and whi
we will assign role n
association. We can
relating two or more
respect to the associ
various entities come
we have not ruled out
being the same, as in
s in the company/empl
in an association wi
ts entity class. In
ation, two or more en
from the same entity
clear. If Tom Smith
her/son association,
ch i
ames
thus
enti
atio
fro
the
the
oyee a
11 be
other
ti ties
class
and J
it is
ssoci at
clearly
cases
in an
. The
ohn Smi
uncl ear
ion,
as in
rol
th
wh
of
re
ch
s the son. Thus, for each association,
to each part played in the
think of a single association as
ties, each in a certain role with
n, with no restriction that the
m distinct entity classes. Note that
possibility of two or more roles
case of associations between brothers.
-29-
of entities
When an association nas two identical roles, they are said to
be symmetric; if A is the brother of B, then B is the
brother of A.
We have been treating an association as a single instance
of an association between specific entities. It now becomes
beneficial to differentiate between an instance of an
association and as association class comprising all instances
of a particular association type. For example, all instances
of associations between fathers and sons comprise the father/
son association class, just as all instances of person entities
made up the pers
Now, consid
moment, how many
same entity in o
association, ass
only appear in o
same is true of
association may
or a single woma
a single man may
association, and
association is t
company/employee
on enti ty
er ing
i nsta
ne rol
uming
ne ins
a wome
have a
n enti
take
the s
hus te
assoc
work for one company)
only
n ce s
class.
binary association classes for the
of an association class may have the
e. In the example of a husband/wife
a monogamous society, a single man may
tance of the association. Clearly the
n. Thus only one instance of an
single man entity in the husband role
ty in the wife role. Put another way,
part in only one instance of the
ame is true for a single woman. This
rmed one to one (1:1). Now, in the
iation (assuming a person may only
one company may occur in many instances
of the association, but each employee
-30-
may occur in only one
instance.
Moving on
This is ca
to the teac
lied a one to
her/student ca
many (1:n)
se, one te
associat
acher may
ma ny
asso
and
is a
may
students
ci ati on.
hence be f
many to m
be one to
The netwo
and
Like
ound
any
one,
rk m
henc
wi se
in
(m-,n
one
odel
provides directly f
associations only.
sounds. Take first
binary associations
higher degree. For
two binary associat
the same purpose.
one can derive the
As far as many to m
can be handled via
to many association
classes to be (m:n)
be
on
any
as
to
of
or the
This
the a
can a
found in
e student
instances
soci ati on.
many, or m
data, as e
storage of
is actually
ssociation
lways repre
example, in
ions, father/
By operations
associations
a
t
ny associati
he creation
is defined b
associated,
the
chi
on
maany
can
of
Th
any
xpre
one
not
degr
instances
have many
the associ
us binary
to many.
ssed in th
to many b
as restri
ee issue.
sent an
fathe
ld and
these
associ
of
teac
atio
asso
the
hers
n. This
ciations
iterature,
ry
ve as it
set of
on of
r/mother/child case,
mother/child, serve
two associations,
between mothers and fathers.
ons are
of a ne
etween
and the
operations to be defined shortly. This
entity class provides the needed many
capability. Unfortunately, there is n
to many association to handle the one
sense, a one to one association can be
-31-
concerned, they too
w entity class. A one
each of the two entity
new entity class via
"cross reference"
o many association
way to restrict a one
o one case. In one
considered a subset of
ion
h a
e l
ina
cti
A
ati
the o
But t
one r
omi ss
data
purpo
vi des
the b
ne
he
est
ion
mod
ses
on
as i
to many case, where many in
object would be to have the
riction, not the user. The
of one to one associations
el and hence its inclusion w
of this paper. In summary,
e to one and one to many bin
c capabilities necessary to
world associations.
handl
model
arti f
many
Many may
e many to
, and the
icial
to ma
concern. Con
is not associ
part of that
reference ent
as an artific
given explici
associations,
Thus the cros
enti
pupi
ties,
1/cla
consider the omi
many relations
use of a cross
ol utio
assoc
There i
tions whi
sider the s
ated with a
teacher's d
ity class,
ial means t
tly only a
may take o
s reference
would
ss enti
represen
ty is as
tudent/t
"whol e"
ay, the
which to
o handle
facility
n a real
entity
t the pu
sociated
ion of a d
result in
ference en
i rec
an
tity
however, a way
may help to ove
eacher example.
teacher, but r
class period.
this point has
many to many a
to handle one
logical meanin
between teacher
pil/class entit
with one stude
facility to
ncomplete
class to be an
to look at
rcome this
Any student
ather with a
Thus the cross
been described
ssociations
to many
a unto itself.
y.
nt
and
Ea
and
student
ch
one
teacher. Any student may be associated
-32-
with many pupil/class
each instance is one.
system enforce the one to
author contends that the
results in an incomplete
ill be assumed for the
a data model which pro-
ary associations possesses
handle all meaningful real
Any time there is a many to
many association, there is an underlying implication that one
of the entities on either side cannot be associated with
"all of" any entity on the other side.
The one to many binary association can be thought of as
directed because there is an asymmetry between the two roles
of the association. One role can have any entity only once in
all instances of the association class, whereas the other role
may have the same entity appearing in many instances. To
differentiate the two roles, the role in which an entity may
appear only once is called the member of the association and
the other is called the owner_. For example, the company
would play the role of owner with respect to the association
with member employees. These terms can apply equally to the
entity classes related by the associations (if distinct), or
to individual entities which constitute an instance of the
association. The reader should take care that conotations
of these terms do not interfere with the analysis of database
structure. The fact that companies do not "own" individuals
does not mean that their database counterparts should not.
Owner and member are purely technical terms to differentiate
the two roles in a one to many binary association.
The term key has a special meaning in connection with
associations in the network model, and we must revise our
former definition slightly. As used previously, keys served
-33-
entities, as may any teacher.
to un
used
and i
quely
n the
enti f
identify e
network mo
ies members
association is ident
member entity class,
following associatio
to direct the search
through associations
of entity classes, t
those which must be
classes. Some syste
fashion by creating
accessible entity cl
the root entity.
the root entity an
associations. In
keys in a path uni
the path. Thus th
within an associat
to them previously
entity class. Thi
cause problems. I
two distinct paths
tifications of an
Th
d
th
qu
is
io
f
f
if
a
ns
t
ho
ac
ms
a
as
us
fo
ntities in an entity class. As com
del, a key is an attribute which or
of an association. Thus each
ied with a key or attribute in its
nd all database accesses are made b
to the desired entity, using the k
This access pattern in terms of pa
ends to differentiate between two t
se which are directly accessible an
cessed as members of other entity
manage these two types in the same
special root entity. Each directly
s is the member of an association w
all accesses in the database begin
llow a path through entity classes
is scheme, it i
ely identifies
revised defini
n class serves
uniquely identi
combination of
a particular en
s clear
the enti
tion of
the same
fying en
function
tity can
then there can be redun
entity. Sometimes
that the set o
ty at the end
keys as identi
function ascr
tities in an
s, however, do
be accessed v
dant unique id
mon ly
de rs
y
eys
ths
ypes
d
i th
at
via
f
of
fiers
i bed
es
ia
en-
one or more of the keys
is allowed to be non-unique to handle this problem.
-34-
This
also can lead to
non-unique path.
ordering members
problems in creating and deleting using
The author feels that the functions of
and uniquely identifying entities should
clearly divided
association" be
the members of
entity in an en
of the set of a
It is sugges
used to mean t
the association
tity class shou
ttributes and a
ted tha
he attr
Uniq
Id then
ssoci at
uniqueness. This set will be called
or primary key of the entity class.
"key" has developed in the literatur
but the appropriate qualification or
the function intended. The reader s
own mind the two uses of the term an
which is being used at any point.
What are the basic functions re
associations? Given any owner entit
association, we should be able to fi
most systems, when requiring all mem
"piped" facility to present the memb
of the key of the association. Give
this facility allows us to retrieve
the next
associati
member.
on, we
e
h
d
t the term "ke
ibute(s) which
ue identificat
be expressed
ions which gua
the identifica
Unfortunately,
to handle bot
context should
ould keep clea
be careful to
y of an
orders
ion of an
in terms
rantee
tion key
the word
h issues,
make clear
r in his
understand
quired to manipulate
y with respect to an
nd any or all members. In
bers, there is usually a
ers one at a time in order
n the owner and a member,
the previous member or
Also, given any member entity of an
should be able to find the owner.
-35-
the
be
Standard Terminology
In the network model of a database
there is a file for each entity class.
file corresponds to one entity of the cl
of entities are stored in fields on each
associations between entity classes are
between the owner and member files.
management system,
Each record in the
ass. The attributes
record. The
directed relations
Data Structure Diagrams
A data structure diagram is a technique for depicting
graphically the files and relational structure of a database
The technique is quite simple, using only two symbols, a box
and an arrow. A box represents a file or entity class. There
is one box for each file in the database. An arrow represents
a relation between two files, pointing from the owner file to
the member file. With these simple tools we can represent
any complex database structure.
For example, the company/employee relation discussed
above would be depicted:
company
emplee
Figure
-36-
The relation (arrow) is frequently thought of as a chain.
Given any company we can go down the chain to find a partic-
ular employee; we can find the previous or next employee in
the chain for this company or we can find the company.
It is common practice, when possible, to place owner
files vertically above their corresponding member files.
Trees and Networks
Using data structure diagrams, we can explore the impli-
cations of various database
In the
one owner e
member role
most relati
structure.
to describe
teacher/stu
to make the
This is not
A data
by more tha
illustrate
several cla
mpany
ty orn
/employee
only one
structures.
database,
relation i
each st
n which
When Prh file in A da e
uden
it
t had
played
only
the
i s memehr of at
on, that database is said to have a s
Simple tree structures are frequentl
a real world information structure.
dent database into a tree structure,
assumption that each student has onl
commonly the case.
base has a network structure when any
n one relation. The teacher/student
a
s
netw
ses
ork.
us ual
model this information
adequate.
During
ly each
hool
a di
imple tree
y inadequate
To fit the
we would have
y one teacher.
file is
database
day, a student has
fferent teacher.
owned
wi 11
To
structure, the following diagram
-37-
teacher student
Figure 2
Each record in the pupil file represents a class period for
a particular
be class nam
student, we
associated t
teacher, we
associated s
All the
in the struc
possible to
in a manufac
Each part is
(it is an as
assem
files
that
a fil
student. Thus the
e, period number, a
could chain through
eacher (owner) reco
could chain through
tudent.
structure discusse
ture of the files.
store structure in
turing environment
either a detail pa
sembly). The parts
ies themse
hould not
ructure.
to record
1 ves
have
All
each
As the
to be cr
we really
instance
e
fields of the pupil
nd class room. Given
the pupil
rd.
the
Li kewi
pupil
d to thi
In a sp
the data
there mi
rt or is
of an a
product
ated or
need bes
of a use
file, finding
se, g
file,
s point
ecial ne
itself.
ght exis
made up
ssembly
structur
del eted
ide the
of one
iv en
fin
an
din
has e
twork
For
t a p
of o
might
e var
to ma
part
part
file might
any
the
the
xi sted
it is
example
art file
ther par-
be
ies, new
i n ta in
file is
in the
assembly of another.
file is related
Each record in this product strucutre
to two part records,
-38-
the assembly part and
the component part. Each part record may have many components
or be used in many assemblies. The following file structure
brings this together.
ProductStructure
Figure 3
Given any assembly part, we can travel down the component
chain. At each component record we find its owner via the
other chain (the component part record) and check it for
components. This double chaining between two files allows
us to store a network structure in the data itself.
To diagram the one to one relation, which we said would
be considered part of the network model for our purposes, we
need use only an arc instead of an arrow (directed arc). The
husband/wife database might appear as below.
Men Women
Figure 4
-39-
The use of an arrow to indicate the direction of a one
to many relation has been, on occasion, a source of confusion,
for frequently it will be thought of as a pointer. This
results in the misconception that it is only possible to go
from an owner to its members via the relation. In the dis-
cussion of basic functions we noted that one requirement was
to be able to find the owner of an association given any
member. Thus is a sense we can traverse the arrow in either
direction; the arrow merely distinguishes the two sides of the
rel ati on.
Relation Naming
In most of the diagrams given, we have not explicitly
labelled
the namin
crucial.
need two
case wher
classes,
the ambig
rel ati on
Cons
Figure 3
the relations. This does not imp
g of relations is irrelevant. I
In fact, as we shall see, a rel
names, depending on the directio
e there is a single relation con
the need for explicit naming is
uity question does not arise. I
naming can be ignored.
ider the part/product structure
. Given any part record, we may
ly, however,
i many cases
ation may act
n travelled.
necting two e
reduced, for
n a strict hi
databa
want
that
it is
ual ly
In the
ntity
then
erarchy,
cted
all
components
part is us
used to ma
ed in. Whi
that
task
part or
we want
all the assemblies that
to accomplish will
-40-
determine the appropriate path to fol low.
request to the syst
denotes its functio
would be "component
find all members of
components used in
relation to find th
let us consider tha
record and we want
corresponds to. Un
that connects an as
structure records.
a product struc
relation. The
meaningful for
be meaningful w
is that a relat
looking "down"
be used when lo
tu
th
a
he
io
t
ok
em, each path should be
n. In this example, tw
" and "assembly". For
the component relation
the part, or all member
e assemblies the part i
t we have a certain pro
to find the assembly pa
fortunately,
sembly
Thus,
re we mu
rust of
rel ati on
n looked
n should
he relat
ing "up"
To communicate the
given a name which
likely names
ny part we would
to get the
of the assembly
used in. Now
luct structure
't number it
it is the compon
part to its
to get the
st find the
this argumen
when looked
at from the
have two na
ion from the
the relatio
componel
assembly
owner vi
t is tha
at from
other.
mes, one
owner s
n from t
pa
a t
t a
on
Th
to
ide
he
ent relation
product
rt number of
he component
name which is
e side may not
e implication
be used when
, and one to
member side.
The naming for the part/product structure case might be:
-41-
Chapter 3 The Relational Model of Data
Introduction
The relational
theory of relations.
has been the effort
in San Jose, Califor
prolific) proponent.
given here is almost
Codd.9 ,10 ,1 1 ,12
model
The
of ma
nia,
The
excl
of data i
appl i cati
ny, but E.
is by far
descripti
usively ba
s
on
F.
th
on
se
based on mathematical set
to information theory
Codd of IBM research
e leading (and most
of the relational model
d on the work of
The term "relation", unfortuna
different meaning in the relational
model. This is a continual source
discuss or compare the two theories
is so firmly and centrally rooted i
simply be careful to interpret it i
which it applies.
ly, has an entirely
odel than in the network
confusion in attempts to
but each use of the term
each theory that one must
terms of the theory to
Goals
The relational theory was developed to overcome three
problems noted in many systems. Codd firmly asserts that
relational model is not a response to the network model, b
the first major paper discussess the problems in terms of
shortcomings of the network model, among others. 13
Whether these three problems were inherent in the network
-43-
the
ut
the
theory or merely outcomes of particular implementations is a
subject for later discussion. The three problems are those of
ordering dependence, indexing dependence, and access path
dependence.
Ordering dependence means that files and relations in
the network model are stored in a particular order of the
keys (usually the alphanumeric collating sequence). For
instance, given a company record and requesting all the
employees of the company, a system would usually be designed
to present them in a predetermined sequence, for instance
by employee number. It
would be written expecti
hus, if the stored order
be the case, the applica
properly. The reason re
rk based systems is one
almost always request t
This has been observed
in system operations whi
ly if ordering exists in
pplication programs shou
require when requesting
was argued th
ng
we
tio
1 at
of
he
in
c h
a
ld
that order
re changed
n program
ions are o
efficiency
same order
practice.
can be per
relation.
specify th
members of a particula
at application
(depending on
over time as
would no longer
rdered in almost
First, the
for a particular
Second, there
formed more
As an aside,
e order they
ir relation.
If the order requested matches the system orderi
specification may be ignored; otherwise the syst
expected to reorder (sort) the members prior to
-44-
ng, the
em can be
presentation.
in
pro
it)
may
fun
all
use
rel
are
eff
per
order
grams
and t
wel 1
cti on
netwo
r will
ation.
certa
i ci ent
haps a
Thus ordering can represent a potential cos
possible consequences should be understood
To return to the mainstream of the discussi
dispensed with in the relational model so t
programs may not be designed which exhibit
dencies.
Indexin
implementati
Indexing is
record given
record given
a logically
the applicat
some systems
reference an
for its exis
the logical
indexing is
cannot be sa
Access
t savings but its
and planned for.
on, ordering is
hat application
ordering depen-
g dependence is purely an efficiency issue in
on and has no bearing on a logical view of data.
a technique for rapidly finding a particular
its key. The logical function is to find a
its key. Whether this is done via indexing or by
equivalent linear search should be irrelevant to
ion program. Unfortunately, as Codd points out,
such as IDS require the application program to
index b
tence an
network
not tran
id to su
y name
d use
model
sparen
pport
path dependence
network model, but its
generally expressed in
to look at the network
rather
t.14
nd any
to th
he log
is a
rami fi
the li
model,
cati
tera
as
than have th
Indexing has
impl ementati
application
cal network
ruly inheren
ns are not q
ure. There
logical mod
e system check
no place in
on in which
program
model.
t aspect of th
uite those
are two ways
el of real wor
e
ld
information, and as an implemen
argue against the network model
-45-
tation strategy. Those that
in terms of access path depen-
dence usually
they present
the data and
depend on whi
changed, the
Treating the
the problems
treat it as an
several possible
thus demonstrate
ch structuring i
program will no
network model as
are always dimin
imple men tation s
structurings in
that an applica
s chosen. If th
longer function
a logical view
ished if not alt
trategy. Thus
which to store
tion program will
e structure is
properly.
of information,
ogether removed.
If the logical view is followed, there is one and only
structure
ships betwA
constant,
switching
hand, the
database s
is to rema
which accurately reflec
!een entities. Thus if
so is the network model
possible structurings i
structure of the real N
tructure must be altere
iin an accurate represen
case, the only meaningful
ts the rea
the real w
, and the
s banished
orld chang
.d to refle
tation of
one, application
1 world relation-
orld structure is
concern over
. If, on the other
es, then the
ct the change if it
the world. In this
programs may, in
fact, require modifications. Thus the relational model
proposed as a scheme to maintain the invariance of appl
programs as the database evolves over time.
is
i cation
The Model
The relational model is based upon abstract set theory.
It assumes a collection of pools of values called domains.
For example the sets of all possible names or colors or ages
or part numbers are all domains. A relation on domains Dl,
-46-
one
D2, D3, .
whose first
from d
rel ati
model ;
o ma i n
on, t
each
, D n i
el emen
D2
hen
do
is
in
n-tuple corresponds
defined to be a set of n-tuple
comes from domain Di , second e
and n'" elem
the counterpa
corresponds t
to a record.
ent
rt
o a
0
n
Ea
comes from Dn.
f a file in th
attribute, an
ch domain is u
s, each of
lement comes
A
e network
d each
sual ly
given a name
is no longer
than once in
tinguish it f
active domain
appears in re
relation are
which uniquel
A candidate k
key are
is more
termed
super
than
the pr
identify so that the ordering of domains
necessary. If the same do
a relation, each is given
rom the others. At any po
refers to the subset of a
lations in the database.
distinct. A subset of dom
y identifies a tuple is ca
ey is non-redundant if non
fluous in uniquely identif
one non-redundant candidat
imary key of the relation.
domains in a relation
ma
a
in
d
Al
ai
11
e
yi
e
acts as the primary key
in appears more
role name to dis-
t in time, the term
omain which actually
1 tuples in a
ns in a relation
ed a candidate key.
of the domains in the
ng a tuple. If there
key, one is chosen and
If a domain or set of
in another
relation, it is called a foreign key.
First Normal Form
Up to this point we have made no restriction on the
domains in a relation. The elements of a domain may in fact
be relations themselves. For example, an employee may have
-47-
-
as a domain
relation on
and salary.
a non-simple
form if it i
Codd ha
his salary
the domain
Any domai
domain.
s free of
s defined
hi story.
date (of
which is
relation
on-simple
normaliza
This could itself be a
hiring), date (of termination),
itself a relation is termed
is said to be in first normal
domains.
tion process to reduce a
relation containing non-s
in first normal form. 1 5
domain a, then a relation
the domain a of each tupl
in A which is made up of
of s. Once A is created,
from R. Iterations of th
domains remain results in
Viewed in terms of the da
model, the attributes of
imple domains to a set of relations
If relation R has a non-simple
A is created. For each tuple s in
e r in relation R, a tuple t is put
the primary key of r and the domains
the non-simple domain a is removed
is process until no non-simple
a database in first normal form.
ta structure diagrams of the network
the primary key of any owner file are
merged with the attributes o
the relations are removed.
between a department and the
ment.
each of its member files
or example, consider a re
employees who work in the
, and
1 ati on
depart-
-48-
Network
Department
Employee
File Attributes
Department
Department # (key)Department manager #Department name
Employee
Employee # (key)Employee nameEmployee age
First NormalRelation Domains
Department
Department # (key)Department manager #Department name
Employee
Department # (key)Employee # (key)Employee nameEmployee age
Figure 6
-49-
Operations on Relations
Frequently it is necessary to perform operations on a
relation or a set of relations to extract the information
required by a query. These operations can be broken down into
those which operate on a single relation and those which
operate on more than one relation. Some are commonly found
in set theory and others are specifically devised for the
needs of a database system. The description of each below is
primarily taken from Codd. 1 6 , 1 7
I. Singl
1.
e Rela
Projec
This i
from a
(permu
tion o
betwee
a tupl
is the
of A.
genera
identi
tion Opera
ti on
s commonly
relation.
tation) of
f domains.
n 1 and th
e from R,
th domai
Each tupl
te R[A].
cal tuples
removal of
relation.
ti ons
e
t
n
e
an e
It
doma
If
deg
hen
of
of
emem
so
redundant
xtraction of certain domains
provides also for reordering
ins in a relation and a repeti-
A is a list of integers, each
ree of a relation R, and r is
r[A] is a tuple whose ith domain
r where j is the i th element
R is used, under the list A, to
ber that a relation may not have
projection may require the
tuples from the resulting
-50-
2. Restriction
This is a means of selectively removing tuples from
a relation based on a given test function. The
allowable test functions are the standard value
comparison tests <, <, =, /, >, >. A restriction on
R between domains A and B based on the test 0
can be expressed:
R[A 6 B] = {r:reR A (r[A] e r[B])}
Of course, A and B must be comparable types of value
i.e., both numbers or both character strings.
Multi-Relation Operations
1. Union Intersection Difference
These standard set operations apply
compatible relations, or relations
same domains. They are defined in
RJS {r:(rER V reS)}
R(\S {r:(rER A rcS)}
R - S E {r:(reR A rjS)}
2. Cartesian Product
This is also a standard set theory
R S _ {r s:(rER A sES)}
(r s means the concatenation of
r with tuple s)
only to union
defined on the
the standard way.
definition.
tuple
-51-
II.
s,
3. Join
If 6 is one of the
then the 6-join of
domains A (in R) an
R[A 6 B]S = {(rAs):
seS A (r[A] 6
The most common use
equals operation.
identical domains i
is removed, the res
It is clear from th
a basic operation b
of the cartesian pr
4. Division
This is an inverse
product since
(R )S) S =
Division is an extr
stand, and because
it is suggested
familiar with i
formal definiti
close as possib
Assume T is a r
two sets of dom
tha
t be
on,
le t
el at
ains
comparison operators <,<,=, ,>,>,
two relations R and S on the
d B (in s) is defined.
rER A
s[B])}
of a join occurs when 6 is the
This equi-join results in two
n the resulting relation. If one
ult is termed the natural join.
e definition that the join is not
ecause it is expressible in terms
oduct and restriction.
operation to the cartesian
R
emely complex operation to un
it will be of important use l
t the reader become thoroughl
fore proceeding. Before givi
we will describe it in terms
o the use it will serve later
ion which can be partitioned
such that the second set is
der-
ater,
y
ng a
as
into
union
-52-
compatible
seeking to
first set
tuples r s
appears in
T = R
where W ca
impossible
For the fo
"Suppose T
x under T
gT(x)Consider t
degree m b
domain ide
and let
is comp
example
then A
were a
domains
tupl e
and we
A
with a relation S. R
derive will have the
of domains of T. R wi
uch that, for all tupl
T. If R is thus deri
®7 S U W
n be thought of as the
to express any subset
rmal definition,
is a binary relation.
is defined by
= {y:(xy)cT}.
he question of dividin
y a relation S o
ntifying list (w
denote the domai
lementary
, if the
= (1,3,4)
binary re
A, A in
rER, we c
note that
to A and i
degree m of
. We treat
lation with
that order.
an speak of
this is a
a relation w
same domains a
11 consist of
es s of S, r s
ved, then
rema
of W
inder.
as X
It i
D S .
The image set of
g a relation
degree n. Let
thout repetition
-identifying lis
n ascending orde
R were 5 and A
the dividend R
the (possibly c
Accordingly, g
the image set g
subset of R[A].
A be a
s) for
t that
r. Fo
= (2,5
as if
ompoun
iven a
R(r[A]
"Providing R[A] and S[B] are
division of R on A by S on B
union-compatible,
is defined by
-53-
are
the
11
R of
r
it
d)
ny
),3
the
R[A+B]S =
Note that
even if S
Division
describab
product,
R[C+D]S =
{r[A]
when
is al
is not
le in
and di
R[C]
; rER A S[B]
R is empty,
so empty." 18
a primitive
terms of the
fference ope
- ((R[C] )
< g R(r[A])}.
R di vi ded by
operation
projection
rations.
S[D]) - R)
S is empty,
that it is
cartesian
- 19[C]
Second and Third Normal Form
A formal description of and motiva
third normal form can be found in Codd'
Normalization of the Relational Databas
"Normalized Data Base Structure: A Bri
We shall content ourselves here with a
and third normal form in terms of
tion for second and
s articles, "Further
e Model," and
ef Tutorial."20 ,21
description of second
the network model.
As described prev
work data structure di
normal form consisted
record with all of its
relations are removed,
normal form are produc
these inferred primary
primary key of the rel
The answer is that, if
the two files
iously, the process of
agram to a set of rela
of merging the primary
member records. If t
a set of (relational)
ed. We have yet to sp
keys of the owner fil
ation derived from the
the association or re
was a part of the primary key
reducing a net-
tions in first
key of any owner
hen the (network)
relations in first
ecify whether
e take part in the
member file.
lation between
of the member
-54-
file, then the merged primary key from the owner will become
part of the primary key of the relation derived from the
member file. Consider the company/employee network diagram.
Company
Employee
Figure 7
Assume company number is the primary key of the company file.
Now there are two possible cases for the primary key of the
employee file. If an employee number is unique (i.e.,
social security number is used), then the primary key of the
employee file is simply the employee number. If on the other
hand employee number is only unique within a company, the
employee number and the company/employee relation constitute
the primary key of the employee file. If this network
structure were reduced (normalized) to the relational model,
then, in the case of the universally unique employee number,
the primary key of the employee relation would only be the
employee number. If, on the other hand, the company/employee
relation was also part of the employee file primary key,
then the resulting employee relation would need both company
number and employee number to make up its primary key.
-55-
What does
form? The prob
normal form can
normalizing an
structure. The
than just the p
this have to do with second
lems to be overcome by both
be viewed as the result of
accu
err
rima
company location a
to the employee fi
be in either secon
If the cause
form is the same a
third normal form,
The only distincti
the normalization
is part of the pri
of the error takes
any candidate key)
relation is not in
not pa
second
rt of
norm
prim
form
s
1
d
0
s
e
m
a
be in third normal form.
rate (in terms of the re
or is to merge into the
ry key of the owner. Fo
well as company number
e, then the resulting re
or third normal form.
f a relation's not being
the cause of a relation
what is it that distingu
n between the two depend
rror occurred on a (netw
ary key of the member fi
part in the primary key
of the member file, then
second normal form. If
ry key, then the relatio
(assuming no other error
Viewed with this
and third normal
second and third
an error in
al world) networ
member file more
r example, if
were normalized
lation would not
s on
ork)
le.
(or
the
the
n wi
s) b
pe rs
k
second normal
ot being in
s the two?
whether or n
relation whi
If the relati
more precisel
resulting
relation is
11 be in
ut it will no
pective, the
ot
ch
on
y
t
distinction
trivial.
between second and third normal form is fairly
-56-
Algebra and Calculus
Codd calls the
earlier a relational
of this algebra are,
the extraction of de
operations on rela
algebra.22 Queri
i
si
n essence, speci
red information.
tions described
es formulated in terms
fyi
H
a proce
has also
dure for
developed
ms a relation
eries can be
23 Th
proposal of the proper i
the system, a high level
illustrate that this rel
complete" (can be formul
Codd defines a reduction
relational calculus and
relational algebra. 2 4 T
which can be formulated
be formulated in the rel
We will not pursue
calculus and reduction a
reader is referred to Co
of Data Base Sublanguage
be found in the chapter
interested in the succes
target language for inte
al calculus which provides a language
formulated in descriptive rather than
relational calculus represents Codd's
nterface between user queries and
, precise, description language. To
ational calculus is "relationally
ated in terms of the relational algebr
algorithm to take statements in the
reduce them to statements in the
hus
in
ati
the
he
the
ona 1
det
demonstrates that
relational calcul
algebra.
ails of the relat
lgorithm here.
dd's paper, "R
S..25 Altered
on the network
s of the relat
rpreting user
The
el ati
versi
cal c
ional
queri
intona 1
ons
ul us
cal
es a
a),
any query
us can also
ional
erested
Completeness
of these will
Those
culus as a
re referred to
-57-
what he ter
in which qu
procedural 1
Chapter 4 Viewpoints for Comparison
Introduction
When comparing
at the outset the po
proceed. Aspects wh
respect to one point
The goals of the vie
various issues. To
we will begin with e
how its goals are sa
Another approach wou
model and analyze th
hurt by each. This
final evaluation and
two
intich
of
wpo
ins
xpl
tis
ld
em
app
it
model ,
of view
may be
view ma
int will
ure that
icitly s
fied or
be to it
in terms
roach, h
it is important to clarify
from which the comparison will
useful or beneficial with
y be disasterous from another.
be crucial in sorting out the
this procedure is followed,
tated points of view and analyze
not satisfied by each model.
emize characteristics of each
of who or what is aided or
owever, leads to difficulty in
is often hard to ascertain which
characteristics will have important
With the former approa
results in each viewpo
The following dia
comparison of the two
could be effected by t
they will be dealt wit
ch, we c
int.gram wil
model s.
he model
h one at
or meaningful impacts.
an come to more conclusive
1 establish a framework fo
All relevant entities wh
choice are illustrated an
a time.
r
i ch
d
-59-
Figure 8
Database Administrator/Des igner
The database designer is t
mining what information is to b
that information is to be organ
that the designer insure that t
representat
In the
real world
form. 27 As
functional
from the sy
involved in
Codd p
each of the
the system
The answer
ion
he one responsible for deter-
e stored in a database, and how
ized. It is extremely important
he organization is a true
of the real world.
relational framework a
is equivalent to having
discussed in Chapter 3
and transitive dependen
stem designer's point of
removing these depende
rovides a mathematical
se dependencies looks 1
purge itself automatica
is no because knowledge
-60-
true repres
relations i
this means
cies have be
view, what
nc i e s?
formulation28
ike. With t
lly of these
of the real
entation of the
n third normal
that all
en removed. So,
is the process
to expres
his, then
dependen
world
s what
, can
cies?
required. Hence the designer must interact
with the system to remove them. The best a system can do to
le
sy
de
tr
Sm
In
de
se
er
th
as
th
re
th
ea
ot
ad the desi
stematic se
ncy in the
ue dependen
ith has act
a database
pendencies
ssion is li
How do t
rors in the
eir ways in
formulated
e framework
sult of sto
at does not
ch relation
her relatio
that the
to think
probably
cati
all
The
gner
ries
curr
cy.
ual 1
of
is h
kely
hese
rep
to t
by
of
ring
rep
is
ns i
user wishe
about onl
designs i
through the process
of questions, whethe
ent database formulat
If so the removal is
y devised a strategy
any complexity, the n
uge and hence, the qu
to be a tedious one.
functional and trans
resentation of the re
he relational organiz
the designer? The an
the relational model.
an attribute of an e
resen
cons i
n the
to
one
thi
that
ered
datab
keep
rel
nkin
on or report, it is a
the information requi
disjoint nature of a
en
to
ase
t
b
,
trac
ti on
in
is
r e
ion
ra
for
to
ach
is
the
as
ask,
pot
in
r si
king
thro
enti a
fact
mple.
the
ugh a
l depen-
,a
Grant
questions.
umber of potential
estion and answer
itive dependencies
al world, ever mak
ation o
swer is
Depen
ntity i
ity. In the
e a table dis
which stores
k of. Thus t
at a time.
terms of some
natural process to
red by the applicati
relational database,
-61-
,
e
f the database
inherent in
dencies are the
n a relation
relational model,
tinct of all
associations
he user is led
Since he
desired appli-
include in it
on or report.
then is the
counterparts i s
cause of the introduction of functional and transitive
dependencies. The user is encouraged not to think about
associations between relations.
In the network model, the designer is forced from the
outset to consider the relationships between files. Before
he can begin storing attributes he must structure the database
to represent the relationships between entities in the real
world. With the overall picture in mind, he decides which
attributes are needed and where they should be stored. It
is difficult indeed to store an attribute in the wrong file,
the error almost stares you in the face. Although a series
of questions could easily be devised to check the validity of
each attribute, it is unnecessary because mistakes are rare
or nonexistent.
Thus insuring the con
a natural outcome of utili
entails a tedious check-up
is the inherent psychology
makes the relational model
than the network model.
A Framework for Language I
Codd has classified t
a database management syst
oriented, algebra oriented
sistency of a database structure
zing the network model, whereas i
job in the relational scheme. I
underlying each approach that
more prone to this type of error
nterface Comparison
he languages used to interact with
em into three categories, cursor30and calculus oriented. They
-62-
exhibit a hierarchical ordering as depicted below.
Figure 9
A Cursor oriented language is one in which the user
specifies a step by step procedure for extracting the infor-
mation of interest. In the network model a procedure
specification to list the components of an assembly might
appear as follows.
Find part xx
Find first component of part xx
loop: If not found, go to done
Print quantity and component number
Find next component of part xx
Go to loop
A similar procedure would exist for the relational model 3 1
-63-
Find first tuple of relation component
If no more tuples, go to done
If assembly Part number=xx
Then Print quantity, Component Part number
Get next tuple of relation component
Go to loop.
Thus the system is
of primitive
An algeb
abstraction o
are provided
entire relati
(nested perha
the data of i
can be found
A calcul
procedure spe
tics of the d
descriptive r
Examples of cal
Chapter 6.
Each model
three levels to
fully developed
given a systematic procedure via a series
steps for extracting the d
ra oriented language is a
f the procedure to extract
to operate on aggregate gr
ons in the relational mode
ps) can express the entire
nterest. Examples of rela
in Chapter 3.
us oriented language is on
cification and merely desc
ata desired. The system i
equest into a procedure fo
culus oriented requests are to
of data
provide
a relati
should
a basis
onal al
of interest.
cise mathematical
formation. Functions
s of data such as
A single formula
ocedure to derive
nal algebra functions
ata
con
in
oup
l.
pr
tio
e which is free of any
ribes the characteris-
s required to map the
r data extraction.
be found in
have a language at each of the
for comparison. Codd has
gebra and a relational calculus.
-64-
loop:
Network designers have developed a myri
languages and a handful of network alge
calculus has yet to be published (see C
when comparisons are made between the t
the user interface, they are invariably
relational calculus of Codd with a curs
language (usually that proposed by the
diagram illustrates the current situati
ad of cursor oriented
bras, but a network
hapters 5 and 6). Th
wo models in terms of
comparing the
or oriented network
DBTG). The following
on.
relational comparison network
specified
--- unspecified
Figure 10
This comparison between
network cursor oriented
comparison of the two mo
models of data requires
and network calculus. T
languages on the same le
the
lan
del
the
hen
vel
relational calculus and a
guage is clearly invalid as a
s. To reliably compare the two
development of a network algebra
a more valid comparison between
can be made.
-65-
us,
Experienced User
An experienced user
database on a frequent ba
one who will
s, usually as
interact
a result
Someone in this
initial educatio
useability of a
common argument
relational model
is most likely t
simple table at
give the user an
relate to each o
that the system
the author's con
understanding of
category
n
da
of
i
ru
so
u
th
if that
tabase
the re
s easie
e, for
me time
ndersta
er; he
can
ten
i gu
on
is willing to undertake a
will significantly impro
management system in the
lational model advocates
r to teach to first time
everyone has been present
in his life. But this v
nding of the ways individ
must be content with the
re ev
that
rything
frequen
how the database
for hi
ser wil
fits together
ddi ti onal
ve the
future. A
is that the
users. This
ed with a
iew does not
ual relations
assurance
It is &
desire an
and that the
network model will better provide this than the relational
model. To achieve the same understanding provided by the
network model, the user must be taught the algebraic functions
of the relational model. This requires a fairly mathematical
mind and makes fully understandingthe relational model a
more difficult task.
Due to his frequent encounters with the system, an
experienced user is likely to prefer a good algebra oriented
language because it provides the most concise expression of
his request. His familiarity with the system and under-
-66-
w i t h
of hi
a
s job.
.
standing of the relationships in the database lead h
think of his requests in a procedural fashion. He n
thinks of the steps required to satisfy his requests
can in fact be a method whereby be clarifies his req
himself. Because of its more powerful (and hence ad
more complex) basis, the network model is likely to
a more powerful algebra oriented language. The user
think in terms of access paths through the database
extracting data on the way. A language devised by M
Inc. can provide an example of a good algebra orient
language.32 Given the following database structure,
command to provide a list of all parts and due-dates
purchase orders under vendor IGG is expressed below.
in to
atural ly
, which
uest to
mi ttedly
provide
can
structure,
ITROL,
ed
the
on
.
ENTER REQUEST: print part-n
IGG po all 1
um due-date for vendor
ine-item all.
Figure 11
-67-
This request has
reads quite well.
{vendor, po, line
the requested dat
understanding of
successful among
little superfluous
The "for clause"
-item}, and the "f
a. This language,
the network model,
MITROL customers.
Casual User
The bulk of the arguments
model
estima
have been oriented to the
ted the trend of database
information, and yet it
specifies the access path
ield clause" specifies
though requiring some
has proved very
in favor
casual
usage i
of the relational
ser. Codd
to the 199
has
0's, and
a rapidly growing use
article, "Seven Steps to R
Codd defines a casual user
with the system are irregu
his job or social role." 34
any more than a cursory in
the simpler the model the
the easier a model is to t
in fulfilling the needs of
casual user fit in to the s
language interfaces. Clea
because he wants to descri
cedure for fulfilling it.
describing a system which,
endezvo
as
lar in
Thus
itial e
better.
each to
a casu
cheme o
rly he
be his
u
t
h
d
cas ua
s wit
one
ime a
e is
ucati
The
nov
use
the
at
1 users."" In his
h the Casual User,u
whose interactions
nd not motivated by
unwilling to undertake
on. If this is true,
argument is then that
ice, the better it is
r. Now where does the
three categories of
the calculus level
request, not gi
Codd does an
based on the
-68-
excel len
rel ati on
ve a pro-
t job of
al calculus,
predicts
carries on a
understands
until the system fully
the user's request. 35 The user is able to
converse with the R
as long as the syst
responses. Now fin
concept of a model
only one using a mo
mation he wants. A
filing strategies o
information he want
the
led
the
one
as
the
model
us to
casual
then
t woul
casual
needed
the po
user,
maybe
d appe
to
intif
no
ar
user cannot
end
m
d w
ezvo
can
here
at all!
d
f
s
el , the
manager
his sec
and the
satisfy
of sayin
a simple
model is
ystem in
ve some
the dial
n re s
form
i the
trictea Eng
ation from
user neede
It seems that the system i
user merely requests the i
does not need a model of t
retary, he merely requests
secretary maps the reques
it. Thus, indirectly, Cod
g, from the point of view
model i.s better than a co
best of all. If this hold
to in the sample Rendezvous
be used as a basis
s
If
he
t
t
d
i s h
is
any
the
or-
he
into
has
1 ex
true
dialog, then
of comparison
between the two models of data.
leads to a better interpretation
system.
The issue
of a user
is which model
request by the
The System
Without algorithms to decipher a user request in both
models, an accurate comparison of the network and relational
models in terms of which model leads to easier analysis cannot
be made. However, general comments bearing on the issue can be
-69-
dialog with the user
raised.
Rel ationships
bearing entities a
in the network model a
bsent in the relational
re information
model. These
relationships
phrase his En
relationships
ships will ma
that a relati
attributes) i
In the relati
serve the fun
indirectly.
key in the re
to other doma
gi
ke
on
n
on
ct
A
la
in
ave names a
ish request
Thus is is
request an
ship relate
one file to
al model , i
ion of rela
role name
tion. It i
s of tuples
n
a
s
t
t
a
s
d it is likely
in terms of th
likely that th
lysis a simple
a whole entit
a whole entity
could be argu
ionships, but
pplies directl
.another step
found in the c
that a user w
ese names of
ese named rela
r job. Note a
y (all of its
in another fi
ed that role n
they do so on]
y only to the
to apply a rol
ither relation.
Consider the part/product
model there are two files with
Pa
assembly
ProStr
(only "down" na
structure case. In the network
two relations between them.
rt
component
ductucture
mes are given)
Figure
-70-
ill
ion-
so
orei gn
name
there would be two relations, part
and product st
the part file.
assembly part
quantity field
components
because
user's
given (
name fo
request
product
ructu
The
n umbe
Co
of part
the "
reques
part a
r the
is re
struc
down
t.
nd c
sys t
ally
ture
In
om
em
s
these (network) rel
deciphering user re
Another aspect
tency and integrity
motivated by the fo
relation and a purc
database. One of t
is likely to be the
orders exist for a
disallow the deleti
re. The part rel ation is identical
product structure relation has the
r and component part number as well
nsider now the user request, "List
xx." In the network model this is
name of one of the relations matche
the relational model, the key word
ponent), do not give the necessary
to pick up on, assembly. The rela
"List the compone
having assembly p
ation names.can p
quests.
of the system's
of the database.
llowing
hase ord
he domai
vendor
parti cul
on of th
example
er rela
ns of t
number.
nt
ar
ro
part number of
ai
tr
s
s
ro
ti
al
the
the
vial
he
e
nal
t number xx. Thus
vide added ease in
job is insuring consis-
One such problem can
Assume there is a vend
on in the relational
purchase order relatio
As long as any purchase
vendor, we would
vendor from the
be
or
ike to
tabase.
-71-
In the relational model
the relational model, this
the network model it falls
enforcement must be external.
out naturally.
Application Programs
One of the major arguments for the
that it provides data independent access
tion programs will remain unaffected by
relational model
ing so that appl
growth in the
In
is
i ca-
database. In his "Fu
Relational Model," Co
which changes to a re
tion programs.36 The p
migration and inserti
a network model free
the network model can
relational model in i
claims the relational
form, the author clai
state which
the argume
ac
nts
rt
dd
la
ri
on
of
b
ts
m
ms
curatel
along
her Normalization
gives several ex
tional database w
mary cases have t
and detection an
ordering and ind
e shown to be no
ability to handl
odel should first
the network mode
y reflects the re
these lines deal
of
ampl
oul d
o do
onia 1
exi n
wors
e gr
be
1 sh
al w
with
the Data Base
es of cases in
impair applic
with attribut
ies. By using
g dependencies
e than the
owth. (Codd
in third norma
ould first be
orld.) The bu
particular
systems which
or treat the
a model of en
possess these o
network model as
tities and their
rdering or i
a model of
relationshi
ndexi ng
storing
ps.
dependen
records,
-72-
a-
e
,
cies,
not
Chapter 5 A Network Al gebra
Introduction
There are
ing a network
patterns in a
there. The se
posed by Codd,
Th
an
su
ca
po
be
pa
ha
al
e f
en
ppo
1 cu
wer
di
ths
ndl
geb
two possible approaches to be taken in develop-
algebra. The first consists of analyzing usage
network cursor language and proceeding from
cond is to take the relational algebra as pro-
and modify it to fit the network framework.
irst approach is preferable
d in itself. If, however,
rt level with respect to th
lus, then the second scheme
of the resulting algebra o
scussed in the next section
through a network database
e certain queries. The str
ra, as modified here, will
if the resulting algebra is
the algebra is to play a
e development of a network
is preferable. The issue of the
r calculus is important. As will
, the common approach of following
is not powerful enough to
ucture of Codd's relational
take care of these problems.
Thus the second approach, modifying Codd's relational
to fit the network framework, will be used.
algebra
Access Path Navigation
Bachman has described the programmer's job as one of
navigating through the access paths of the network structure.
In most network systems, the user begins at the root of the
database and,following the arrows of the data structure
-73-
diagram, defines a path to the
cases this works quite well, as
There is one kind of quest
which this navigation procedure
Fortunately for network users,
It will be denoted the "For all
database structure.
data he requires. In most
experienced by MITROL users.38
ion known to the author in
fails; there may exist others.
this type of question is rare.
" type. Consider the following
Figure
Now would one answer the query, "List all the suppliers who
supply all projects," by simply navigating via access paths?
(This is where division comes in handy.)
Extension to the Relational Algebra
Setting aside for the moment the one to one and one to
many relations of the network model, we see that the remaining
files and fields are direct counterparts of the relational
relations and domains. Thus we can start with the operations
of the relational algebra as a base. To handle the relations
we must add a new facility, the MERGE operation.
-74-
Let R and S be two
r and s are tuples
of R and S over T i
R &
where
s via
ne twork files related by relation T. If
from R and S respectively, then the MERGE
s denoted and defined
S = {r s:rER A
T(r,s) is true
relation T.
sS A T (r,s)}
if r is related to
This can be seen to be the counterpart of the na
the relational algebra. The notation given this
intended to clarify the fact that the MERGE can
a limited cartesian product. The operation is c
normalization of the two files.
To make the MERGE operation complete, we mu
notation to include the capability to merge seve
once. A straightforward extension
T 1S
T2S2
tural join
operation
be viewed
learly the
st
ra
extend
files
in
is
as
the
at
T3S3 9 4
assumes that t
the associatio
related S4 to
this, the inde
taking part in
merge symbol.
e last tuple added is the one taking part in
of the next merge. If T3, for example,
2' then the notation is inadequate. To handle
of the basic tuple in the resulting merge
the merge relation will be placed under the
Also, we must have a way to differentiate
-75-
between the own
To handle this
The arrow will
merged is in th
tion. For the
full extended n
Ts
S
er and member s
we will place a
point to the ri
e member role w
complementary c
otation is thus
T 2 T3
S2 ( S3 ( S42 2
ide of a one to many rel
n arrow under the merge
ght if the file being cu
ith respect to the merge
ase it will point left.
For completeness, and to make the reduction from the
network calculus a
the restriction op
proposed by Codd,
compari son operato
testing function w
have been concaten
are related via a
include as part of
the first domain.
or file, this migh
operator would tes
their correspondin
relation. Thus if
s easy as possible,
erati on
the r
rs <,
hich
ated
rel at
any
In t
t be
t the
g tup
e s
de
(b
io
tu
he
si
se
le
of the relational
tri cti
, =,
termin
y a ca
n. On
ple a
array
mply t
s
two d
were
we must also extend
algebra.
defined on the
t to add a
tuples which
for example)
do this is to
er, perhaps as
of a relation
comparison
on operator is
, >, >. We wan
es whether two
rtesian product
e simple way to
unique identifi
representation
he index. The
omai
rel
ns to
ated
rmine
given
whethe
(netwo
tupl es
ete
a
A and B are the domains identifying
s in a relation T which has been formed from relations
S, then
-76-
ati on.
symbol
r r e n t 1 y
rel a-
The
r and
R and
r
rk)
W WT [A # B] (T [A fl B])
results in a relation of only the tuples of T for which
original tuples r and s
corresponding to the own
The MERGE operator
operator are somewhat re
ferable in a practical s
size of a target relatio
product). This will bec
chapter. Due, however,
this presentation, the a
were (not) rela
er tuple will
and the exten
dundant. The
ense because
n (compared t
ome more clea
to the lack o
uthor is unsu
ted by W. The domain
always appear first.
sion to the restriction
MERGE operator is pre-
it limits the initial
o the full cartesian
r in the network calculus
f mathematical rigor of
re that it will always
be sufficient
used because
. Thus the
it provides
restriction modification will be
mathematical completeness.
Examples
To illustrate the use of the network algebra, consider
the following database.
W3 W4
Figure 14
-77-
Symbol File Domain 1 Domain 2 Domain 3
supplier TID* Supplier # Supplier Name
R2 part TID Part # Part Name
R3 project TID Project # Location
R4 supply TID
R5 worker TID Worker # Worker Name
* Tuple Identifier
Figure 14 (cont'd)
The following queries will be
algebraic formulas required to
translated into the network
satisfy them.
* Find
supp
the
ly p
supplier
art 15.
numbers of those suppliers who
Wl
((R R
1
W2
O R2) [6 = 1] {15}) [2]
2
* Find the name of suppliers and the parts being supplied
by them (omitting those suppliers who are supplying no
parts at this time).
-78-
1
(R1 Q1
R4 0:2
R2 ) [3,6]
* Find the workers on all projects supplied by supplier A.
Wi
((R
1
W3
R4
2
W4
R3 0
3
R5 ) [3 = 1] {A})
-79-
[9]
Chapter 6 A Network Calculus
Introduction
Just as with the network algebra, the network calculus
presented here will be a simple extension of the relational
calculus developed by Codd.39 The presentation will closely
follow his, but with the slight modification included.
Because the presentation given here is fairly terse, the
reader is suggested to refer to Codd's presentation for
clarification.40
The Extension
All that is necessary to convert
calculus to a network calculus is the
predicate constants W1, W2, W3 , ...
for each (network) relation in the da
development, they can be handled exac
predicate constants <, <, , / , >, >
take tuple variables as opposed to in
side. We shall follow the convention
shall appear on the left.
Codd's relational
addition of dyadic
to the alphabet, one
tabase. Throughout
tly as the dyadic
are, except that the
dexed tuples on eith
that the owner tupl
The Network Calculus
The alphabet for the network calculus is listed below.
-80-
the
y
er
e
Indi vi dual constants
Index constants
Tuple variables
Predicate constants
monadi c
dyadi c
Logical symbols
Del imi ters
,a2,
2, 3, r2, 5
PI P2 '=, <, >
W1, W2 ,
3 , A[] I )3
a3
'4
r 3
'3'
<,
W3 ,
)V)
There is a monadic predicate constant for each file
(relation) in the database, and P r is intended to mean
that tuple r is a member (row) of file j. P.r is called a
range term.
An indexed tuple, denoted r[n] where r is a tuple
variable and n is an index constant, is used to identify th
n th domain of tuple r.
If 6 is one of the dyadic predicate constants =, <, <,
>9 >, , and X and y are indexed tuples, then X6i and Xea
are called join terms. If e is one of the dyadic predicate
constants Wi, W2, W3, ... , and X and u are tuples, then AOi
is called a merge term. The only terms of the network
-81-
e
calculus are range terms,
Codd's definition of
appl ies
1. Any term is a W
2. If r is a WFF,
3. If F , r 2 are W
4. If F is a WFF i
variable, then
5. No other formul
A ra
ter
free
ver
join terms, and merge terms.
a well-formed formulae (WFF) still
FF;
then so
FFs, so
n which
3 r(1")
ae are
is ,
are
r oc
and
WFFs.
(Cr Al
curs a
Vr ()
"41
nge WFF is a quantifier free WF
Ms. A range WFF over r is a ra
variable. A proper range WFF
r such that:
s
2) and (r' V Pr 2)
a free
are WFFs;
F whose only terms are
nge WFF with r as the
over r is a range
'' occurs only after
if r is part of more
relations associated
union compatible.
A , and
than one
with the
range term, the
predicates are
Thus a proper range WFF over r cannot say
is not the source of tuple variable r. Li
a tuple variable can only be the files of
files which can be generated from them by
only that file S
kewise the range of
the database or
union, intersection,
-82-
i
range
only
WFF o
and difference operations on union compat
A range-coupled quantifier is either
where F is a proper range WFF over r.
A range-separable WFF can be written
U 1A U2 A ...
ible pairs.
31"r or vrr
in conjunctive form
A U A
where
1 . n>1 ;
2. U1 through U are proper range WFFs over nn
distinct tuple variables;
3. V is either null, or it is a WFF with the three
properties:
a. every quantifier in V is range-coupled;
b. every free variable in V belongs to the set
whose ranges are specified by U1, U2,
c. V is devoid of range terms." 4 2
To i
simple al
ncorporate the
pha expression
neede
has t
d projection capability, a
he form
(t , t2, ..
11
Un;
tk)
-83-
where
1. w is a range-separable WFF of the network calculus;
2. t1, t2 , ... , tk are distinct terms, each consisting
of a tuple variable or an indexed tuple variable;
3. the set of tuple variables occurring in t,, t2 ' '''
tk is precisely the set of free variables in w.
An alpha expression i
if t:wI and t:w2 are
form
t: (w1
t: (w1
t: (w1
either a simple
pha expressions,
alpha
an e
expression,
xpression of
V w2)
A w2
Anw 2'
Examples
Assume the following database structure.
R2 Part
R, Supplier 2 W3 R3 Project
R4 Supply R5 Wo
Figure 15
-84-
or,
the
Figure 15 (cont'd)
* Find the supplier number of those suppliers who supply
part 15.
r1 [1]:Pir, A P2r2 A
r2W2r4 A r2[1]
P4 r4 A (r1W1r4
= 15)
* Find the locations
supplied by
supplying no
of suppliers
them (omitting
parts at
and the parts being
those suppliers who are
this time).
(rl[31, r2[11) P1r P2 r2 A P4 r4
(r1W 1 r4 A r2W2r4 )
-85-
Symbol File Attribute 1 Attribute 2 Attribute 3
R1 supplier supplier # supplier name location
R2 part part # part name
R 3 project project #
R 4 supply
R5 worker worker # worker name
* Find the workers on all projects being supplied by
supplier A.
r5[1] P1r A P4r4 A P3r 3 t P5r 5 A
(r [1] = A A rIW 1 rg A r3W3r4
A r3W4 r5
Reduction
The reduction algorithm presented by Codd is intended to
demonstrate that the relational algebra is "relationally com-
plete" and is not intended as a practical efficient translator
from calculus to relational algebra.43 Because the modification
we have presented is of such a-minor nature, we will not wade
through a parallel reduction algorithm for the network calculus
and algebra. Instead, we will present arguments considered suf-
ficient to convince the reader that such a parallel reduction ex-
ists, and move on to the more interesting question of efficiency.
The only modification we made to the relational calculus
was the addition of the dyadic predicate constants W,, W2'
W 3, ... These operated on tuples in an identical manner to
the way the other dyadic predicate constants (=,/,<,<,>,>)
operated on indexed tuples. By assuming that we will assign
each tuple in a file a unique identifier (as discussed in the
last chapter) and treating that as the first domain of each
tuple, then the extension we made to the restriction
-86-
operation in the last chapter will handle the extension we made
to the network calculus.
To be specific, if we assume that the tuple identifier
domain is not assumed in the calculus level, but that the
cartesian product automatically prefixes it to the tuple,
then the following modifications need to be added to the
reduction algorithm.
Step 1.3
When
a ~1
a dyadic predicate constant W is preceded
, eliminate the , symbol and replace W by
Step 3j-1
p4. =-(E (n. + 1)) + 1S 1=1 1
Step 4 (rewriting rules)
5. (r W rk) ~S I-l
6. (r iI rk) j~1
(change)
Vk1]
Ek- ]
(add)
(add)
Optimization and Efficiency
It would be unfair at this point in the development to
declare that one model or the other is inherently more efficient.
Codd argues that the calculus level is a good starting point
for optimization, and this is likely to be true. The issue of
concern here is whether the network model or relational model
leads to more efficient execution.
-87-
The
which to
cartesian
cartesian
each will
majority
restricti
It i
products
that
use of the cartesian pr
begin. The number of t
product can quickly be
product of three relat
result in a relation w
of these are likely to
on operations.
s likely that with a li
and restrictions could
R[A e
So the more like
the join of the
algebra. In eit
performing some
a likely impleme
operation shoul c
tuples which wou
does not know ol
this same proper
ignorance of thE
implementation.
oduct is a prime suspect with
uples generated by an extended
come astronomical. The
ions with only 100 tuples
ith 1,000,000 tuples. A
be pared off in the ensuing
ttle effort, the cartesian
be replaced by joins. Note
BIS = (R 0v S) [A e B].44
ly efficiency comparison should be
relational algebra and the merge of
her case improvement could also res
restrictions prior to the merge or
ntation of the network model, the m
require no more accesses than the
ld end up in the resulting file. T
a relational implementation which
ty, but this is more likely a funct
author than the non-existence of s
Thus we cannot pursue this point a
between
network
ult from
join. In
erge
number of
he author
would have
ion of the
uch an
ny further.
-88-
Chapter 7 Summary
Conclusions
Once again, the subjective nature of a comparison of this
type must be stressed. Any conclusions drawn must be phrased
in terms of the issues discussed. The relevancy and importance
of the topics chosen for study here are matters for the reader
to evaluate. The author made an attempt to cover the areas of
concern most frequently found in the literature, and this may be
some basis of argument for the relevancy of the topics. It is
the author's personal opinion that the discussion has hit many
of the important issues and has done so with a viewpoint not
commonly found in current debates.
Based on the five viewpoints chosen for comparision, all
but the casual user point of view leaned toward a preference
for the network model; for the casual user it was a draw. This
is perhaps an appropriate point to emphasize that all aspects
of the network model used for the comparisons are not, to the
author's knowledge, to be found in any published exposition of
the model. Some may argue that this makes the comparisons in-
valid, but the author contends rather that looking at each model
in terms of its ultimate possibilities makes the comparison
more meaningful, if the goal is in fact to choose the best
approach. Rather than focussing on correctible deficiencies
in a current model, we have tried to look at the underlying
-89-
inherent c
results wi
ciency is
haracteristics of each model with
11 not become outdated as soon as
remedied.
the hope that
a particular
To provide a
to the interface
developed a netwo
simply extensions
extent non-rigoro
for some, a point
tensions proposed
treatment by anyo
More importantly,
toward more "equa
One question
n adequate basis of comparison with respect
language, we ha
rk algebra and
to Codd's). T
us, and it is s
of contention.
will serve as
ne with an abst
we hope that t
1 basis" compar
that arises is
ve
ne
he
us
th
ra
hi
is
w
followed Codd's lea
twork calculus (whic
approach was to a c
pected that this may
The hope is that
e basis of a more ma
ct mathematical back
s will provide an im
ons in the future.
hether one model can
i and
h were
erta in
provide,
the ex-
thematical
ground.
petus
viewed
subset of another.
issue.
An excerpt from Codd has bearing
"Claims have been made ..
permit(s) more natural or
real world than the relational
not easy to support or refute,
edge of what constitutes a goo
a given class of problems is h
atic.
. that the network approach
faithful modelling of the
model. Such claims are
because our present knowl-
d data structure for solving
ighly intuitive and unsystem-
"However, we can observe that many different kinds of
-90-
and
the
defi
topology, and graphs (or
today for solvi
to be neutral t
and yet very ad
been c&monstrate
tions in variou
"On the ot
to a specific k
convenient in s
It is convenien
of sets, each o
total ordering
appl ic
loops
with n
binary
variab
charts
ation inv
(e.g., tr
etwork li
rel at ion
le depth,
) .1"45
ng "real
owards t
aptabI e
d rather
s kinds
her hand
world" oroblems. Relations tend
hese problem-solving representations
to supporting any of them. This has
clearly in applications of rela-
of graphics packages.
, the owner-coupled set gives rise
ind of network, and is accordingly very
ome contexts and very awkward in others.
t when the application involves collections
f which has both a descriptor and a simple
of its elements. It is awkward when the
olves partial, ordering.s (e.g., PERT charts),
ansportation routes), values associated
nks (e.g., utility networks), many to many
s, relations of degree other than two, and
homogeneous trees (e.g., organization
Codd proceeds to argue that
simple, and particular user
be the responsibility of the
provide a clean separation t
uncluttered. He is thus say
structure which can be built
This could well be true, but
the principal schema should be kept
needs for more complex schemes should
user schema. This, he argues, would
hat would keep the principal schema
ing that the network model is a
on top of the relational model.
the relational model is trying to
-91-
networks) are in usegeometry,
accomplish the same ends as the network model, and
around the network model. The thesis underlying th
been that the network model, as presented here, is
to model the real world. As Codd points out, this
to support or refute, but, if it is true, the netwo
provides a higher interface to use and hence should
greater ease in handling real world problems.
ence skirts
s paper has
uffi ci ent
s difficult
k model
provide
A Hybrid View
Codd points out
the relational model.
relational view in a
that the network view can be
It is likewise possible to
system built on the network
supported by
support the
model. The
basis
a proc
of thi
edure
s asse
(a la
rt ion.
Codd)
comes
for tr
f rom
an s fo
the fa
rming n
into a relational database (normalization).
performing this normalization, the network s
appear to the user that this has been done a
base is built on the relational view. Thus,
to think of a database as relational (e.g.,
there is no reason to rule out the support o
system has been built on a network framework
bility we can have the best of both worlds.
A special case in which the hybrid view
is in the context of distributed databases.
that th
etwork
Wi thou
ystem c
nd that
if it
the cas
f it ev
With
seems
Consid
ere exi
databas
sts
e
t actually
ould make it
the data-
behooves one
ual user),
en if the
this possi-
a p pro
er a
priate
user
database which maintains information on his current portfolio.
-92-
major concern with respect to a stock is its
current price. This is an
track of continuously, espe
would more likely be desira
maintain current prices on
user could, at any time, ac
ular stock (for a slight fe
to view an individual user'
and the interaction betweer
To find the stocks in the L
extremely expensiv
cially for a singl
ble to have a sing
all stocks, and an
cess the current p
e, of course). It
s database in the i
databases in a re
iser's portfolio we
Y
r
value to keep
user.
e vend
indiv
ice of
may be
etwork
ati ona
would
It
or (user)
idual
a partic-
hel pful
model,
1 context.
employ
network
database
tec
we
hniques
woul d
, and to
make use
find the stock price in
of relational operators.
a remote
Further Research
This paper has touched on several ideas wh
themselves make interesting bases for research.
this paper was much too broad to cover many of
detail. A few of these are suggested below.
ich would
The scope of
them in any
* The whole concept of keys in the
source of problems in several im
tioned in Chapter 2, keys serve
network model, ordering and iden
a relation, and identification o
the hierarchical model of data,
-93-
network model
plementations.
two functions
tification of
f records in a
out of which t
is a
As men-
in the
members of
file. In
he network
An attribute of
model evolved, this duality of function caused no problems.
It is the potential for multiple access paths to a record
that spawns the problem, and the duality of function
should thus be clearly separated. How this can be done,
and its impact on current use of the network model would
be an interesting study.
* As mentioned in Chapter 2, one to many binary relations
can be used to represent many to many and n-ary
relations. A mathematical formulation and proof of
this assertion could prove interesting. Also the
need for one to one relations alluded to could be
verified..
* A psychological study of the "teachability" of the two
models could help to reinforce or quell many arguments.
* Many of
algebra
mati cal
the ideas presented in developing the network
and calculus could stand a more formal, mathe-
treatment.
* Techniques
rel ational
significan
for optimi
calculus i
t practical
zation of both the network and
n the reduction process would be of
value.
-94-
* Any comparisons of the two models on terms other than
those presented here would help to round out the picture
and fill in the gaps. These should not have any of the
pitfalls discussed in the introduction.
-95-
Footnotes
1. Codd, E.F., "Seven Steps to Rendezvous with the Casual
User," Proceedings of the IFIP TC-2 Working Conference
on Data Base Management Systems, Cargese, Corsica,
April 1-5, 1974, North-Holland, Amsterdam.
2. Clemons, Eric K., "The Design of Languages for Management
Information Systems: A Proposal for a Disseration,"
Cornell University, March 12, 1975.
3. Madnick, Stuart E., "The Future of Computers," Technology
Review, July/August, 1973.
4. Simon, H.A., "The Architecture of Complexity," Proceedings
of the American Philosophical Society, 106, 6, December,
1962, pp 467-482.
5. Codd, E.F., "A Relational Model of Data for Large Shared
Data Banks," Communications of the ACM, Volume 13,
Number 6, June, 1970, pp 377-387.
6. Bachman, C.W., "Data Structure Diagrams," Data Base
(Quarterly News Letter of ACM-SIGBDP) Volume 1, Number 2,
1969.
7. Bachman, C.W. , "The Data Base Set Concept: Its Usage and
Realization," Honeywell Information Systems Internal Report,
January 31 , 1973.
8. Bachman, C.W., "The Programmer as Navigator," Communica-
tions of the ACM 16, No. 11, November 1973, pp 653-658.
9. Op. Cit., Codd, E.F., "A Relational Model of Data for
Large Shared Data Bases."
-96-
10. Codd, E.F.
Rel ational
"Data Base
Prentice H
1 . Codd, E.F.
Tutorial,"
on Data De
available
12. Codd, E.F.
Sublanguag
"Data Base
Prentice H
13. - Codd, E.F.
Programmer
Proceeding
from ACM,
14. Op. Cit. ,
"Further Normalization of the Data Base
Model," Courant Computer Science Symposia 6,
Systems," New York City, May 24-25, 1971,
all.
"Normalized Data Base Structure: A Brief
Proceedings of the 1971 ACM-SIGFIDET Workshop
scription, Access and Control, San Diego,
from ACM, New York.
, "Relational Completeness of Data Base
es," Courant Computer Science Symposia 6,
Systems," New York City, May 24-25, 1971,
all.
, and C.J. Date, "Interactive Support for Non-
s: The Relational and Network Approaches,"
s of the 1974 ACM-SIGFIDET Workshop, available
New York.
Codd, E.F. , "A Relational Model of Data for
Large Shared Data Banks."
Ibid
Ibid
Op. Cit., Codd, E.F., "Relational Completeness of Data
Base Sublanguages."
Ibid
Ibid
-97-
20. Op. Cit., Codd, E.F.,
Base Relational Model.'
21. Op. Cit., Codd, E.F.,
A Brief Tutorial."
22. Op. Cit., Codd, E.F.,
Bases Sublanguages."
23. Ibid
24. Ibid
25. Ibid
26. Op. Cit., Codd, E.F.,
the Casual User."
27.- Op. Cit. , Codd, E.F.,
Base Relational Model.
28. Ibid
29. Smith, Grant N., "Deci
Generation of Storage
Systems," Sloan School
Cambridge, Massachuset
30. Op. Cit., Codd, E.F.,
Base Sublanguages."
31. Op. Cit., Codd, E.F.,
for Non-Programmers:
'Further Normalization of the Data
"Normalized
"Relational
"Seven Steps
Data Base Structure:
Completeness of Data
to Rendezvous with
"Further Normalization of the Data
sion Rules for the
Strategies in Data
of Management, Ma
ts, June 1975.
"Relational Comple
Automated
Management
sters Thesis,
teness of Data
and C.J. Date, "Interactive Support
The Relation and Network Approaches."
-98-
32. MITROL, Inc., "MITROL
Request," available f
33. Op. Cit., Codd, E.F.,
Casual User."
34. Ibid
35. Ibid
36. Op. Cit., Codd, E.F.,
Base Relational Model
37. Op. Cit., Bachman, C.
38. Op. Cit., MITROL, Inc
The Print Request."
-39. Op. Cit., Codd, E.F.,
Base Sublanguages."
40. Ibid
41. Ibid
42. Ibid
43. Ibid
44. Ibid
45. Op. Cit., Codd, E.F.,
for Non-Programmers:
Approaches."
Technical Reference, The Print
rom MITROL, Inc., Waltham, Ma.
"Seven Steps to Rendezvous with the
"Further Normalization
W., "The Progra
, "MITROL Tech
"Relational Co
and C.J. Date,
The Relational
mmer
n i ca
as N
1 Ref
mpletenes
of the Data
avigator."
erence,
s of Data
"Interactive Support
and Network
-99-
Bibliography
1. Clemons, Eric, K., "The Design of Languges for Management
Information Systems: A Proposal for a Dissertation,"
Cornell University, March 12, 1975.
2. Madnick, Stuart E., "The Future of Computers," Technology
Review, July/August, 1973.
3. Simon, H.A., "The Architecture of Complexity," Proceedings
of the American Philosophical Society, 106, 6, December,
1969, pp 467-482.
4. Bachman, C.W., "The Data Base Set Concept; Its Usage and
Realization," Honeywell Information Systems Internal Report,
January 31, 1973.
5. Bachman, C.W., "Data Structure Diagrams," Data Base
(Quarterly News Letter of the ACM-SIGBDP) Volume 1,
Number 2, 1969.
6. Bachman, C.W., "The Programmer as Navigator," Comm. ACM 16,
No. 11, November 1973, pp 653-658.
7. Codasyl Development Committee, "An Information Algebra,"
CACM 5, April 1962, pp 190-204.
8. CODASYL, "Data Base Task Group Report," ACM HQ,
October 1969.
9. Codasyl Systems Committee, "Feature Analysis of
Generalized Data Base Management Systems," May 1971,
available from ACM, New York.
-100-
10. CODASYL, "Data Base Task Group Report," ACM, New York,
April 1971.
11. CODASYL Data Base Language Task Group: Proposal,
February 1973.
12. Codd, E.F., "A Relational Model of Data for Large
Shared Data Banks," Communications of the ACM, Volume 13,
Number 6, June 1970.
13. Codd, E.F., "Further Normalization of the Data Base
Relational Model," Courant Computer Science Symposia 6,
"Data Base Systems", New York City, May 24-25, 1971,
Prentice-Hall.
14. Codd, E.F., "A Data Base Sublanguage Founded on the
Relational Calculus," Proceedings of the 1971 ACM-
SIGFIDET Workshop on Data Description, Access and
Control, San Diego, available from ACM, New York.
15. Codd, E.F.,
Tutorial,"
on Data Des
available f
16. Codd, E.F.,
Sublanguage
"Data Base
Prentice-Ha
17. Codd, E.F.,
"Normalized Data Base Structure: A Brief
Proceedings of the 1971 ACM-S
cription, Access, and Control
rom ACM, New York.
"Relational Completeness of
s," Courant Computer Science
Systems," New York City, Nay
11.
"Seven Steps to Rendezvous w
IGFIDET Workshop
, San Diego,
Data Base
Symposia 6,
24-25, 1971,
ith the Casual
User," Proceedings of the IFIP TC-2 Working Conference
-101-
on Data Base Management Systems, Cargese, Corsica,
April 1-5, 1974, North-Holland, Amsterdam.
18. Codd, E.F., "Recent Investigations in Relational Data
Base Systems," Information Processing 74, North-Holland,
Amsterdam.
19. Codd, E.F., and C.J. Date, "Interactive Support for
Non-Programmers: The Relational and Network Approaches,"
Proceedings
from ACM, N
20. Date, C.J.,
Approaches:
Interfaces,
Workshop, a
21. Smith, Gran
Generation
Systems," S
Cambridge,
22. MITROL,
Request
23. Bachman
16, No.
24. Stamen,
of the 1974 ACM
ew York.
and E.F. Codd,
Comparison of
Proceedings of
vailable from AC
t N., "Decision
of Storage Strat
loan School of M
Massachusetts, J
Inc.,
ava
C.W.
11, N
J.P.
and Analysi
Cambridge P
"MITRO
i 1 able
, "The
ovember
and R.
L Tec
from
Progr
1973
M. Wa
-SIGFIDET Workshop, available
"The Relational
the Application
the
M, Ne
Rules
egies
anage
une,
hni cal
MITROL
ammer
pp 6
l lace,
s System for the Behav
roject, Cambridge, Ma.
1974 AC
w York.
for th
in Dat
ment, M
1975.
Referen
Inc. ,
s Navig
3-658.
"Janus -
and Network
Programming
M-SIGFIDET
e Automated
a Managemen
asters Thes
t
is,
ce, The Print
Waltham, Ma.
ator," Comm. ACM
A Data Management
ioral Sciences,"
-102-