The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

47
The Relational Model of The Relational Model of Data Data Prof. Yin-Fu Huang Prof. Yin-Fu Huang CSIE, NYUST CSIE, NYUST Chapter 2 Chapter 2

description

Database Systems Yin-Fu Huang Two important data models: 1) The relational model, including object-relational extensions 2) The semistructured-data model, including XML and related standards 2.1.2Important Data Models

Transcript of The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Page 1: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

The Relational Model of DataThe Relational Model of Data

Prof. Yin-Fu HuangProf. Yin-Fu HuangCSIE, NYUST CSIE, NYUST

Chapter 2Chapter 2

Page 2: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.12.1 An Overview of Data ModelsAn Overview of Data Models2.1.1 What is a Data model2.1.1 What is a Data model

Data model: a notation for describing data or information Three parts:

1) Structure of the dataa. Conceptual model

2) Operations on the data a. Queries and modificationsb. By limiting operations, it is possible for programmers to describe database operations at

a very high level, yet have the DBMS implement the operations efficiently.

3) Constraints on the data

Page 3: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

Two important data models:1) The relational model, including object-relational extensions2) The semistructured-data model, including XML and related standards

2.1.22.1.2 Important Data ModelsImportant Data Models

Page 4: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

(See Fig. 2.1)

This physical implementation is only one possible way the table could be implemented in physical data structures.

2.1.32.1.3 The Relational Model in BriefThe Relational Model in Brief

Page 5: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.1.32.1.3 The Relational Model in BriefThe Relational Model in Brief

Page 6: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

(See Fig. 2.2) XML: a way to represent data by hierarchically nested tagged

elements The operations usually involve following paths in the implied

tree from an element to one or more of its nested subelements, then to subelements nested within those, and so on.

The constraints often involve the data type of values associated with a tag.

2.1.42.1.4 The The SemistructuredSemistructured Model in Brief Model in Brief

Page 7: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.1.42.1.4 The The SemistructuredSemistructured Model in Brief Model in Brief

Page 8: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

Object-relational model1) Values can have structure, rather than being elementary

types such as integer or strings.2) Relations can have associated methods.

Object-oriented model Hierarchical model Network model

2.1.52.1.5 Other Data ModelsOther Data Models

Page 9: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

It appear that semistructured models have more flexibility than relations.

However, 1) SQL enables the programmer to express their wishes

at a very high level.

2) The short SQL programs can be optimized to run as fast, or faster than the code written in alternative languages.

2.1.62.1.6 Comparison of Modeling ApproachesComparison of Modeling Approaches

Page 10: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.22.2 Basics of the Relational ModelBasics of the Relational Model

(See Fig. 2.3)

2.2.1 Attributes

2.2.2 Schemas Movies(title, year, length, genre) Database schema

2.2.3 Tuples (Gone With the Wind, 1939, 231, drama)

Page 11: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.22.2 Basics of the Relational ModelBasics of the Relational Model

Page 12: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.22.2 Basics of the Relational ModelBasics of the Relational Model

2.2.4 Domains Each component of each tuple should be atomic.

2.2.5 Equivalent Representations of a Relation Relations are sets of tuples, not lists of tuples (See Fig. 2.4)

2.2.6 Relation Instances A relation about movies is not static; rather, relations change

over time. It is less common for the schema of a relation to change. The current instance

Page 13: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.22.2 Basics of the Relational ModelBasics of the Relational Model

Page 14: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.22.2 Basics of the Relational ModelBasics of the Relational Model

2.2.7 Keys of Relations A set of attributes forms a key for a relation if we do not allow

two tuples in a relation instance to have the same values in all the attributes of the key.

2.2.8 An Example Database Schema (See Fig. 2.5)

Page 15: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.22.2 Basics of the Relational ModelBasics of the Relational Model

Page 16: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

Current standard for SQL: SQL-991) Data-definition sublanguage2) Data-manipulation sublanguage

2.3.1 Relations in SQL Stored relations called tables Views defined by a computation Temporary tables

Page 17: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

2.3.2 Data Types CHAR(n), VARCHAR(n) BIT(n), BIT VARYING(n) BOOLEAN INT or INTEGER, SHORTINT FLOAT or REAL, DOUBLE PRECISION, DECIMAL(n,d) DATE, TIME

Page 18: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

2.3.3 Simple Table Declarations CREATE TABLE

(See Fig. 2.7 and Fig. 2.8)

2.3.4 Modifying Relation Schemas DROP TABLE ALTER TABLE

1) ADD followed by an attribute name and its data typeALTER TABLE MovieStar ADD phone CHAR(16);

2) DROP followed by an attribute nameALTER TABLE MovieStar DROP birthdate;

Page 19: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

Page 20: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

Page 21: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

2.3.5 Default Values gender CHAR(1) DEFAULT ‘?’,

birthdate DATE DEFAULT DATE ‘0000-00-00’ ALTER TABLE MovieStar ADD phone CHAR(16)

DEFAULT ‘unlisted’;

2.3.6 Declaring Keys We may declare one attribute to be a key when that attribute is

listed in the relation schema. We may add to the list of items declared in the schema an

additional declaration that says a particular attribute or set of attributes forms the key.

Page 22: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

Two declarations:1) PRIMARY KEY2) UNIQUE

Two tuples in R cannot agree on all of the attributes in set S (i.e., key), unless one of them is NULL.

If PRIMARY KEY is used, then attributes in S are not allowed to have NULL as a value for their components.(See Fig. 2.9, Fig. 2.10, and Fig. 2.11)

Page 23: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

Page 24: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

Page 25: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.32.3 Defining a Relation Schema in SQL Defining a Relation Schema in SQL

Page 26: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Relational algebra

2.4.1 Why Do We Need a Special Query Language? Relational algebra is useful because it is less powerful than C

or Java. Two huge rewards:

1) Ease of programming2) Producing highly optimized code

Page 27: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

2.4.2 What Is an Algebra? Atomic operands:

1) Variables that stand for relations2) Constants, which are finite relations

2.4.3 Overview of Relational Algebra Operations: Four broad classes

1) The usual set operations – union, intersection, and difference2) Operations that remove parts of a relation – selection and projection

Page 28: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

3) Operations that combine the tuples of two relations – Cartesian product and join

4) An operation called renaming We generally shall refer to expressions of relational algebra as queries.

2.4.4 Set Operations on Relations R S, R∩S, R-S∪ Some conditions on R and S:

1) Schemas with identical sets of attributes, and the types (domains) for each attribute must be the same

Page 29: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

2) The order of attributes is the same for both relations. Different names renaming operator⇒ (See Fig. 2.12)

2.4.5 Projection πA1, A2, …, An(R) (See Fig. 2.13)

πtitle, year, length(Movies), πgenre(Movies)

2.4.6 Selection σC(R)

σlength 100≧ (Movies), σlength 100 AND studioName=’Fox’≧ (Movies)

Page 30: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Page 31: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Page 32: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

2.4.7 Cartesian Product R×S The components from R precede the components from S in the

attribute order for the result. If R and S should happen to have some attributes in common,

then we need to invent new names for at least one of each pair of identical attributes.

(See Fig. 2.14)

2.4.8 Natural Joins R∞S (See Fig. 2.15)

Page 33: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Page 34: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Page 35: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Joined tuple vs. dangling tuple (See Fig. 2.16)

2.4.9 Theta-Joins R∞CS Constructed as follows:

1) Take the product of R and S.2) Select from the product only those tuples that satisfy the condition C.

U∞A<DV (See Fig. 2.17) U∞A<D AND U.B≠V.BV

Page 36: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Page 37: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Page 38: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

2.4.10 Combining Operations to Form Queries One can construct expressions of relational algebra by

applying operators to subexpressions, using parentheses when necessary to indicate grouping of operands.

“What are the titles and years of movies made by Fox that are at least 100 minutes long?”

Four steps:(See Fig. 2.18)

πtitle, year(σlength 100≧ (Movies)∩σstudioName=’Fox’(Movies)) πtitle, year(σlength 100 AND studioName=’Fox’≧ (Movies))

Page 39: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Page 40: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

2.4.11 Naming and Renaming ρS(A1, A2, …, An)(R) R×ρS(X, C, D)(S), ρRS(A, B, X, C, D)(R×S)

(See Fig. 2.19)

2.4.12 Relationships among Operations Some of the operations can be expressed in terms of other

relational-algebra operations. R∩S=R-(R-S) R∞CS=σC(R×S)

Page 41: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

Page 42: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

R∞S=πL(σC(R×S))1) C: R.A1=S.A1 AND R.A2=S.A2 AND … AND R.An=S.An2) L: Attributes of R followed by those attributes in S that are not in R.

U∞V ⇒ πA, U.B, U.C, D(σU.B=V.B AND U.C=V.C(U×V)) U∞A<D AND U.B≠V.BV ⇒ σA<D AND U.B≠V.B(U×V) The six remaining operations – union, difference, selection,

projection, product, and renaming – form an independent set.

Page 43: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.42.4 An Algebraic Query LanguageAn Algebraic Query Language

2.4.13 A Linear Notation for Algebraic Expressions The notation:

1) A relation name and parenthesized list of attributes for that relation

2) The assignment symbol :=3) Any algebraic expression on the right

R(t, y, l, i, s, p):=σlength 100≧ (Movies)S(t, y, l, i, s, p):=σstudioName=’Fox’(Movies)T(t, y, l, i, s, p):=R∩SAnswer(title, year):=πt, y(T)

Page 44: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.52.5 Constraints on RelationsConstraints on Relations

Many kinds of constraints can be expressed in relational algebra.

2.5.1 Relational Algebra as a Constraint Language Two ways:

1) Equal-to-the-emptyset; e.g., R = 2) Set-containment; e.g., R S⊆

2.5.2 Referential Integrity Constraints Movies(title, year, length, genre, studioName, producerC#)

MovieExec(name, address, cert#, netWorth)πproducerC#(Movies) ⊆ πcertC#(MovieExec)

Page 45: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.52.5 Constraints on RelationsConstraints on Relations

StarsIn(movieTitle, movieYear, starName)Movies(title, year, length, genre, studioName, producerC#)

πmovieTitle, movieYear(StarsIn) ⊆ πtitle, year(Movies)

2.5.3 Key Constraints If two tuples agree on name, then they must also agree on

address.σ MS1.name=MS2.name AND MS1.address≠MS2.address (MS1×MS2) =

2.5.4 Additional Constraint Examples Domain constraint

σgender≠’F’ AND gender≠’M’(MovieStar) =

Page 46: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

2.52.5 Constraints on RelationsConstraints on Relations

MovieExec(name, address, cert#, netWorth)Studio(name, address, presC#)

σnetWorth<10000000(Studio∞presC#=cert#MovieExec) =

πpresC#(Studio) ⊆ πcert#(σnetWorth 10000000≧ (MovieExec))

Page 47: The Relational Model of Data Prof. Yin-Fu Huang CSIE, NYUST Chapter 2.

Database Systems Yin-Fu Huang

The End.