Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

22
Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College

Transcript of Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Page 1: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Relational Model (CB Chapter 4)

CPSC 356 Database

Ellen Walker

Hiram College

Page 2: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Relational Model

• Proposed by E.F. Codd, 1970• Goals

– Data independence (physical vs. conceptual)– Normalization (avoiding redundancy)– Set-oriented data manipulation language

(relational algebra)

• Examples of Relational DBMS: – DBMS System R (IBM), Ingres (UC Berkeley)– Oracle (all versions), Access, RDB, DB2 …

Page 3: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Terminology

• A relation (instance) is a table with columns and rows.– (don’t confuse with relationships in ER models)

• An attribute is a named column of a table.– (corresponds to simple attribute in ER)

• A domain is the set of allowable values for an attribute

• A tuple is a row of a table.

Page 4: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

More Terminology

• The degree or arity of a relation is its number of attributes (columns)

• The cardinality of a relation is its number of tuples (rows)

• A relational database is a collection of normalized relations with distinct names.– We’ll get to normalization later!

Page 5: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Example Relation (Phone Book)

• Attributes: First, last, dept, email, etc.• Domain for Title: {Prof, Inst, Assoc, Asst}• Tuple: (Obie,Slotterbeck,CS,Obie,5275,Prof)• Degree = 6, Cardinality = 4

First Last Dept Email Phone Title

Obie Slotterbeck CS Obie 5275 Prof

Irina Lomonosov CS LomonosovI 5002 Inst

Ellen Walker CS WalkerEL 5250 Prof

Angela Guercio CS GuercioA 6048 Asst

Page 6: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Order Doesn’t Matter!

• This is the same phone book relation.

Email Title First Phone Last Dept

LomonosovI Inst Irina 5002 Lomonosov CS

Obie Prof Obie 5275 Slotterbeck CS

WalkerEL Prof Ellen 5250 Walker CS

GuercioA Asst Angela 5048 Guercio CS

Page 7: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Sets and Subsets

• A set is an unordered collection of unique elementsSet S = {1,2,3} “1 is an element of S”

{a,b,c} = {a,c,b}

• A subset of a set is another set whose elements all come from the original set.{a,b} is a subset of {a,c,b}

{1,2,3} is a subset of {1,2,3}

{1,2,4} is not a subset of {1,2,3}

{} (the empty set) is a subset of every set!

Page 8: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Mathematical Relations

• Cartesian product: a set of ordered pairs, where each contains one element from each original set{1,2,3} x {a, b} =

{(1,a), (1,b), (2,a), (2,b), (3,a), (3,b)}

• Relation: any subset of a Cartesian product– R1 = {(1,a),(2,b),(3,a)}– R2 = {(1,a), (1,b), (2,b), (3,b)}– R3 = { }

Page 9: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Bigger Relations

• First, take the cross product of n sets– If n = 3, get triples; if n=4, get quadruples– For arbitrary n, get “n-tuples”

• Now, take a subset of that big cross product– Several n-tuples

• Arrange them in rows, so the commas line up– Looks familiar…

• Domains of attributes are sets from the cross product.

Page 10: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Database Schemas

• A schema describes a database, but not the data

• Relation schema: set of attribute and domain name pairs

(First: char-string) (Last: char-string)(Dept: {CS,Math,Comm,Econ,Mgmt, etc.})(Phone: 4-digit-string) (email: char-string)(Title: {Inst,Asst,Assoc,Prof})

• Relational Database Schema: set of relational schemas, each with a distinct name and integrity constraints (later)

Page 11: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Properties of (Database) Relations

• Every relation has a unique name• Every attribute has a unique name • Attribute’s values are all from its domain• Because relations are sets…

– No duplicate tuples– Order of tuples doesn’t matter (theoretically)

• Every cell contains a single value (single-valued attributes)

Page 12: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Single-valued Attributes

• “Oldest-child” is an ok attribute• “Children” is not, because it is likely to be

multi-valued• Break up composite attributes into multiple

attributes– Name --> First-name, Middle-init, Last-name

• Use “relationships” in the ER sense for multivalued attributes.

Page 13: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Keys

• Superkey: set of attributes that uniquely identifies a tuple. – (The whole tuple is always a superkey - why?)

• Candidate Key: a superkey that doesn’t have a proper subset that is also a superkey

• Primary Key: the candidate key that is selected by the designer (often generated)

• Foreign Key: Set of attributes in one relation that matches the Primary key of another.

Page 14: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Properties of a Candidate Key

• Uniqueness– The set of attribute values in the candidate key

uniquely identifies a tuple

• Irreducibility– If one attribute is removed from the candidate key,

then uniqueness no longer holds

Page 15: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

NULL

• NULL is considered a member of every domain

• NULL means:– Value is currently unknown– Value is not applicable for this entry (e.g. “spouse”

of a single person)

• Don’t confuse NULL with– Zero– Blank or empty string

Page 16: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Relational Schema “Shorthand”

• List the relation name first• Attribute names in parentheses following the

relation name• Primary key attribute(s) underlined• Example

– Branch (BranchNo, street, city, postcode)– Staff (StaffNo, fName, position, sex, DOB, salary,

BranchNo)

• Arrow from foreign key to its referent

Page 17: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Integrity Constraints

• Rules that apply to all instances of any database

• If these rules are violated, the instance is not a legal database (it doesn’t “make sense”)

• Some come from business rules of enterprise– “Students can take only 1 course in 3-week term”

• Others from the schema design– “InstructorID cannot be NULL and must match the

ID of a listed faculty member”

Page 18: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Key Constraint

• Specifies which attributes compose a candidate key (or primary key) for the relation

• Example:– {first, last} and {email} are both keys of the Phone

Book relation

• Generally, only the primary key is specified• In practice, an auto-generated id number is

used as the key

Page 19: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Basic Integrity Constraints

– Entity Integrity: No attribute of a primary key can be NULL for any tuple.

• This constraint is implicit in the KEY constraint in SQL

– Referential Integrity: If a relation has a foreign key, either the foreign key is NULL, or the foreign key matches the primary key of an existing tuple in another relation.

• Attribute names do not have to be the same

Page 20: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Maintaining Referential Integrity

• Adding a new tuple– Add with NULL foreign keys – Add tuple in the other relation (with primary key matching the

foreign key) first

• Deleting a tuple– Find all tuples in other relations whose foreign keys match

this one– Either delete those tuples, or change their foreign keys to

NULL

• Modifying a tuple– If a foreign key is changed, make sure that the new value

points to an existing tuple or is NULL

Page 21: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Summary of Constraints

• Domain constraint– Each attribute’s values come from its domain

• Key constraint– Every relation has a primary key

• Entity integrity constraint– No NULL primary key attributes

• Referential integrity constraint– Foreign keys are NULL or refer to existing tuples’ primary

keys

• Semantic constraints– Any additional constraints that must be satisfied based on

the client’s needs

Page 22: Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College.

Views

• “Relations” that appear to exist, but actually combine attributes from multiple base (implemented) relations.

• When a base table changes, the view changes & vice versa

• Useful for– Security: limit which attributes a user sees– Customization– Simplification