THE RELATIONAL MODEL I
IST 210: Organization of Data



Chapter Objectives
• Learn the concept of the relational model
• Understand how relations differ from non-relational tables
• Learn basic relational terminology
• Learn the meaning and importance of keys, foreign keys, and related terminology
• Understand how foreign keys represent relationships
• Learn the purpose and use of surrogate keys
• Learn the meaning of functional dependencies
• Learn to apply a process for normalizing relations


Characteristics of a Relation
• A relation is a two-dimensional table (rows and columns) that has specific characteristics
• Columns contain data about attributes of the entity
• Each column has a unique name
• All entries in a column are of the same kind
• The order of the columns is unimportant
• Rows contain data about entity instances
• Cells of the table hold a single value
• No two rows may be identical
• The order of the rows is unimportant

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989


Presenting Relation Structure

Original table:

RELATION_NAME
Column 1  Column 2  …  Column n

STUDENT
StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

Relation representation:

RELATION_NAME(Column 1, Column 2, …, Column n)
STUDENT(StudentID, FirstName, LastName, DOB)

From now on, we will frequently use this representation for relations


A Sample Relation

EmployeeNumber  FirstName  LastName
100             Mary       Abernathy
101             Jerry      Cadley
104             Alex       Copley
107             Megan      Jackson


Relation or Non-Relation?

EmployeeNumber  Phone               LastName
100             335-6421, 454-9744  Abernathy
101             215-7789            Cadley
104             610-9850            Copley
107             299-9090            Jackson

Non-relation: a cell of the table holds multiple values (employee 100 has two phone numbers in one cell)


Relation or Non-Relation?

EmployeeNumber  Phone     LastName
100             335-6421  Abernathy
101             215-7789  Cadley
104             610-9850  Copley
100             335-6421  Abernathy
107             299-9090  Jackson

Non-relation: two rows are identical (the row for employee 100 appears twice), and no two rows of a relation may be identical
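The two non-relation cases can be checked mechanically. The sketch below is only an illustration: it assumes a table is stored as a list of Python tuples and that a multi-valued cell shows up as a list, tuple, or set. The function name `violations` is hypothetical and not part of the course material.

```python
def violations(rows):
    """Return the reasons why `rows` (a list of tuples) is not a relation."""
    problems = []
    # Rule: each cell holds a single value. Assumption: a multi-valued
    # cell is represented as a list, tuple, or set.
    for row in rows:
        if any(isinstance(cell, (list, tuple, set)) for cell in row):
            problems.append("multi-valued cell")
            break
    # Rule: no two rows may be identical. repr() makes every cell
    # hashable, including list-valued ones.
    fingerprints = [tuple(map(repr, row)) for row in rows]
    if len(set(fingerprints)) != len(fingerprints):
        problems.append("duplicate rows")
    return problems


multi_valued = [
    (100, ["335-6421", "454-9744"], "Abernathy"),
    (101, "215-7789", "Cadley"),
    (104, "610-9850", "Copley"),
    (107, "299-9090", "Jackson"),
]
duplicated = [
    (100, "335-6421", "Abernathy"),
    (101, "215-7789", "Cadley"),
    (104, "610-9850", "Copley"),
    (100, "335-6421", "Abernathy"),
    (107, "299-9090", "Jackson"),
]

print(violations(multi_valued))  # ['multi-valued cell']
print(violations(duplicated))    # ['duplicate rows']
```

An empty result means the sample passes both checks; it does not test the other characteristics, such as unique column names.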


Terminology

Synonyms…

Table     Row     Column
File      Record  Field
Relation  Tuple   Attribute


Key
• A (unique) key is one or more columns of a relation that are used to uniquely identify a row
• A composite key is a key that contains two or more attributes


Example: Key

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

What attribute(s) form a key?
• StudentID
• FirstName
• (FirstName, LastName)
• (FirstName, DOB)
• (StudentID, FirstName)
• (StudentID, FirstName, LastName, DOB)


Example: Key

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

• StudentID: yes
• FirstName: no
• (FirstName, LastName): yes (in this table), but no in general (with thousands of records, there could be students with the same first name and last name)
• (FirstName, DOB): yes (in this table), but no (if there are more records)
• (StudentID, FirstName): yes, but …
• (StudentID, FirstName, LastName, DOB): yes, but …
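The "yes (in this table)" answers can be reproduced with a small uniqueness test. This is a minimal sketch: the dates are rewritten as ISO strings for convenience, and the helper name `is_unique_in_sample` is hypothetical. As the slide warns, passing the test on four rows does not make an attribute set a key for all possible data.

```python
STUDENT_COLUMNS = ("StudentID", "FirstName", "LastName", "DOB")
STUDENT_ROWS = [
    (9123450, "John", "Smith", "1989-01-01"),
    (9123451, "John", "Adam", "1988-06-01"),
    (9123452, "Jane", "Adam", "1989-08-01"),
    (9123453, "Josh", "Cohen", "1989-08-01"),
]


def is_unique_in_sample(columns, rows, attrs):
    """True if no two rows share the same values for the given attributes."""
    positions = [columns.index(name) for name in attrs]
    projected = [tuple(row[i] for i in positions) for row in rows]
    return len(set(projected)) == len(projected)


for attrs in [
    ("StudentID",),
    ("FirstName",),
    ("FirstName", "LastName"),
    ("FirstName", "DOB"),
]:
    print(attrs, is_unique_in_sample(STUDENT_COLUMNS, STUDENT_ROWS, attrs))
```

Only FirstName fails on this sample, because "John" appears twice.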


Candidate Key
• A candidate key is called “candidate” because it is a candidate to become the primary key
• A candidate key is a special kind of key
• If a subset of a key is also a key, we don’t usually consider the larger key a candidate

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

What attribute(s) form a key?
• StudentID: yes
• FirstName: no
• (FirstName, LastName): yes (in this table), but no in general (with thousands of records, there could be students with the same first name and last name)
• (FirstName, DOB): yes (in this table), but no (if there are more records)
• (StudentID, FirstName): yes, but not a candidate key
• (StudentID, FirstName, LastName, DOB): yes, but not a candidate key
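The subset rule can be written as a filter over a list of known keys. The sketch below assumes the keys are already given (here, the three that hold in general for STUDENT) and keeps only those with no smaller key inside them. The function name `candidate_keys` is hypothetical.

```python
def candidate_keys(keys):
    """Keep only the keys that have no proper subset which is also a key."""
    key_sets = [frozenset(key) for key in keys]
    return [
        key
        for key, key_set in zip(keys, key_sets)
        # `other < key_set` is Python's proper-subset test for sets.
        if not any(other < key_set for other in key_sets)
    ]


keys = [
    ("StudentID",),
    ("StudentID", "FirstName"),
    ("StudentID", "FirstName", "LastName", "DOB"),
]
print(candidate_keys(keys))  # [('StudentID',)]
```

The two larger keys are dropped because StudentID alone already identifies a row.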


Primary Key
• A primary key is a candidate key chosen to be the main key for the relation
• A relation can only have one primary key
• Each candidate key could be chosen as a primary key, but we usually have preferences

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

Primary key: StudentID


Primary Key: Discussion

STUDENT(StudentID, FirstName, LastName, DOB, SSN)
Candidate key: SSN? Good to be a primary key?
Even if SSN is a candidate key, we still prefer choosing StudentID as the primary key, because SSN is sensitive information.

STUDENT(StudentID, FirstName, LastName, DOB, HomeAddress)
Candidate key: HomeAddress? Good to be a primary key?
Even if HomeAddress could be a candidate key, we still prefer choosing StudentID as the primary key, because
(1) HomeAddress might have duplicates
(2) HomeAddress is a string, which is hard to index and query, whereas StudentID is a numeric value


Presenting Primary Key

• Non-composite key
  RELATION_NAME(Column1, Column 2, …, Column n)
  • Student(StudentID, FirstName, LastName, DOB)
  • The underline of StudentID indicates that StudentID is the primary key of this relation

• Composite key
  RELATION_NAME(Column1, Column 2, …, Column n)
  • Student(StudentID, FirstName, LastName, DOB)
  • The underline of FirstName and LastName indicates that (FirstName, LastName) is the composite primary key of this relation


How to Choose a Primary Key?

CustomerName  HomeAddress     Email
John Smith    293 Main St     [email protected]
John Adam     10 Green Rd     [email protected]
Jane White    111 University  [email protected]
Josh Cohen    12 Beaver       [email protected]

Candidate keys: HomeAddress? Email?
Primary key?

What if none of the existing attributes is appropriate?
Answer: artificially create a new attribute


A Surrogate Key
• A surrogate key is a unique numeric value that is added to a relation to serve as the primary key
  • System generated
  • Contains no semantic meaning
• Surrogate keys are very commonly used. A surrogate key often replaces a composite primary key or a non-numeric primary key
  • (FirstName, LastName, DOB) is replaced by StudentID
  • HomeAddress is replaced by CustomerID
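As one way to picture a system-generated key, the sketch below uses Python's built-in sqlite3 module. The choice of SQLite, the table name CUSTOMER, and the omission of the Email column are assumptions made for the illustration, not part of the slides. In SQLite, an INTEGER PRIMARY KEY column that is left out of an INSERT is filled in automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# CustomerID is the surrogate key: it carries no meaning about the customer
# and SQLite assigns its value when the column is omitted from the INSERT.
conn.execute(
    "CREATE TABLE CUSTOMER ("
    " CustomerID INTEGER PRIMARY KEY,"
    " CustomerName TEXT,"
    " HomeAddress TEXT)"
)
customers = [
    ("John Smith", "293 Main St"),
    ("John Adam", "10 Green Rd"),
    ("Jane White", "111 University"),
    ("Josh Cohen", "12 Beaver"),
]
conn.executemany(
    "INSERT INTO CUSTOMER (CustomerName, HomeAddress) VALUES (?, ?)",
    customers,
)
rows = conn.execute(
    "SELECT CustomerID, CustomerName FROM CUSTOMER ORDER BY CustomerID"
).fetchall()
for row in rows:
    print(row)
```

Each customer receives a distinct integer, so CustomerID can serve as the primary key even though no original attribute was a good choice.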


Surrogate Key Examples
• Penn State database
  • StudentID
• Membership database
  • Membership ID
• Online shopping
  • Order number


Review

• Key: StudentID, (StudentID, FirstName), …
• Candidate key: StudentID
• Primary key: StudentID, written STUDENT(StudentID, FirstName, LastName, DOB) with StudentID underlined
• Surrogate key: StudentID

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989


Relationships Between Tables
• A table is related to other tables
  • Through shared columns, as seen in Chapter 1

STUDENT(StudentID, FirstName, LastName, DOB)

StudentID  FirstName  LastName  DOB
9123450    John       Smith     Jan. 1, 1989
9123451    John       Adam      Jun. 1, 1988
9123452    Jane       Adam      Aug. 1, 1989
9123453    Josh       Cohen     Aug. 1, 1989

CLUB(ClubID, ClubName, PresidentStudentID)

ClubID  ClubName  PresidentStudentID
12      Football  9123450
13      Medical   9123453
15      Dance     9123452

Primary key: StudentID is the primary key in the STUDENT table
Foreign key: PresidentStudentID is the foreign key in the CLUB table
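To see what the shared column buys us, the sketch below follows each PresidentStudentID value in CLUB back to the matching StudentID in STUDENT. It is a plain-Python illustration; storing STUDENT as a dictionary keyed by its primary key and the function name `club_presidents` are assumptions made for the example.

```python
# STUDENT, keyed by its primary key StudentID.
students = {
    9123450: ("John", "Smith"),
    9123451: ("John", "Adam"),
    9123452: ("Jane", "Adam"),
    9123453: ("Josh", "Cohen"),
}
# CLUB rows: (ClubID, ClubName, PresidentStudentID).
clubs = [
    (12, "Football", 9123450),
    (13, "Medical", 9123453),
    (15, "Dance", 9123452),
]


def club_presidents(clubs, students):
    """Pair each club name with its president's full name."""
    result = []
    for _club_id, club_name, president_id in clubs:
        # The foreign key value is looked up as a primary key in STUDENT.
        first_name, last_name = students[president_id]
        result.append((club_name, f"{first_name} {last_name}"))
    return result


print(club_presidents(clubs, students))
# [('Football', 'John Smith'), ('Medical', 'Josh Cohen'), ('Dance', 'Jane Adam')]
```

CLUB stores only the student's ID; the name is recovered from STUDENT, so it is kept in one place.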


A Foreign Key
• To preserve relationships between relations, we need to create a foreign key
• A foreign key is a primary key from one table placed into another table
  • Why?
• The key is called a foreign key in the relation that receives the key
• Presenting a foreign key
  • Attribute name in italics
  • RELATION_NAME(Column1, Column 2, …, Column n)


Foreign Key Example

PROJECT(ProjID, ProjName, MgrID)
MANAGER(MgrID, MgrName, Office)

PROJECT
ProjID  ProjName  MgrID

MANAGER
MgrID  MgrName  Office

Foreign key: MgrID in PROJECT
Primary key: MgrID in MANAGER


Foreign Key Example

DEPARTMENT(DeptID, DeptName, Location)
EMPLOYEE(EmpID, DeptID, EmpName)

DEPARTMENT
DeptID  DeptName  Location

EMPLOYEE
EmpID  DeptID  EmpName

Foreign key: DeptID in EMPLOYEE
Primary key: DeptID in DEPARTMENT


Foreign Key Example

Tables shown: STUDENT table, COURSE table, REGISTRATION table

STUDENT(StudentID, Name, Department, Email)
COURSE(CourseID, Instructor, CourseName, Location)
REGISTRATION(StudentID, CourseID)

An attribute can be both part of a primary key and a foreign key!
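A possible way to declare such a table is sketched below with Python's built-in sqlite3 module. Using SQLite, trimming STUDENT and COURSE to two columns each, the sample rows, and the helper `try_register` are all assumptions made for the illustration. In REGISTRATION, StudentID and CourseID together form the primary key, and each one is also a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only checks foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE STUDENT (StudentID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE COURSE (CourseID INTEGER PRIMARY KEY, CourseName TEXT);
    CREATE TABLE REGISTRATION (
        StudentID INTEGER REFERENCES STUDENT(StudentID),
        CourseID  INTEGER REFERENCES COURSE(CourseID),
        PRIMARY KEY (StudentID, CourseID)
    );
    """
)
# Hypothetical sample rows.
conn.execute("INSERT INTO STUDENT VALUES (1, 'Example Student')")
conn.executemany(
    "INSERT INTO COURSE VALUES (?, ?)",
    [(210, "Example Course A"), (220, "Example Course B")],
)
conn.execute("INSERT INTO REGISTRATION VALUES (1, 210)")


def try_register(student_id, course_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO REGISTRATION VALUES (?, ?)", (student_id, course_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_register(1, 210))  # repeats an existing (StudentID, CourseID) pair
print(try_register(1, 250))  # CourseID 250 is not in COURSE
```

Both attempts are rejected: the first by the composite primary key, the second by the foreign key on CourseID.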


Referential Integrity
• Every value of a foreign key must match a value of an existing primary key

StudentID  CourseID
1          210
5          210
2          210
3          210
1          220
3          220
10         230
10         250

250 does not exist in the COURSE table!
This violates referential integrity!
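The check itself is a membership test: every foreign key value must appear among the primary key values of the referenced table. The sketch below uses the REGISTRATION rows from the slide. The contents of COURSE are not shown in the slides, so the set `course_ids` is an assumption, as is the function name `dangling_rows`.

```python
# REGISTRATION rows: (StudentID, CourseID).
registration = [
    (1, 210), (5, 210), (2, 210), (3, 210),
    (1, 220), (3, 220), (10, 230), (10, 250),
]
# Assumption: COURSE contains exactly these CourseID values.
course_ids = {210, 220, 230}


def dangling_rows(rows, fk_position, referenced_keys):
    """Return the rows whose foreign key value has no matching primary key."""
    return [row for row in rows if row[fk_position] not in referenced_keys]


print(dangling_rows(registration, 1, course_ids))  # [(10, 250)]
```

An empty list would mean the foreign key column satisfies referential integrity for this data.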


Summary of Keys
• A key is one or more columns of a relation that are used to identify a row
  • Unique key
  • Single key and composite key
• A unique key can be a candidate key, and a candidate key may be chosen to be the primary key
• A surrogate key: an intentionally created attribute to serve as a primary key
• A foreign key: a link to the primary key in another table


Review Quiz
• Q1. A candidate key could be a non-unique key?
• Q2. Surrogate key values have no semantic meaning to the users?
• Q3. A surrogate key can NOT be chosen as a primary key?
• Q4. A foreign key in one table must be a primary key in another table?


Review Quiz
• Suppose we have two tables:
  • BOOK(BookID, Title, PublisherID)
  • PUBLISHER(PublisherID, Name, Location)
• Q1. Title is a key in BOOK?
• Q2. (BookID, Title) is a key?
• Q3. (BookID, Title) is a candidate key?
• Q4. PublisherID is a foreign key in PUBLISHER?


Review Quiz
• Suppose we have two tables:
  • BOOK(BookID, Title, PublisherID)
  • PUBLISHER(PublisherID, Name, Location)
• Q5. Is the following design of the primary key and foreign key correct?
  • BOOK(BookID, Title, PublishedYear, PublisherID)
  • PUBLISHER(PublisherID, Name, Location)


Review Quiz
• Suppose we have two tables:
  • BOOK(BookID, Title, PublisherID)
  • PUBLISHER(PublisherID, Name, Location)
• Q6. Is the following design of the primary key and foreign key correct?
  • BOOK(BookID, Title, PublishedYear, PublisherID)
  • PUBLISHER(PublisherID, Name, Location)


Review Quiz
• Suppose we have two tables:
  • BOOK(BookID, Title, PublisherID)
  • PUBLISHER(PublisherID, Name, Location)
• Q7. Is the following design of the primary key and foreign key correct?
  • BOOK(BookID, Title, PublishedYear, PublisherID)
  • PUBLISHER(PublisherID, Name, Location)


Reminder
• Homework 1 due tonight 11:59PM
• Homework P1 due Wed night 11:59PM