Matthew P. Johnson, OCL5, CISDD CUNY, Sept 2005

OCL4 Oracle 10g: SQL & PL/SQL, Session #1

Matthew P. Johnson

CISDD, CUNY

June, 2005

Personnel
Instructor: Matthew P. Johnson, mpjohnson-at-gmail.com
TA: Mohammed Ali, ali_855-at-yahoo.com
Admin: Dawn Kleinberger, dkleinberger-at-gc.cuny.edu

Communications
Web page: http://pages.stern.nyu.edu/~mjohnson/oracle/
  syllabus
  course policies
  reading assignments
  etc.

Acknowledgements
Thanks to Ramesh at NYU, Ullman et al., Raghu and Johannes, Dan Suciu, Arthur Keller, and David Kuijt for course materials
See the class page for other related, antecedent DBMS courses

What is a Database?
A very large, integrated collection of data
Models a real-world enterprise
  Entities: students, courses, instructors, TAs
  Relationships:
    George is currently taking OCL
    Dick is currently teaching OCL
    Condi is currently TA-ing OCL but took it last semester
Database Management System (DBMS): a large software package designed to store and manage databases

Databases are everywhere: ordering a pizza
Databases involved?
1. Pizza Hut’s DB
  stores previous orders by customer
  stores previous credit cards used
2. Credit card records
  huge databases of (attempted) purchases
  location, date, amount, parties
3. Got approved by credit-report companies
4. Phone company’s records
  Local Usage Details (“Pull his LUDs, Lenny.”)
5. Caller ID
  ensures reported address matches destination

Your wallet is full of DB records
Driver’s license
Credit cards
Medical insurance card
Social security card
Gym membership
Individual checks
Dollar bills (w/serial numbers)
Maybe even photos (ids on back)

Databases are everywhere
Q: Websites backed by DBMSs?
  retail: Amazon, etc.
    data-mining: “Page You Made”
  search engines: Google, etc.
  directories: Internic, etc.
  searchable DBs: IMDB, tvguide.com, etc.
Q: Non-web examples of DBMSs?
  airline bookings
  criminal/terrorist: TIA
  NYPD’s CompStat
    all serious crime stats by precinct
  Retailers: Wal-Mart, etc.
    when to re-order, purchase patterns, data-mining
  Genomics!

Example of a Traditional DB App
Suppose we are building a system to store information about:
  checking accounts
  savings accounts
  account holders
  state of each person’s accounts

Can we do it without a DBMS?
Sure we can! Start by storing the data in files:
  checking.txt
  savings.txt
  customers.txt
Now write C or Java programs to implement specific tasks

Doing it without a DBMS...
Transfer $100 from George’s savings to checking
Write a C program to do the following:
  Read savings.txt
  Find & update the record “George”
    balance -= 100
  Write savings.txt
  Read checking.txt
  Find & update the record “George”
    balance += 100
  Write checking.txt
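The steps above could be sketched roughly as follows. This is only an illustration, written in Python rather than the C or Java the slides mention. The one-record-per-line file format ("name balance"), the function names, and the balances are all assumptions, not part of the course materials.

```python
import os
import tempfile

def read_balances(path):
    """Read the whole file into a dict. Assumed format: one 'name balance' pair per line."""
    balances = {}
    with open(path) as f:
        for line in f:
            name, amount = line.split()
            balances[name] = int(amount)
    return balances

def write_balances(path, balances):
    """Rewrite the whole file from the dict."""
    with open(path, "w") as f:
        for name, amount in balances.items():
            f.write(f"{name} {amount}\n")

def transfer(savings_path, checking_path, name, amount):
    """Move `amount` from savings to checking, following the slide's steps in order."""
    savings = read_balances(savings_path)
    savings[name] -= amount
    write_balances(savings_path, savings)
    # Nothing protects the data if the program stops right here.
    checking = read_balances(checking_path)
    checking[name] += amount
    write_balances(checking_path, checking)

# Demo in a scratch directory (file names follow the slides, balances are made up).
workdir = tempfile.mkdtemp()
savings_path = os.path.join(workdir, "savings.txt")
checking_path = os.path.join(workdir, "checking.txt")
write_balances(savings_path, {"George": 500, "Dick": 900})
write_balances(checking_path, {"George": 200, "Dick": 100})
transfer(savings_path, checking_path, "George", 100)
```

Note that every transfer rereads and rewrites both files in full, and that the two writes are separate steps with nothing tying them together.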

Problems without a DBMS...
1. System crashes:
  Read savings.txt
  Find & update the rec “George”
  Write savings.txt
  CRASH! (after savings.txt is written, before checking.txt is updated)
  Read checking.txt
  Find & update the rec “George”
  Write checking.txt
  Q: What is the problem?
  A: George lost his $100
  Same problem even if reordered
2. Simultaneous access by many users
  George and Dick visit ATMs at the same time
  Lock checking.txt before each use: what is the problem?

Problems without a DBMS...
3. Large data sets (say 100s of GB or TBs)
  Why is this a problem?
  No indices
    Finding “George” in a huge flat file is expensive
  Modifications intractable without better data structures
    Changing “George” to “Georgie” is very expensive
    Deletions are very expensive
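To make the cost of "no indices" concrete, here is a small sketch in Python. The record list, the helper names, and the use of an in-memory dict as the "index" are assumptions for illustration only. A real DBMS index is a disk-based structure, but the contrast between scanning and direct lookup is the same idea.

```python
# Hypothetical flat file, already loaded as (name, balance) records.
records = [("Dick", 300), ("George", 500), ("Condi", 250)]

def find_by_scan(records, name):
    """Flat-file lookup: examine records one by one.
    Returns (balance, number_of_records_examined)."""
    for examined, (owner, balance) in enumerate(records, start=1):
        if owner == name:
            return balance, examined
    return None, len(records)

def build_index(records):
    """Build a name -> position map once, so later lookups skip the scan."""
    return {owner: pos for pos, (owner, _) in enumerate(records)}

def find_by_index(records, index, name):
    """Indexed lookup: jump straight to the record's position."""
    pos = index.get(name)
    return None if pos is None else records[pos][1]

index = build_index(records)
```

With three records the difference is invisible. With billions of records, the scan examines on the order of the whole file while the indexed lookup does not.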

Problems without a DBMS...
5. Security?
  File system may be insecure
  File system security may be coarse
6. Application programming interface (API)?
  suppose other apps need to access the DB
7. How to interact with other DBMSs?

General problems to solve
In building our own system, many Qs arise:
  how do we store the data? (file organization, etc.)
  how do we query the data? (write programs…)
  how do we make sure that updates don’t mess things up?
    leave the DB “consistent”
  how do we provide different views on the data?
    e.g., ATM user’s view v. bank teller’s view
  how do we deal with crashes?
Too hard! Go buy Oracle!
Q: How does a DBMS solve these problems?
A: Long story; see other courses/books

Big issue: Transaction processing
Grouping of several queries (or other database actions) into one transaction
ACID properties
  Atomicity
    all or nothing
  Consistency
    constraints on relationships
  Isolation
    concurrency control
    simulated solipsism
  Durability
    crash recovery

Atomicity & Durability
Saw how George lost $100 with makeshift software
A DBMS prevents this outcome
  xacts (transactions) are all or nothing
One idea: keep a log (history) of all actions in a set of xacts
Durability: use the log to redo or undo certain ops in crash recovery
Atomicity: don’t really commit changes until the end
  Then, all at once
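A minimal sketch of the all-or-nothing behavior, using Python's standard sqlite3 module as a stand-in for Oracle (an assumption made only so the example runs anywhere). The table layout, balances, and function names are hypothetical. The database is in memory, so this shows atomicity only, not durability.

```python
import sqlite3

def make_bank():
    """Create a tiny in-memory bank with made-up balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (owner TEXT, kind TEXT, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [("George", "savings", 500), ("George", "checking", 200)],
    )
    conn.commit()
    return conn

def transfer(conn, owner, amount, crash_midway=False):
    """Debit savings and credit checking as one transaction."""
    # `with conn:` commits if the block finishes and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE owner = ? AND kind = 'savings'",
            (amount, owner),
        )
        if crash_midway:
            raise RuntimeError("simulated crash between the two updates")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? "
            "WHERE owner = ? AND kind = 'checking'",
            (amount, owner),
        )

def balances(conn, owner):
    """Return {kind: balance} for one owner."""
    rows = conn.execute(
        "SELECT kind, balance FROM accounts WHERE owner = ?", (owner,)
    ).fetchall()
    return dict(rows)
```

If the simulated crash fires, the debit is undone along with everything else in the transaction, so George keeps his $100.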

Isolation
Concurrent execution is essential for performance
  Disk accesses are frequent and slow; don’t waste the CPU, keep other work running
Interleaving actions of different user programs can lead to inconsistency
  e.g., two programs simultaneously withdraw from the same account
DBMS ensures such problems don’t arise
  users can pretend they are using a single-user system
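The two-withdrawals problem can be reproduced without any real concurrency by writing out one bad interleaving by hand. The sketch below is illustrative only; the function names and amounts are made up.

```python
def withdraw_interleaved(balance, amount_a, amount_b):
    """Programs A and B both read the balance before either one writes."""
    read_a = balance              # A reads
    read_b = balance              # B reads the same, soon-to-be-stale value
    balance = read_a - amount_a   # A writes
    balance = read_b - amount_b   # B writes, overwriting A's update
    return balance

def withdraw_serial(balance, amount_a, amount_b):
    """What a single-user system would do: A finishes, then B runs."""
    balance = balance - amount_a
    balance = balance - amount_b
    return balance
```

Starting from 100 and withdrawing 30 and 50, the serial result is 20 but the interleaved result is 50: A's withdrawal has been lost. Isolation means the DBMS only permits interleavings whose outcome matches some serial order.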

Isolation
Contrast with a file open in two Notepads
  Strategy: ignore multiple users
  whichever saves last wins
  first save is overwritten
Contrast with a file open in two Words
  Strategy: blunt isolation
  One can edit
  To the other it’s read-only

Consistency
Each xact (on a consistent DB) must leave it in a consistent state
  can define integrity constraints
  the DBMS checks that the defined claims about the data remain true
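As a small illustration of an integrity constraint, the sketch below declares that balances may never go negative and lets the database refuse an update that would break that claim. It uses Python's sqlite3 module as a stand-in for Oracle; the table, the data, and the helper names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause is the "defined claim about the data".
conn.execute(
    "CREATE TABLE accounts ("
    " owner TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('George', 50)")
conn.commit()

def try_withdraw(conn, owner, amount):
    """Return True if the withdrawal was applied, False if the constraint rejected it."""
    try:
        with conn:  # rolls back automatically if the UPDATE fails
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (amount, owner),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balance_of(conn, owner):
    return conn.execute(
        "SELECT balance FROM accounts WHERE owner = ?", (owner,)
    ).fetchone()[0]
```

The application code never checks the balance itself; the rule lives in the schema, so every program that touches the table is held to it.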

Data Models
Every DBMS uses some data model: a collection of concepts for describing data
Schema: description of a particular set of data, using some data model
Relational data model: the most widely used (by far) data model
  Oracle, DB2, SQL Server, other SQL DBMSs
  main concept: relation ~ table of rows & columns
  a relation’s schema defines its fields

Example: university database
Conceptual schema:
  Students(ssn: string, name: string, login: string, age: int, gpa: real)
  Courses(cid: string, cname: string, credits: int)
  Enrolled(sid: string, cid: string, grade: string)
Physical schema:
  Relations stored as unordered text files
  Indices on first column of each rel
External schema (view):
  Course_info(ssn: string, name: string)
  My_courses(cname: string, grade: string)

How the programmer sees the DBMS
Start with DDL to create tables:

  CREATE TABLE Students (
    Name CHAR(30),
    SSN CHAR(9) PRIMARY KEY NOT NULL,
    Category CHAR(20)
  );

Continue with DML to populate tables:

  INSERT INTO Students
  VALUES ('Howard', '123456789', 'undergraduate');

How the programmer sees the DBMS
Tables:

Students:
  SSN          Name    Category
  123-45-6789  Howard  undergrad
  234-56-7890  Wesley  grad
  …            …

Courses:
  CID       CName
  C20.0046  Databases
  C20.0056  Advanced Software

Takes:
  SSN          CID       Semester
  123-45-6789  C20.0046  Spring, 2004
  123-45-6789  C20.0056  Spring, 2004
  234-56-7890  C20.0046  Fall, 2003
  …

Still implemented as files, but behind the scenes can be quite complex
“data independence” = separate logical view from physical implementation

Querying: Structured Query Language
Find all the students who have taken OCL2:

  SELECT SSN
  FROM Takes
  WHERE CID='OCL2';

Find all the students who took OCL2 last fall:

  SELECT SSN
  FROM Takes
  WHERE CID='OCL2' AND Semester='Fall, 2003';

Find the students’ names:

  SELECT Name
  FROM Students, Takes
  WHERE Students.SSN=Takes.SSN AND CID='OCL2' AND Semester='Fall, 2003';

Query processor does this efficiently
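The three queries can be tried end to end with a throwaway database. The sketch below runs them through Python's sqlite3 module, which is an assumption made for portability; on Oracle the same SELECT statements would be typed into SQL*Plus. The Takes rows are invented so that the queries return something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (
  Name CHAR(30),
  SSN CHAR(9) PRIMARY KEY NOT NULL,
  Category CHAR(20)
);
CREATE TABLE Takes (SSN CHAR(9), CID CHAR(10), Semester CHAR(20));

INSERT INTO Students VALUES ('Howard', '123456789', 'undergraduate');
INSERT INTO Students VALUES ('Wesley', '234567890', 'graduate');

-- Hypothetical enrollment rows for illustration.
INSERT INTO Takes VALUES ('123456789', 'OCL2', 'Fall, 2003');
INSERT INTO Takes VALUES ('234567890', 'OCL2', 'Spring, 2004');
INSERT INTO Takes VALUES ('234567890', 'OCL1', 'Fall, 2003');
""")

# Everyone who has taken OCL2.
took_ocl2 = sorted(
    row[0] for row in conn.execute("SELECT SSN FROM Takes WHERE CID='OCL2'")
)

# Everyone who took OCL2 in Fall 2003.
took_ocl2_fall = [
    row[0]
    for row in conn.execute(
        "SELECT SSN FROM Takes WHERE CID='OCL2' AND Semester='Fall, 2003'"
    )
]

# Their names, found by joining Students with Takes on SSN.
names = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM Students, Takes "
        "WHERE Students.SSN=Takes.SSN AND CID='OCL2' AND Semester='Fall, 2003'"
    )
]
```

Each query says what is wanted, not how to find it; choosing the access path is the query processor's job.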

Database Industry
Relational databases are a great success of theoretical ideas
  based on the most “theoretical” type of math there is: set theory
DBMS companies are among the largest software companies in the world
  Oracle, IBM (with DB2), Microsoft (SQL Server, Microsoft Access), Sybase
Also open source: MySQL, PostgreSQL, etc.
$20B+ industry
XML (“semi-structured data”) also important
  New lingua franca for exchanging data

Databases are used by
DB app programmers
  desktop app programmers
  web developers
Database administrators (DBAs)
  design schemas
  security/authorization
  crash recovery
  tuning
  better paid than programmers!
Everyone else (perhaps indirectly)
“You may not be interested in databases, but databases are interested in you.” - Trotsky

The Study of DBMS
Several aspects:
  Modeling and design of databases
  DBMS programming: querying and update
  DBMS implementation
This course covers the first two
Also will look at some more advanced areas
  XML, data-warehousing, regexps

Course outline
Two biggest topics:
  SQL
  PL/SQL
But also:
  Database design:
    Entity/Relationship models
    Modeling constraints
  The relational model:
    Relational algebra
    Transforming E/R models to relational schemas

Outline (Continued)
SQL views and triggers
Connecting to Oracle from programming languages
Web apps
Data warehousing
XML
May change as course progresses
  partly in response to audience

Textbook
Oracle Database 10g PL/SQL 101
  by Christopher Allen
  Hardcover: 416 pages
  Publisher: McGraw-Hill/Osborne Media
  ISBN: 0072255404
  1st edition (August 10, 2004)
Distributed in class

SQL Readings
Optional reference: Oracle PL/SQL Programming
Online (free) SQL tutorials include:
  A Gentle Introduction to SQL (http://sqlzoo.net/)
  SQL for Web Nerds (http://philip.greenspun.com/sql/)

On-going Feedback
Don’t wait until the class is over to give feedback on improving it
  too late for you then!
Send mail if you have questions or concerns
“We’re in touch, so you be in touch.”

So what is this course about, really?
A bit of everything!
Languages: SQL, XPath, XQuery
Data modeling
Some theory!
  Functional dependencies, normal forms
  e.g., how to find most efficient schema for data
Writing lots of SQL queries
Lots of coding in PL/SQL
Business DBMS examples/cases
Most importantly: how to meet real-world needs

For right now: additional written survey
Email to mpjohnson-at-gmail.com:
  name
  email
  previous cs/is/math/logic courses/background
  previous programming experience
    Perl? PHP? HTML?
  Job: programmer, DBA, etc.
  why taking class

Agenda
Last time: intro, RDBMS, ACID test
This time: E/R model
  1. Identify entity sets, relations and attributes
  2. One-one, one-many, many-many relations
  3. Simple ER diagrams to model a situation
  4. 3-way relationships; converting to binary
  5. Entities with multiple roles
  6. Subclasses
Design issues
  1. Principles of faithfulness & simplicity in ER diagrams
  2. Redundancy
  3. Whether an element should be an attribute or entity set
  4. Replacing a relationship with entity sets

DB development path
Diagram: the World, E/R design, Relational schema, Relational DB (in that order)

Entity/Relationship (E/R) Model
A popular data model, useful to database designers
Graphical representation of miniworld
Helps design the database, not implement it
E/R design is translated to a relational design
  relational design then implemented in an RDBMS
Elements of model
  Entities
  Entity Sets
  Attributes
  Relationships (!= relations!)

Elements of E/R Model: Entity Sets
Entity: like an object
  e.g. President Bush
  Particular instance of a concept
Entity set: set of one sort of entities or a concept
  e.g. World leaders
  Generally, same set of attributes
Represented by a rectangle
  Diagram: a rectangle labeled World Leader
A “good” entity set: you decide
  Common properties
  Correspond to class of physical or business objects
    (People, products, accounts, grades, etc.)

Elements of E/R Model: Attributes
Properties of entities in entity set
  Like fields in a struct
  Like columns in a table/spreadsheet
  Like data members in an object
Values in some domain (e.g., ints, strings)
Represented by ovals
  Diagram: ovals ID and Name attached to the rectangle Student
Assumed atomic
  But could have limited structure
  Ints, strings, etc.

Elements of E/R Model: Relationships
Connect two or more entity sets
  e.g. students enroll in courses
Binary relationships: connect two entity sets (most common)
Multiway relationships: connect several entity sets
Represented by diamonds
  Diagram: Students and Courses connected by the diamond Enroll

Elements of E/R Model: Relationships (cont’d)
Students Enroll in courses
Courses are Held in rooms
The E/R data model:
  Diagram: Students connected to Courses by Enroll; Courses connected to Rooms by Held
  Attributes shown: Name, ID

A little set theory
A mathematical set = a collection of members
A set is defined by its members
  “Are you in or are you out?”
  No other structure, no order, no duplicates allowed
Sets can be specified by listing:
  {1, 2, 3, …} = N
  {1, 2, George Bush} (few applications, but valid)
Or by “set-builder” notation:
  { x in N: 2 divides x } = ?
  { x in Presidents | reelected(x) } = ?
  { 2x: x in N } = ?
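Set-builder notation maps almost directly onto Python set comprehensions, which gives a quick way to experiment with the questions above. In this sketch the finite range standing in for N and the small dictionary of presidents are assumptions chosen for illustration.

```python
# Finite stand-in for the natural numbers: {1, ..., 10}.
N = set(range(1, 11))

# { x in N: 2 divides x }
evens = {x for x in N if x % 2 == 0}

# { 2x: x in N }
doubles = {2 * x for x in N}

# { x in Presidents | reelected(x) }, over a small sample of presidents.
won_second_term = {"Carter": False, "Reagan": True, "Bush 41": False, "Clinton": True}
reelected = {p for p in won_second_term if won_second_term[p]}

# Listing a member twice or in another order does not change the set.
same_set = {1, 2, 2, 3} == {3, 2, 1}
```

On the truncated range the first and third sets differ only because N was cut off at 10; over all of N = {1, 2, 3, …} both describe the even numbers.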

A little set theory
One set can be a subset of another (which is then a superset of it)
  ReelectedPresidents is a subset of Presidents
  Also, RP is a proper subset of Pres: some lost reelection
Given two sets X and Y, the cross product or Cartesian product is
  X x Y = {(x,y): x in X, y in Y}
  = the set of all ordered pairs in which the first comes from X and the second comes from Y
Important: (x,y) != {x,y}
In an ordered pair or tuple
  Order matters
  Duplicates are allowed
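These definitions can be checked mechanically. The sketch below uses Python sets and tuples; the sample sets X, Y, and the short list of presidents are assumptions for illustration.

```python
from itertools import product

presidents = {"Carter", "Reagan", "Bush 41", "Clinton"}
reelected = {"Reagan", "Clinton"}

is_subset = reelected <= presidents         # subset
is_proper_subset = reelected < presidents   # subset and not equal

X = {1, 2}
Y = {"a", "b", "c"}
# X x Y: every ordered pair with first component from X and second from Y.
cross = set(product(X, Y))

# Ordered pairs versus sets.
pair_order_matters = (1, 2) != (2, 1)
set_order_ignored = {1, 2} == {2, 1}
pair_keeps_duplicates = len((1, 1)) == 2
set_drops_duplicates = len({1, 1}) == 1
```

The cross product has |X| times |Y| members, here 2 x 3 = 6, and ("a", 1) is not among them because the first component must come from X.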

A little set theory
Mathematically, a relation(ship) between X and Y is just a subset of X x Y
  = all those pairs (x,y) s.t. x is related to y
Example: owner-of O on People, Cats
  O(MPJ, Gödel) holds
The equals relation E on N, N:
  E(3,3) holds because 3 = 3
  E(3,4) does not hold
  E is still a set: E = {(1,1), (2,2), (3,3), …}
Father-of relation F on People, People:
  F(GHWB, GWB) holds
  F(GWB, GHWB) does not hold
  Relations aren’t necessarily symmetric
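Since a relation is just a set of pairs, "R(x, y) holds" is nothing more than a membership test. A brief sketch, assuming a finite slice of N and hypothetical helper names:

```python
from itertools import product

N = range(1, 6)  # finite stand-in for the natural numbers

# The equals relation E on N x N, as a set of pairs.
E = {(n, n) for n in N}

# The father-of relation F, restricted to the one pair from the slide.
F = {("GHWB", "GWB")}

def holds(relation, x, y):
    """R(x, y) holds exactly when the pair (x, y) is a member of R."""
    return (x, y) in relation

def is_symmetric(relation):
    """True if (y, x) is in the relation whenever (x, y) is."""
    return all((y, x) in relation for (x, y) in relation)

# Any relation on N, N is a subset of the cross product N x N.
E_is_subset_of_cross = E <= set(product(N, N))
```

Here E is symmetric and F is not, which matches the slide's observation.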

Multiplicity of Relation(ship)s
Diagram panels: Many-many, Many-one, One-one
Representation of relationships
  No arrow: many-to-many
  Sharp arrow: many-to-one
  Rounded arrow: “exactly one”
    “key constraint”
  One-one: [diagram not preserved]

Multiplicity of Relation(ship)s
Many-to-many:
  Diagram: Students, Enrolls, Courses
Many to one: a student lives in <= 1 residence hall
  Diagram: Student, Live, Residence hall
Many to exactly one: a student must live in a residence hall
  Diagram: Student, Live, Residence hall

Multiplicity, set-theoretically
Assume no vars below are equal
Many-one means:
  if (x1,y1) in R then (x1,y2) cannot be in R
One-many means:
  if (x1,y1) in R then (x2,y1) cannot be in R
One-one means:
  if (x1,y1) in R, then neither (x2,y1) nor (x1,y2) can be in R
Notice: one-one is stronger than many-one
  One-one implies both many-one and one-many
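The three definitions translate into short checks over a relation stored as a set of pairs. This is a sketch with hypothetical function names and sample relations, not anything prescribed by the course.

```python
def is_many_one(R):
    """No x is paired with two different y values."""
    partner = {}
    for x, y in R:
        if partner.setdefault(x, y) != y:
            return False
    return True

def is_one_many(R):
    """No y is paired with two different x values (many-one on the reversed pairs)."""
    return is_many_one({(y, x) for x, y in R})

def is_one_one(R):
    """Both conditions at once."""
    return is_many_one(R) and is_one_many(R)

# Sample relations (made up).
lives_in = {("George", "Hall A"), ("Dick", "Hall A")}
enrolls = {("George", "OCL"), ("George", "DB"), ("Dick", "OCL")}
assigned_locker = {("George", 1), ("Dick", 2)}
```

lives_in is many-one but not one-one, enrolls is neither, and assigned_locker is one-one and therefore also many-one and one-many.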

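E/R Diagram e.g.
Diagram: Students connected to Courses by Enrolls; Courses connected to TA by Assisting
  Students attributes: ID, Name
  Courses attributes: ID, Name
  TA attributes: ID, Name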

E/R Diagrams
Works if each TA is a TA of all students
  Student and TA connected only through Course
But what if students were divided among multiple TAs?
  Then a student in OCL3 would be related to only one of the TAs for OCL3: which one?
  Schema doesn’t store enough info
3-way relationship is helpful here

Multiway Relationships
Diagram: Students, Courses, and TAs all connected by the single relationship Enrolls
Enrolls entries:
  Students  Courses   TAs
  Condi     C20.0046  Donald
  George    C20.0046  Dick
  Alberto   C20.0046  Colin
  …         …         …
NB: Enrolls determines TA:
  (student, course) determines at most one TA

Converting multiway relships to binary
Some models limit relationships to binary
Multiway relationship: equivalent collection of binary, many-to-one relationships
Replace relationship with connecting entity set
  Diagram: Enrolls becomes an entity set, connected to Students by Student-of, to Courses by Course-of, and to TAs by TA-of
NB: Enrolls has no attributes!
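One way to see that the conversion loses nothing is to carry it out on a few rows. In the sketch below each row of the 3-way Enrolls relationship becomes one member of a connecting entity set, identified by a made-up integer id (an assumption; the slides do not say how such entities are identified), and three many-one mappings point from it to the original entity sets.

```python
# Rows of the 3-way relationship: (student, course, TA).
enrolls = [
    ("Condi", "C20.0046", "Donald"),
    ("George", "C20.0046", "Dick"),
    ("Alberto", "C20.0046", "Colin"),
]

# Connecting entity set: one entity per row, keyed by a surrogate id.
enrollment_ids = list(range(len(enrolls)))

# Three binary, many-one relationships out of the connecting entity set.
student_of = {i: enrolls[i][0] for i in enrollment_ids}
course_of = {i: enrolls[i][1] for i in enrollment_ids}
ta_of = {i: enrolls[i][2] for i in enrollment_ids}

def rebuild():
    """Recover the original 3-way rows from the three binary relationships."""
    return [(student_of[i], course_of[i], ta_of[i]) for i in enrollment_ids]
```

Each mapping is a dict, so every enrollment entity points at exactly one student, one course, and one TA, which is the many-one property the slide asks for.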

Second multiway e.g.: renting movies
Scenario: a Customer Rents a Movie from a VideoStore on a certain date
  Diagram: Customer, Movie, and VideoStore connected by Rental, which carries the attribute date
Q: Which entity does date belong to?
A: To the fact of the renting
Relationships can have attributes
  always (implicitly) many-one

Second multiway e.g.: renting movies
But they don’t have to
Relationship attributes can be replaced with (trivial) new entities
  Diagram: Customer, Movie, VideoStore, and a new entity set Date (with attribute date) connected by Rental

Second multiway e.g.: renting movies
Where can we draw arrows?
  (store, video, customer) determines date ?
  (store, video, date) determines customer ?
  (store, date, customer) determines video ?
  (video, date, customer) determines store ?
Diagram: Customer, Movie, and VideoStore connected by Rental, which carries the attribute date

Q: (Why) does it matter?
Round arrow benefit:
  Obvious: one item takes less space than many
  Less obvious: easier to access one item x than a set of one item {x}
  In programming: an int v. a linked list with just one int
Regular arrow benefit:
  Mapping to a set of either one elm or none seems bad
  But not implemented this way
  Always one element, but that value may be NULL
Lesson: it pays to identify your relship’s multiplicity

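Second multiway e.g.: renting movies
Convert to binary?
Before: Customer, Movie, and VideoStore connected by the relationship Rental, which carries the attribute date
After: Rental becomes an entity set with attribute date, connected
  to Customer by BuyerOf
  to Store by StoreOf
  to Movie by MovieOf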

Roles in relationships
Entity set appears more than once in a relship
  Generally distinct entities
  Each appearance is in a different role
  Edges labeled by roles
Diagram: Course connected to the relationship Pre-req by two edges, labeled Prereq and Successor

  Course (Pre-req)  Course (Successor)
  Accounting        Finance-I
  Finance-I         Derivatives
  Finance-I         Finance-II
  Calculus          Derivatives

Subclasses in the E/R model
Some entities are special cases of others
Conversely: some are generalizations
  Humans are specialized mammals
  Grad students are specialized students
    And, in turn, specialized mammals
Subclass: A isa B
  Represented by a triangle
  Always one-to-one, though arrows omitted
  Root is more general
Multiple inheritance is allowed!
  A single entity may consist of all components (sets of fields) in arbitrary ESs and their ancestors

Subclasses
Diagram: Cartoons isa Movies; Murder-Mysteries isa Movies
  Labels on Movies: stars, length, title, year
  Label on Cartoons: Voices
  Label on Murder-Mysteries: Weapon
  Example entities shown: Lion King, Roger Rabbit, TX Chainsaw Massacre
Component
  “Lion King”: atts of Movies; relship Voices
  “Roger Rabbit”: atts of Movies; relship Voices; att weapon

E/R inheritance v. OO inheritance
In an OOP class hierarchy, children also inherit “attributes” from parents
  But an object is an instance of one class
In E/R, an entity may be composed of components from multiple, not-directly-related ESs
  Roger Rabbit is composed of components from Cartoons, Murder Mysteries, and Movies
  We could create a Cartoon Murder Mysteries ES if there were any atts specific to them
So the real difference: in E/R, can have implicit multiple inheritance between any set of IS-A-connected nodes (sharing a root)

Next
Lab 1 online

New topic: Design Principles
Faithfulness
Avoiding redundancy
Simplicity
Choice of relationships
Picking elements

Faithfulness
Is the relationship many-many or many-one?
Are the attributes appropriate?
Are the relationships applicable to the entities?
Examples
  Courses & instructors
    maybe many-one, maybe many-many
  Bosses & subordinates
    maybe one-many, maybe many-many

Simplicity
Einstein: Theories as simple as possible, but not simpler.
Use as few elements as possible
  Minimum required relations
  No unnecessary attributes (will you be using this attribute?)
  Eliminate “spinning wheels”
Example: how can we simplify this?
  Diagram: Movies connected to Ownings by Owned-by; Ownings connected to Studios by Owns

Avoiding redundancy

Say everything exactly once
  Minimize database storage requirements
  More important: prevent possible update errors
    simplest (but not only) e.g.: modify data in one place but not the other (more later)
Example: spot the redundancy

[E/R diagram: Studios (Name, Address) -- Own -- Movies (Name, Length, StudioName)]

Avoiding redundancy (answer)

Redundancy: Movies “knows” the studio two ways, once through the Own relationship and once through its StudioName attribute

[E/R diagram: Studios (Name, Address, Phone) -- Own -- Movies (Name, Length, StudioName)]

Spot more redundancy

[E/R diagram: Movies (Name, Length, StudioName, SAddress, SPhone)]

Name            Length  Studio   SAddress  SPhone
Pulp Fiction    …       Miramax  NYC       212-…
Sylvia          …       Miramax  NYC       212-…
Jay & Sil. Bob  …       Miramax  NYC       212-…

Different redundancy: studio info listed for every movie!
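One way to see why the repeated studio columns matter is to store the studio once and let each movie refer to it. The sketch below uses Python's built-in sqlite3 module rather than Oracle; the phone number and the new address are made-up placeholders (the slide elides the real values), and movie lengths are left NULL for the same reason.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Studios (
        name    TEXT PRIMARY KEY,
        address TEXT,
        phone   TEXT
    );
    CREATE TABLE Movies (
        name        TEXT PRIMARY KEY,
        length      INTEGER,                      -- elided on the slide, left NULL
        studio_name TEXT REFERENCES Studios(name)
    );
    -- Placeholder phone number: the slide only shows '212-...'.
    INSERT INTO Studios VALUES ('Miramax', 'NYC', '212-555-0100');
    INSERT INTO Movies VALUES ('Pulp Fiction',   NULL, 'Miramax');
    INSERT INTO Movies VALUES ('Sylvia',         NULL, 'Miramax');
    INSERT INTO Movies VALUES ('Jay & Sil. Bob', NULL, 'Miramax');
""")

# The studio moves (hypothetical new address): one UPDATE touches one row,
# so no movie row can be left holding a stale address.
conn.execute("UPDATE Studios SET address = 'Los Angeles' WHERE name = 'Miramax'")

rows = conn.execute("""
    SELECT m.name, s.address
    FROM Movies m JOIN Studios s ON s.name = m.studio_name
    ORDER BY m.name
""").fetchall()
for movie, address in rows:
    print(movie, address)
```

With the single-table design, the same change would need one update per movie row, and missing any of them would leave conflicting addresses for the same studio.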

Don’t add relships that are implied

Suppose each course again has <=1 TA
Q: Is the following good design?

[E/R diagram: entity sets Students, Courses, TAs; relationships Enrolls, TA-of, Assist]

A: If TAs other than the course’s TA can help students, then yes;
   if not, then no: we can connect Students and TAs by going through Courses; redundant!

Correct E/R models may contain loops

Person plays multiple roles:
  employee of company
  buyer of product

[E/R diagram: Person (name, ssn, address) -- buys -- Product (name, category, price); Company (name, stockprice) -- makes -- Product; Company -- employs -- Person]

More design: attribute → entity

[E/R diagram: Students -- Enrolls -- Courses (Course-ID, CName, TA-Name, TA-ID, TA-Email)]

Q: What’s wrong with this design?
A:
  Repeating TA names & IDs: redundant
  If a TA is not TAing any course now, we lose the TA’s data!
  TAs should get their own ES

Opposite problem: entity → attribute

Some E/Rs are improved by removing entities
Can convert entity E into attributes of F if:
  1. R: F → E is many-one
       one-one counts, because it is a special case
  2. Attributes of E are independent of each other
       knowing one att val doesn’t tell us another att val
Then remove E and add all attributes of E to F

Entity → attribute

Before: [E/R diagram: Students -- Enrolls -- Courses (Course-ID, CName, Room) -- Assists -- TA (TA-Name)]
After:  [E/R diagram: Students -- Enrolls -- Courses (Course-ID, CName, Room, TA-Name)]

Convert TA entity again?

[E/R diagram: Students -- Enrolls -- Courses (Course-ID, CName, Room) -- Assists -- TA (TA-Name)]

No! Multiple TAs allowed
  Violates condition (1)
  Redundant course data:

CName  CID  Room  TA-Name
DBMS   46   123   Howard
DBMS   46   123   Wesley

Convert TA entity again?

[E/R diagram: Students -- Enrolls -- Courses (Course-ID, CName, Room) -- Assists -- TA (TA-Name, TA-ID, TA-Favorite-Color)]

No! TA has dependent fields
  Violates condition (2)
  How can we tell? Redundant TA data:

CName  TA-Name  TA-ID  TA-Color
DBMS   Ralph    678    Green
C++    Ralph    678    Green

Entity or attributes?

Should student address be an entity or an attribute?
If a student may have multiple addresses, it must be an entity
  campus address, permanent address
  attributes cannot be set-valued
If we need to examine the structure of an address, it must be an entity
  find all students from NYS but not NYC
  If attribute, then it’s probably a simple string: no structure!
NB: this choice is a microcosm of the entire miniworld
  (much) power of a DB comes from the structure imposed on the data

Larger example DB design

Application: library database. Authors have written books about various subjects; different libraries in the system may carry these books.
Entities (with attributes in parentheses):
  Authors (ssn, name, phone, birthdate)
  Books (ISBN, title)
  Subjects (sname, sid)
  Libraries (lname)
Relations [associating entities in square brackets]:
  Wrote-on [Authors, Subjects]
  Cover [Libraries, Subjects]
  On [Books, Subjects]

E/R of DB design

[E/R diagram: Author (ssn, Name, phone, birthdate) -- wrote-on -- Subject (SName); Library (LName) -- Carries -- Subject; Book (ISBN, Title) -- On -- Subject]

Poor initial design

First design is a poor model of this system
Some info not captured:
  How many copies does a lib. have of a given book?
  What edition of a book does the library have?
Design problems:
  no direct relship associating authors and books
  no direct relship associating libraries and books
Common queries complex, difficult, or impossible:
  What libraries carry books by a given author?
  What books has a given author written?
  Who is the author of a given book?

Larger example DB design 2

Application: library database as before
Entities (with attributes in parentheses):
  Authors (ssn, name, phone, birthdate)
  Books (ISBN, title)
  Subjects (sname, sid)
  Libraries (lname)
Relations [associating entities in square brackets] (attributes in parentheses):
  Wrote [Authors, Books]
  Carries [Libraries, Books] (quantity, edition)
  On [Books, Subjects]

E/R of improved DB design

Rule of thumb: often queried together → make closely connected

[E/R diagram: Author (ssn, Name, phone, birthdate) -- wrote -- Book (ISBN, Title); Library (LName) -- Carries (Edition, Quantity) -- Book; Book -- On -- Subject (SName)]

Agenda

Before: E/R models, design & redundancy
Constraints
  Identifying & specifying key attributes of an entity set
  Recognizing other types of single-valued constraints
  Representing referential integrity constraints
  Identifying & representing general constraints
Weak entity sets

Next topic: Constraints

Review: programmer-defined rules stating what should always be true about consistent databases
Restrictions on data:
  Keys (e.g. SSNs uniquely identify people)
  Single-value constraints (e.g. everyone has 1 father)
  Referential integrity (e.g. a person’s record refers to a father → that father must exist)
  Domain constraints (e.g. gender in M/F, age in 0..150)
  General constraints (e.g. no more than 10 customers per sales rep)
Can’t infer constraints from data
  may hold “accidentally”
  they are a part of the schema
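As a rough illustration of how several of these constraint kinds look once declared in a schema, here is a minimal sketch using Python's built-in sqlite3 module (not Oracle). The Person table, its column names, and the sample rows are assumptions made for the example; only the constraint kinds come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Person (
        ssn        TEXT PRIMARY KEY,                       -- key
        name       TEXT NOT NULL,
        gender     TEXT CHECK (gender IN ('M', 'F')),      -- domain constraint
        age        INTEGER CHECK (age BETWEEN 0 AND 150),  -- domain constraint
        father_ssn TEXT REFERENCES Person(ssn)             -- single-valued + referential integrity
    );
""")

def try_insert(row):
    """Return True if the row is accepted, False if some constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Person VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok_father  = try_insert(("111-11-1111", "Abe", "M", 70, None))
ok_child   = try_insert(("222-22-2222", "Bob", "M", 40, "111-11-1111"))
dup_key    = try_insert(("222-22-2222", "Cal", "M", 30, None))           # key violated
bad_age    = try_insert(("333-33-3333", "Dee", "F", 200, None))          # domain violated
bad_father = try_insert(("444-44-4444", "Eve", "F", 20, "999-99-9999"))  # dangling reference
print(ok_father, ok_child, dup_key, bad_age, bad_father)
```

The point from the slide still holds: the database can only enforce what the schema declares; it cannot discover these rules by looking at rows that happen to satisfy them.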

E/R keys

Uniquely identifies entity in ES
  Attribute or set of attributes
  Two entities cannot agree on all key attributes
  These attributes determine all others
Every ES should have a key
  possibly including all attributes
Primary key attributes underlined
More than one possible key:
  Candidate keys, primary key
Practical tip: create intentional key attribute
  E.g. SSN, course-id, employee-id, etc.
  SSN likely shorter than (name, address)
  Prevents quasi-redundancy

[E/R diagram: Person (ssn, name, address)]

Single-valued constraints

“At most one” value
  sharp arrows
E.g. attributes: could be null or one
Many-one relationships: the “one” part is single-valued
Can think of key atts as (non-null) single-valued

[E/R diagram: Course -- Assists -- TA]

Referential integrity

“Exactly one value”
  NOT NULL attributes
  Relationships
Non-null value refers to an entity that exists
  Refer to entity with foreign key
  HTML analogy: no broken links
  Programming analogy: no dangling pointers
Ways of handling deletion:
  Prevent deletion as long as a referrer exists
  Enforce deletion of all referrers

[E/R diagram: Course -- Taught -- Instructor]
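The two deletion policies correspond to what SQL calls ON DELETE RESTRICT and ON DELETE CASCADE. The following is a small sketch with Python's sqlite3 module, not Oracle syntax; the table and column names are assumptions, and the course and instructor values are borrowed from the insertion example in these slides.

```python
import sqlite3

def build(on_delete):
    """Create a tiny Instructor/Course database with the given deletion policy."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE Instructor (name TEXT PRIMARY KEY);
        CREATE TABLE Course (
            course_id  TEXT PRIMARY KEY,
            instructor TEXT NOT NULL
                       REFERENCES Instructor(name) ON DELETE {on_delete}
        );
        INSERT INTO Instructor VALUES ('MPJ');
        INSERT INTO Course VALUES ('OCL3', 'MPJ');
    """)
    return conn

def delete_instructor(conn):
    """Try to delete the instructor; return True if the delete went through."""
    try:
        with conn:
            conn.execute("DELETE FROM Instructor WHERE name = 'MPJ'")
        return True
    except sqlite3.IntegrityError:
        return False

def course_count(conn):
    return conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]

restrict = build("RESTRICT")  # policy 1: prevent deletion while a referrer exists
cascade = build("CASCADE")    # policy 2: delete all referrers along with the target

restrict_deleted = delete_instructor(restrict)
cascade_deleted = delete_instructor(cascade)
print(restrict_deleted, course_count(restrict))
print(cascade_deleted, course_count(cascade))
```

Either policy keeps the "no dangling pointers" guarantee; they differ only in whether the referenced row or the referring rows win.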

Referential integrity – E/R e.g.

[E/R diagram: Students -- Enrolls -- Courses -- Taught -- Instructor]

Insertion: must refer to existing entity
Suppose we need to add
  course: “OCL3”
  instructor: MPJ
Q: Which order?
Q: What if relship were exactly-exactly, say, M(Hs,Ws)?
  i.e., referential integrity in both directions?
A: Put both inserts in one xact (later)
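The "both inserts in one xact" answer can be sketched with deferred foreign keys, which are checked at commit time rather than after each statement. This example uses Python's sqlite3 module and SQLite's DEFERRABLE INITIALLY DEFERRED clause; the Husbands/Wives table layout and the sample names are assumptions for illustration, and Oracle's own syntax for deferred constraints is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Husbands (
        name TEXT PRIMARY KEY,
        wife TEXT NOT NULL UNIQUE
             REFERENCES Wives(name) DEFERRABLE INITIALLY DEFERRED
    );
    CREATE TABLE Wives (
        name    TEXT PRIMARY KEY,
        husband TEXT NOT NULL UNIQUE
                REFERENCES Husbands(name) DEFERRABLE INITIALLY DEFERRED
    );
""")

def marry(husband, wife, insert_wife=True):
    """Insert the rows in one transaction; return True if the commit succeeds."""
    try:
        conn.execute("INSERT INTO Husbands VALUES (?, ?)", (husband, wife))
        if insert_wife:
            conn.execute("INSERT INTO Wives VALUES (?, ?)", (wife, husband))
        conn.commit()  # deferred foreign keys are checked here
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # undo the half-finished pair
        return False

both_rows = marry("Homer", "Marge")                   # each row's reference is satisfied at commit
one_row = marry("Ned", "Maude", insert_wife=False)    # husband refers to a wife who never arrives
print(both_rows, one_row)
```

Neither insert order works if the checks run after every statement, which is why the two inserts have to be treated as a single unit.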

Other kinds of constraints

Domain constraints
  E.g. date: must be after 1900
  Enumerated type: grades A through F, no E
  No specific E/R notation: mention with attribute or relationship
General constraints:
  A class may have no more than 100 students; a student may not have more than 6 courses:

[E/R diagram: Students -- Enroll -- Courses, with edges labeled <=6 and <=100]

Next topic: Weak entity sets

Definition:
  Some or all key attributes belong to another ES
Why:
  An entity set is part of a hierarchy (not ISA)
  Connecting entity sets
The key consists of
  0, 1 or more of its own attributes
  Key attributes of entity sets from supporting relationships

Conditions of supporting relationships

Supporting relationship R: E → F
  R is many-one (E-F) (or one-one)
  R is binary
  Referential integrity from E to F
    a rounded arrow
  Those atts supplied to E are the key attributes of F
  F itself may be weak
    Another entity set G, and so on recursively

[E/R diagram: E (A1) -- R -- F (A2)]

Requirements for weak entity sets

For several supporting relships from E to F
  Keys of each F role appear as foreign key of E
Other many-one relationships
  Not necessarily supporting

[E/R diagram labels: Trades; roles Buyer, Seller; People; Stores; attributes A1, A2, A3, Date]

Weak entity sets

Example: hierarchy (species & genus)
Idea: species name unique per genus only

[E/R diagram: Species (name) -- Belongs-to -- Genus (name)]

Weak entity sets

The connecting entity set in the video store e.g. was a weak entity set
  Key: date, MID, SID, CID

[E/R diagram: Rental (date) -- MovieOf -- Product (MID); Rental -- StoreOf -- Store (SID); Rental -- BuyerOf -- Customer (CID)]

E/R design summary

Subject/design choices:
  Should a concept be modeled as an ES or an att?
  Should a concept be modeled as an ES or a relship?
  Identifying relationships: binary or multiway?
Constraints in the ER Model:
  Important in determining the best design
  Much data semantics can (and should) be captured
Normalization improves further (later)

Review: E/R example

Exercise: email addresses & logins
  address = username @ host
    mpjohnson @ gmail.com
  Password file stores logins, not full addresses
Draw E/R diagram with weak entity set Users supported by entity set Hosts
Could we design this differently? Why/why not?

Agenda

Before: finished E/R models per se
Now:
  Intro to relational model
  Converting ER diagrams to relations
  Functional dependencies
    Keys and superkeys in terms of FDs
    Finding keys for relations
    Rules of FDs
  Normalization

Next topic: the Relational Data Model

[Diagram: the World → E/R design → Relational schema → Relational DB]

Next topic: the Relational Data Model

Database Model (E/R, other): diagrams (E/R)
Relational Schema: tables
  column names: attributes
  rows: tuples
Physical storage: complex file organization and index structures

Relations as tables

Table/relation: Product

Name         Price    Category     Manufacturer    (attribute names)
gizmo        $19.99   gadgets      GizmoWorks
Power gizmo  $29.99   gadgets      GizmoWorks
SingleTouch  $149.99  photography  Canon
MultiTouch   $203.99  household    Hitachi

Each line below the attribute names is a tuple/row/record/entity

Relational terminology

A relation is composed of tuples
Tuples are composed of attribute values
  Attributes have atomic types
Relation schema: relation name + attribute names + attribute types
Relation instance: set of tuples
  order doesn’t matter
Database schema: set of relation schemas
Database instance: relation instance for every relation in the schema

Relations as sets

Remember: a math relation is a subset of the cross-product of the attribute value sets
  R subset-of S x T
  Product subset-of Name x Price x Cat x Mft
One member of the Product relation:
  (gizmo, $19.99, gadgets, GizmoWorks) in Product
DB relation instance = math relation
Q: If relations are sets, why call them “instances”?
A: R is a member of the powerset P(S x T)
  powerset = set of all subsets
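To make the subset and powerset claims concrete, here is a small sketch in plain Python. The value sets are deliberately tiny, hypothetical domains built from the Product rows shown earlier; real attribute domains are of course far larger.

```python
from itertools import product

# Tiny, hypothetical attribute value sets.
names = {"gizmo", "Power gizmo"}
prices = {"$19.99", "$29.99"}
categories = {"gadgets"}
manufacturers = {"GizmoWorks"}

# Cross-product Name x Price x Cat x Mft: every conceivable tuple.
cross = set(product(names, prices, categories, manufacturers))

# One particular relation instance: just the tuples that are actually true.
product_rel = {
    ("gizmo", "$19.99", "gadgets", "GizmoWorks"),
    ("Power gizmo", "$29.99", "gadgets", "GizmoWorks"),
}

is_subset = product_rel <= cross     # a relation is a subset of the cross-product
num_instances = 2 ** len(cross)      # size of the powerset P(Name x Price x Cat x Mft)
print(len(cross), is_subset, num_instances)
```

Each of those possible subsets is one candidate instance of the same relation schema, which is why a particular table state is called an instance.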

More on tuples

Formally, a tuple can also be a mapping from attribute names to (correctly typed) values:
  name → gizmo
  price → $19.99
  category → gadgets
  manuf. → GizmoWorks
NB: an ordered tuple is equiv to a mapping
  Both ways supported in SQL
Sometimes we refer to a tuple by itself (note order of attributes):
  (gizmo, $19.99, gadgets, GizmoWorks)
  or Product(gizmo, $19.99, gadgets, GizmoWorks)
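The "both ways supported in SQL" remark can be illustrated with the two forms of INSERT: positional values, and values paired with named columns in any order. This is a sketch using Python's sqlite3 module; the column types are assumptions, and the price is stored as a number without the dollar sign.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (name TEXT, price REAL, category TEXT, manufacturer TEXT)"
)

# Ordered tuple: values are matched to attributes by position.
conn.execute("INSERT INTO Product VALUES ('gizmo', 19.99, 'gadgets', 'GizmoWorks')")

# Mapping: values are matched to attributes by name, so the order is free.
conn.execute(
    "INSERT INTO Product (manufacturer, category, price, name) "
    "VALUES ('GizmoWorks', 'gadgets', 19.99, 'gizmo')"
)

rows = conn.execute(
    "SELECT name, price, category, manufacturer FROM Product"
).fetchall()
print(rows)
```

Both statements store the same tuple. The table here has no key, so it keeps both copies, which is a reminder that SQL tables behave as bags unless a key says otherwise.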

Updates/modifications

The database maintains a current database state
Modifications of data:
  add a tuple
  delete a tuple
  update an attribute value in a tuple
DB relation instance = math relation
Idea: we saw a partic. Product DB instance
  add, delete rows → different DB rel. instances
  technically, different math relations
  to the DBMS, still the same relation/table
Modifications to the data are frequent
Updates to the schema are rare, expensive (why?)

E/R models to relations

Recall justification:
  design is easier in E/R
  implementation is easier/faster in R
Parallel to program compilation:
  design is easier in C/Java/whatever
  implementation is easier/faster in machine/byte code
Strategy:
  1. apply semi-mechanical conversion rules
  2. improve by combining some relations
  3. improve by normalization
       involves finding functional dependencies

E/R conversion rules

Relationship → relation
  attributes: keys of entity-sets/roles
  key: depends on multiplicity
Entity set → relation
  attributes: attributes of entity set
  key: key of ES
NB: mapping of types is not one-one
  We’ll see: mapping of tokens is not one-one
Special treatment:
  Weak entity sets
  Isa relations & subclasses

Entity Sets

Entity set Students

[E/R diagram: Students (ssn, name, address)]

Rel: Students

SSN           Name    Address
111-222-3333  Howard  Park Avenue
444-555-6666  John    North Carolina

Entity Sets

[E/R diagram: Course (CourseID, CourseName)]

Binary many-to-many relationships

Key: keys of both entities
  Why we learned to recognize keys

[E/R diagram: Students (ssn, S_Name, S_addr) -- Enrolls -- Course (CourseID, Course-Name)]

Relation: Enrolls

ssn           CourseID
111-222-3333  C20.0046
111-222-3333  C20.0056
444-555-6666  C30.0046
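A possible rendering of this conversion as tables, sketched with Python's sqlite3 module: the Enrolls relation carries only the two entity keys, and its own key is the pair. Column names are assumptions; student rows reuse the Students example, and course names are left NULL because this slide does not give them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE Course (course_id TEXT PRIMARY KEY, course_name TEXT);
    CREATE TABLE Enrolls (
        ssn       TEXT REFERENCES Students(ssn),
        course_id TEXT REFERENCES Course(course_id),
        PRIMARY KEY (ssn, course_id)   -- many-many: key = keys of both entity sets
    );
    INSERT INTO Students VALUES ('111-222-3333', 'Howard', 'Park Avenue');
    INSERT INTO Students VALUES ('444-555-6666', 'John', 'North Carolina');
    INSERT INTO Course VALUES ('C20.0046', NULL);
    INSERT INTO Course VALUES ('C20.0056', NULL);
    INSERT INTO Course VALUES ('C30.0046', NULL);
    INSERT INTO Enrolls VALUES ('111-222-3333', 'C20.0046');
    INSERT INTO Enrolls VALUES ('111-222-3333', 'C20.0056');
    INSERT INTO Enrolls VALUES ('444-555-6666', 'C30.0046');
""")

def enroll(ssn, course_id):
    """Return True if the enrollment is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO Enrolls VALUES (?, ?)", (ssn, course_id))
        return True
    except sqlite3.IntegrityError:
        return False

new_pair = enroll("444-555-6666", "C20.0046")   # same course, different student: fine
duplicate = enroll("111-222-3333", "C20.0046")  # same (ssn, course) pair again: rejected
print(new_pair, duplicate)
```

Neither ssn nor course_id alone could serve as the key, since a student takes many courses and a course has many students.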

Many-to-one relationships

Key: key of the “many” entity set

[E/R diagram: Movies (MovieID, Title, Year) -- Owns (CopyrightNo) -- Studios (StudioID, Name, Address)]

Movies
MovieID  Title       Year
M101     Mr. Ripley  1999
M202     Sylvia      2003

Studios
StudioID  Name     Address
S35       Miramax  NYC
S73       Disney   Orlando

Owns
MovieID  CopyrightNo  StudioID
M101     CN11111      S73
M202     CN22222      S35

Improving on many-one

Note rules applied:
  Movies rel.: all atts from Movies ES
  Studios rel.: all atts from Studios ES
  Owns rel.: key atts from Movies & Studios ESs
But: Owns: Movies → Studios is many-one
  for each row in Movies, there’s a(/no) row in Owns
  just add the Owns data to Movies

Many-to-one: a better design

Movies
MovieID  Title       Year
M101     Mr. Ripley  1999
M202     Sylvia      2003

Owns
MovieID  CopyrightNo  StudioID
M101     CN11111      S73
M202     CN22222      S35

Movies’
MovieID  Title                Year  StudioID  CopyrightNo
M101     Talented Mr. Ripley  1999  S73       CN11111
M202     Sylvia               2003  S35       CN22222

Q: What if a movie’s Owns row were missing?
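The folding of Owns into Movies can be sketched as a left outer join, which also suggests one answer to the question about a missing Owns row: the extra columns come out NULL. The example uses Python's sqlite3 module; the table name MoviesPrime stands in for Movies’, and the third movie is added only to show the missing-row case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (movie_id TEXT PRIMARY KEY, title TEXT, year INTEGER);
    CREATE TABLE Owns (
        movie_id     TEXT PRIMARY KEY REFERENCES Movies(movie_id),  -- key of the 'many' side
        studio_id    TEXT,
        copyright_no TEXT
    );
    INSERT INTO Movies VALUES ('M101', 'Mr. Ripley', 1999);
    INSERT INTO Movies VALUES ('M202', 'Sylvia', 2003);
    INSERT INTO Movies VALUES ('M303', 'P.D. Love', 2002);  -- deliberately has no Owns row
    INSERT INTO Owns VALUES ('M101', 'S73', 'CN11111');
    INSERT INTO Owns VALUES ('M202', 'S35', 'CN22222');

    -- Fold the many-one relationship into the relation for the 'many' side.
    CREATE TABLE MoviesPrime AS
        SELECT m.movie_id, m.title, m.year, o.studio_id, o.copyright_no
        FROM Movies m LEFT JOIN Owns o ON o.movie_id = m.movie_id;
""")

rows = conn.execute("SELECT * FROM MoviesPrime ORDER BY movie_id").fetchall()
for row in rows:
    print(row)
```

The merge is safe only because each movie has at most one Owns row; with a many-many relationship the same join would repeat the movie's own attributes once per partner.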

Many-to-many relationships again

Won’t work for many-many relationships

[E/R diagram: Movies -- acts -- Stars]

Movies
MovieID  Title       Year
M101     Mr. Ripley  1999
M202     Sylvia      2003
M303     P.D. Love   2002

Stars
StarID  Name          Address
T400    Gwyneth P.    Bev.Hills
T401    P.S. Hoffman  Hollywood
T402    Jude Law      Palm Springs

Acts
MovieID  StarID
M101     T400
M202     T400
M101     T401
M101     T402
M303     T401

Many-to-many relationships again

And here’s why: the movie data repeats once per star

MovieID  Title                Year  StarID
M101     Talented Mr. Ripley  1999  T400
M101     Talented Mr. Ripley  1999  T401
M101     Talented Mr. Ripley  1999  T402
M202     Sylvia               2003  T400
M303     Punch Drunk Love     2002  T401

Multiway relationships & roles

Different roles treated as different entity sets
Key: keys of the many entities

[E/R diagram: enrolls connects Students (SSN, Name), Courses (CourseID, Name), and TAs (TA_SSN, Name); TAs appears in two roles, tutors and graders]

Multiway relationships & roles

Students
SSN          Name
111-11-1111  George
222-22-2222  Dick

TAs
TA_SSN       Name
333-33-3333  Wesley
444-44-4444  Howard
555-55-5555  John

Courses
CourseID  Name
C20.0046  Databases
C20.0056  Software

Enrolls(S_SSN, Course_ID, Tutor_SSN, Grader_SSN)
S_SSN        CourseID  Tutor_SSN    Grader_SSN
111-11-1111  C20.0046  333-33-3333  444-44-4444
222-22-2222  C20.0046  444-44-4444  555-55-5555

Converting weak ESs – differences

Atts of Crew rel. are:
  attributes of Crew
  key attributes of supporting ESs
Supporting relships are omitted (why?)

[E/R diagram: Crew (Crew_ID) -- Unit-of -- Studio (StudioName, address)]

Crew
StudioName  Crew_ID
Miramax     C1
Disney      C1
Miramax     C2

Weak entity sets - relationships

[E/R diagram: Insurance (IName, Address) -- Subscribes -- Crew (Crew_ID) -- Unit-of -- Studio (StudioName, address)]

Insurance
IName      Address
Aetna      1250 6th Av. NY
BlueCross  1260 7th Av. NY

Weak entity sets - relationships

Non-supporting relationships for weak ESs are converted
  keys include entire weak ES key

Subscribes
StudioName  Crew_ID  Insurer
Universal   C21      Aetna
Disney      C22      BlueCross
Universal   C21      Aetna
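Here is one way the weak entity set and its non-supporting relationship might be declared, sketched with Python's sqlite3 module. Column names are assumptions, studio addresses are left NULL because the slides do not give them, the Crew rows come from the earlier Crew table, and the Subscribes rows in the calls below are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Studio (studio_name TEXT PRIMARY KEY, address TEXT);

    -- Weak ES: its key borrows the key of the supporting ES (Studio),
    -- so the supporting relationship Unit-of needs no table of its own.
    CREATE TABLE Crew (
        studio_name TEXT NOT NULL REFERENCES Studio(studio_name),
        crew_id     TEXT NOT NULL,
        PRIMARY KEY (studio_name, crew_id)
    );

    CREATE TABLE Insurance (iname TEXT PRIMARY KEY, address TEXT);

    -- Non-supporting relationship: refers to Crew by its entire composite key.
    CREATE TABLE Subscribes (
        studio_name TEXT NOT NULL,
        crew_id     TEXT NOT NULL,
        insurer     TEXT NOT NULL REFERENCES Insurance(iname),
        PRIMARY KEY (studio_name, crew_id, insurer),
        FOREIGN KEY (studio_name, crew_id) REFERENCES Crew(studio_name, crew_id)
    );

    INSERT INTO Studio VALUES ('Miramax', NULL);
    INSERT INTO Studio VALUES ('Disney', NULL);
    INSERT INTO Crew VALUES ('Miramax', 'C1');
    INSERT INTO Crew VALUES ('Disney', 'C1');
    INSERT INTO Crew VALUES ('Miramax', 'C2');
    INSERT INTO Insurance VALUES ('Aetna', '1250 6th Av. NY');
""")

def try_sql(sql, params):
    """Run one statement; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

same_crew_again = try_sql("INSERT INTO Crew VALUES (?, ?)", ("Miramax", "C1"))
good_subscription = try_sql("INSERT INTO Subscribes VALUES (?, ?, ?)", ("Disney", "C1", "Aetna"))
unknown_crew = try_sql("INSERT INTO Subscribes VALUES (?, ?, ?)", ("Disney", "C2", "Aetna"))
print(same_crew_again, good_subscription, unknown_crew)
```

Crew C1 exists under both Miramax and Disney, which is exactly why crew_id alone cannot identify a crew.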

Conversion example

Video store rental example, plus some atts

[E/R diagram: Rental (date) connects VideoStore (address), Customer (Cname), and Movie (MID, MName, year)]

Q: Conversion to relations?

Conversion example, continued

Resulting binary-relationship version

[E/R diagram: Rental (date) -- StoreOf -- Store (address); Rental -- MovieOf -- Movie (MID, MName, year); Rental -- BuyerOf -- Customer (Cname)]

Q: Conversion to relations?

Converting inheritance hierarchies

No best way
Several non-ideal methods:
  E/R-style: each ES → relation
  OO-style: each possible “object” → relation
  nulls-style: each rooted hierarchy → relation
    non-applicable fields filled in with nulls
Pros & cons for each method; there exist situations favoring each

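Converting inheritance hierarchies

[E/R diagram: Movies (title, year, length), with relationship stars; Cartoons isa Movies, with relationship Voices; Murder-Mysteries isa Movies, with attribute Weapon; callout: Lion King component]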

Inheritance: E/R-style conversion

Each ES → relation

Root entity set: Movies(title, year, length)
Title         Year  length
Star Wars     1980  120
Scream        1988  110
Roger Rabbit  1990  115
Lion King     1993  130

Subclass: MurderMysteries(title, year, murderWeapon)
Title      Year  murderWeapon
Scream     1988  Knife
R. Rabbit  1990  Knife

Subclass: Cartoons(title, year)
Title         Year
Roger Rabbit  1990
Lion King     1993

E/R-style & quasi-redundancy

Name and year of Roger Rabbit were listed in three different rows (in different tables)
  Suppose title changes (“Roger” → “Roget”)
  must change all three places
Q: Is this redundancy?
A: No!
  name and year are independent
  multiple movies may have same name
Real redundancy reqs. dependency
  two rows agree on SSN → must agree on rest
    conflicting hair colors in these rows is an error
  two rows agree on movie title → may still disagree
    conflicting years may be correct, or may not be
Better: introduce “movie-id” key att

Subclasses: object-oriented approach

Every possible “subtree” (what’s this?):

1. Movies
Title      Year  length
Star Wars  1980  120

2. Movies + Cartoons
Title      Year  length
Lion King  1993  130

3. Movies + Murder-Mysteries
Title   Year  length  Murder-Weapon
Scream  1988  110     Knife

4. Movies + Cartoons + Murder-Mysteries
Title         Year  length  Murder-Weapon
Roger Rabbit  1990  115     Knife

Subclasses: nulls approach

One relation for entire hierarchy
Any non-applicable fields are NULL
Q: How do we know if a movie is a MM?
Q: How do we know if a movie is a cartoon?

Title         Year  length  Murder-Weapon
Star Wars     1980  120     NULL
Lion King     1993  130     NULL
Scream        1988  110     Knife
Roger Rabbit  1990  115     Knife
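The first question has a direct answer in SQL, sketched below with Python's sqlite3 module: a murder mystery is a row whose Murder-Weapon is not NULL. The column names and the (title, year) key are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (
        title         TEXT,
        year          INTEGER,
        length        INTEGER,
        murder_weapon TEXT,          -- NULL when the movie is not a murder mystery
        PRIMARY KEY (title, year)
    );
    INSERT INTO Movies VALUES ('Star Wars',    1980, 120, NULL);
    INSERT INTO Movies VALUES ('Lion King',    1993, 130, NULL);
    INSERT INTO Movies VALUES ('Scream',       1988, 110, 'Knife');
    INSERT INTO Movies VALUES ('Roger Rabbit', 1990, 115, 'Knife');
""")

murder_mysteries = [
    title
    for (title,) in conn.execute(
        "SELECT title FROM Movies WHERE murder_weapon IS NOT NULL ORDER BY title"
    )
]
print(murder_mysteries)
```

The second question is harder with this table as given: Cartoons contributes no attribute of its own here, so nothing distinguishes Lion King from Star Wars. One option, offered only as a suggestion, would be an extra flag column recording cartoon membership.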

Agenda

Before: relational model
Next:
  1. Functional dependencies
       Keys and superkeys in terms of FDs
       Finding keys for relations
  2. Rules for combining FDs
And then: anomalies & normalization

Next topic: Functional dependencies

FDs are constraints
  part of the schema
  can’t tell from particular relation instances
  FD may hold for some instances “accidentally”
Finding all FDs is part of DB design
Used in normalization

Functional dependencies

Definition: if two tuples agree on the attributes
  A1, A2, …, An
then they must also agree on the attributes
  B1, B2, …, Bm
Notation: A1, A2, …, An → B1, B2, …, Bm
Read: the Ai functionally determine the Bj

Typical Examples of FDs

Product
  name → price, manufacturer
Person
  ssn → name, age
  father’s/husband’s-name → last-name
  zipcode → state
  phone → state (notwithstanding inter-state area codes)
Company
  name → stockprice, president
  symbol → name
  name → symbol

Functional dependencies
To check A → B, erase all other columns; for each pair of rows t1, t2:
  if t1, t2 agree on A1 ... Am, then t1, t2 must agree on B1 ... Bm
i.e., check if remaining relation is many-one
  no “divergences”
  i.e., if A → B is a well-defined function
  thus, functional dependency
[Diagram: columns A1 ... Am | B1 ... Bm with rows t1, t2: if t1, t2 agree on the A columns, then t1, t2 agree on the B columns]
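The check described on this slide can be sketched in Python. This is only an illustration under stated assumptions: the function name `fd_holds` and the sample rows are hypothetical and do not come from the course materials. The sketch keeps only the A and B columns of each row and looks for a "divergence", i.e. one A-value mapped to two different B-values.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on this instance (a list of dicts)."""
    seen = {}  # maps each combination of A-values to the B-values first seen with it
    for row in rows:
        a = tuple(row[attr] for attr in lhs)  # "erase" every column except lhs ...
        b = tuple(row[attr] for attr in rhs)  # ... and rhs
        if seen.setdefault(a, b) != b:
            return False  # divergence: same A-values, different B-values
    return True


# Hypothetical instance, not taken from the slides.
rows = [
    {"zipcode": "10016", "state": "NY", "phone": "212-555-0100"},
    {"zipcode": "10016", "state": "NY", "phone": "646-555-0101"},
    {"zipcode": "07030", "state": "NJ", "phone": "201-555-0102"},
]

print(fd_holds(rows, ["zipcode"], ["state"]))  # True
print(fd_holds(rows, ["state"], ["phone"]))    # False
```

A True result only says the FD is not violated by this instance; whether it is a real constraint is a schema decision.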


FDs Example
Product(name, category, color, department, price)
Consider these FDs:
  name → color
  category → department
  color, category → price
What do they say?


FDs Example
FDs are constraints:
• On some instances they hold
• On others they don’t

  name → color
  category → department
  color, category → price

Does this instance satisfy all the FDs?

name     category  color  department  price
Gizmo    Gadget    Green  Toys        49
Tweaker  Gadget    Green  Toys        99


FDs Example

  name → color
  category → department
  color, category → price

What about this one?

name     category    color  department    price
Gizmo    Gadget      Green  Toys          49
Tweaker  Gadget      Black  Toys          99
Gizmo    Stationery  Green  Office-supp.  59
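One way to answer the two questions is to apply the definition directly: compare every pair of tuples and report the pairs that agree on the left side but not on the right. The Python below is a rough sketch of that idea; the helper names `violations` and `as_rows` are made up for illustration. Under this sketch the two-row instance violates color, category → price (the two rows agree on Green and Gadget but differ on price), while the three-row instance satisfies all three FDs.

```python
from itertools import combinations

COLUMNS = ("name", "category", "color", "department", "price")

FDS = [
    (("name",), ("color",)),
    (("category",), ("department",)),
    (("color", "category"), ("price",)),
]


def violations(rows, lhs, rhs):
    """Return name pairs of rows that agree on lhs but disagree on rhs."""
    bad = []
    for t1, t2 in combinations(rows, 2):
        agree_lhs = all(t1[a] == t2[a] for a in lhs)
        agree_rhs = all(t1[b] == t2[b] for b in rhs)
        if agree_lhs and not agree_rhs:
            bad.append((t1["name"], t2["name"]))
    return bad


def as_rows(*tuples):
    """Turn plain tuples into dicts keyed by column name."""
    return [dict(zip(COLUMNS, t)) for t in tuples]


first = as_rows(
    ("Gizmo", "Gadget", "Green", "Toys", 49),
    ("Tweaker", "Gadget", "Green", "Toys", 99),
)
second = as_rows(
    ("Gizmo", "Gadget", "Green", "Toys", 49),
    ("Tweaker", "Gadget", "Black", "Toys", 99),
    ("Gizmo", "Stationery", "Green", "Office-supp.", 59),
)

for label, rows in (("first", first), ("second", second)):
    for lhs, rhs in FDS:
        print(label, lhs, "->", rhs, violations(rows, lhs, rhs))
```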


Recognizing FDs

EmpID  Name   Phone  Position
E0045  Smith  1234   Clerk
E1847  John   9876   Salesrep
E1111  Smith  9876   Salesrep
E9999  Mary   1234   Lawyer

Q: Is Position → Phone an FD here?
A: It is for this particular instance, but no, presumably not in general
Other FDs?
  EmpID → Name, Phone, Position
  but not Phone → Position
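To see which single-attribute FDs happen to hold on this particular instance, one can simply try every ordered pair of columns. The sketch below does that in Python; the helper name `holds` is hypothetical. The output lists FDs that this instance does not violate, which is not the same as FDs that belong in the schema: Position → Phone shows up even though it is presumably accidental.

```python
from itertools import permutations

COLUMNS = ("EmpID", "Name", "Phone", "Position")
ROWS = [
    ("E0045", "Smith", "1234", "Clerk"),
    ("E1847", "John", "9876", "Salesrep"),
    ("E1111", "Smith", "9876", "Salesrep"),
    ("E9999", "Mary", "1234", "Lawyer"),
]


def holds(x, y):
    """True if column x determines column y in ROWS."""
    i, j = COLUMNS.index(x), COLUMNS.index(y)
    mapping = {}  # value of x -> value of y first seen with it
    return all(mapping.setdefault(row[i], row[j]) == row[j] for row in ROWS)


found = [(x, y) for x, y in permutations(COLUMNS, 2) if holds(x, y)]
for x, y in found:
    print(f"{x} -> {y}")
# EmpID -> Name
# EmpID -> Phone
# EmpID -> Position
# Position -> Phone
```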


Keys of relations
{A1A2A3…An} is a key for relation R if
1. A1A2A3…An functionally determine all other attributes
   Usual notation: A1A2A3…An → B1B2…Bk
   rels = sets, so distinct rows can’t agree on all Ai
2. A1A2A3…An is minimal
   No proper subset of A1A2A3…An functionally determines all other attributes of R
Primary key: chosen if there are several possible keys


Keys example
Relation: Student(Name, Address, DoB, SSN, Email, Credits)
Which (/why) of the following are keys?
  SSN
  Name, Address (on reasonable assumptions)
  Name, SSN
  Email, SSN
  Email
NB: minimal != smallest


Superkeys
A set of attributes that contains a key
Satisfies first condition:
  functionally determines every other attribute in the relation
Might not satisfy the second condition: minimality
  may be possible to peel away some attributes from the superkey
Keys are superkeys
  keys are a special case of superkeys
  superkey set is superset of key set
name;ssn is a superkey but not a key


Discovering keys for relations
Relation from an entity set
  Key of relation = (minimized) key of entity set
Relation from a binary relationship
  Many-many: union of keys of both entity sets
  Many(M)-one(O): only key of M (Why?)
  One-one: key of either entity set (but not both!)
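The three cases for binary relationships can be written down as a small lookup. The Python below is only a sketch: the function `relationship_keys` and its `kind` strings are invented for illustration. It also hints at the answer to "Why?": in the many-one case each M entity is related to at most one O entity, so M's key already determines O's key, and adding O's key would make the key non-minimal.

```python
def relationship_keys(kind, key_a, key_b):
    """Candidate keys of the relation for a binary relationship between A and B.

    kind is "many-many", "many-one" (A is the many side), or "one-one".
    """
    if kind == "many-many":
        return [tuple(key_a) + tuple(key_b)]  # union of both keys
    if kind == "many-one":
        return [tuple(key_a)]                 # key of the many side only
    if kind == "one-one":
        return [tuple(key_a), tuple(key_b)]   # either key alone, never both together
    raise ValueError(f"unknown kind: {kind}")


print(relationship_keys("many-many", ["SSN"], ["CourseID"]))   # [('SSN', 'CourseID')]
print(relationship_keys("many-one", ["CourseID"], ["RoomNo"]))  # [('CourseID',)]
print(relationship_keys("one-one", ["HSSN"], ["WSSN"]))         # [('HSSN',), ('WSSN',)]
```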


Example – entity sets
Key of entity set = (minimized) key of relation
Student(Name, Address, DoB, SSN, Email, Credits)
[E/R diagram: entity set Student with attributes Name, Address, DoB, SSN, Email, Credits]


Example – many-many
Many-many key: union of both ES keys
[E/R diagram: Student (SSN, Credits), Enrolls, Course (CourseID, Name)]
Enrolls(SSN,CourseID)


Example – many-one
Key of the many ES but not of the one ES
  keys from both would be non-minimal
[E/R diagram: Course (CourseID, Name), MeetsIn, Room (RoomNo, Capacity)]
MeetsIn(CourseID,RoomNo)


Example – one-one
Keys of both ESs included in relation
Key is key of either ES (but not both!)
[E/R diagram: Husbands (SSN, Name), Married, Wives (SSN, Name)]
Married(HSSN, WSSN) with key HSSN, or
Married(HSSN, WSSN) with key WSSN


Discovering keys: multiway
Multiway relationships:
  Multiple ways – may not be obvious
  R: F,G,H → E is many-one ⇒ E’s key is included but not part of key
  Recall that relship atts are implicitly many-one
[E/R diagram: Enrolls relates Course (CourseID, Name), Student (SSN, Name), and Section (RoomNo, Capacity)]
Enrolls(CourseID,SSN,RoomNo)


Rules for FDs
Reasoning about FDs: given a set of FDs, infer other FDs – useful
  E.g. A → B, B → C ⇒ A → C
Definitions: for FD-sets S and T
  T follows from S if all relation-instances satisfying S also satisfy T.
  S and T are equivalent if the sets of relation-instances satisfying S and T are the same.
  I.e., S and T are equivalent if S follows from T, and T follows from S.


Combining FDs
If some FDs are satisfied, then others are satisfied too
If all these FDs are true:
  name → color
  category → department
  color, category → price
Then this FD also holds:
  name, category → price
Why?
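One mechanical way to see why is the standard attribute-closure computation: start from {name, category} and keep adding the right side of any FD whose left side is already covered. This algorithm is not spelled out on the slide, so treat the Python below as an assumed, minimal sketch; the function name `closure` is illustrative. Since price ends up in the closure, name, category → price follows from the three given FDs.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its whole left side is known and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


FDS = [
    (("name",), ("color",)),
    (("category",), ("department",)),
    (("color", "category"), ("price",)),
]

closed = closure({"name", "category"}, FDS)
print(sorted(closed))     # ['category', 'color', 'department', 'name', 'price']
print("price" in closed)  # True
```

By contrast, the closure of {name} alone is just {name, color}, so name → price does not follow.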


Splitting & combining FDs
Splitting rule: from
  A1A2…An → B1B2…Bm
infer each of
  A1, A2, …, An → B1
  A1, A2, …, An → B2
  . . . . .
  A1, A2, …, An → Bm
Combining rule: the same step in reverse
Q: Does it apply to the left side?
Note: doesn’t apply to the left side
[Diagram: columns A1 ... Am | B1 ... Bm with rows t1, t2]
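Both rules are purely syntactic, so they are easy to sketch in code. In the Python below an FD is represented as a pair of tuples (left side, right side); that representation and the function names `split` and `combine` are assumptions made for illustration. The example reuses the Company FD name → stockprice, president from the earlier slide.

```python
def split(lhs, rhs):
    """Splitting rule: one FD with m right-side attributes becomes m FDs."""
    return [(tuple(lhs), (b,)) for b in rhs]


def combine(fds):
    """Combining rule: FDs sharing the same left side merge into one FD."""
    merged = {}  # left side -> list of right-side attributes, in first-seen order
    for lhs, rhs in fds:
        attrs = merged.setdefault(tuple(lhs), [])
        for b in rhs:
            if b not in attrs:
                attrs.append(b)
    return [(lhs, tuple(attrs)) for lhs, attrs in merged.items()]


parts = split(("name",), ("stockprice", "president"))
print(parts)           # [(('name',), ('stockprice',)), (('name',), ('president',))]
print(combine(parts))  # [(('name',), ('stockprice', 'president'))]
```

There is deliberately no left-side counterpart: color, category → price does not give color → price, since two products of the same color in different categories may have different prices.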


Reflexive rule: trivial FDs
A1, A2, …, An → Ai with i in 1..n is a trivial FD
FD A1A2…An → B1B2…Bk may be
  Trivial: Bs are a subset of As
  Nontrivial: >=1 of the Bs is not among the As
  Completely nontrivial: none of the Bs is among the As
Trivial elimination rule:
  Eliminate common attributes from Bs, to get an equivalent completely nontrivial FD
[Diagram: columns A1 … An with rows t, t’]


Transitive rule
If
  A1, A2, …, An → B1, B2, …, Bm
and
  B1, B2, …, Bm → C1, C2, …, Cp
then
  A1, A2, …, An → C1, C2, …, Cp
[Diagram: columns A1 … An | B1 … Bm | C1 ... Cp with rows t, t’]


Example
R(A,B,C)
Each of three determines other two
Q: What are the FDs?
  Closure of singleton sets
  Closure of doubletons
Q: What are the keys?
Q: What are the minimal bases?


Examples of Keys
Product(name, price, category, color)
  name, category → price
  category → color
  Keys are: {name, category}
Enrollment(student, address, course, room, time)
  student → address
  room, time → course
  student, course → room, time
  Keys are: [in class]
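The Product answer can be checked by brute force: try attribute sets in order of increasing size and keep those whose closure is the whole relation and which contain no smaller key. The Python below is a sketch under that assumption; `closure` and `candidate_keys` are illustrative names, and the exhaustive search is only sensible for small schemas. The same function can be pointed at the Enrollment FDs as an exercise.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            is_superkey = closure(combo, fds) == set(attributes)
            # Smaller keys were found first, so this test enforces minimality.
            contains_key = any(set(k) <= set(combo) for k in keys)
            if is_superkey and not contains_key:
                keys.append(combo)
    return keys


ATTRS = ("name", "price", "category", "color")
FDS = [
    (("name", "category"), ("price",)),
    (("category",), ("color",)),
]
print(candidate_keys(ATTRS, FDS))  # [('name', 'category')]
```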


Where are we going, where have we been?
Goal: manage large amounts of data effectively
Use a DBMS
  must define a schema
DBMSs use the relational model
  But initial design is easier in E/R
Must design an E/R diagram
Must then convert it to rel. model


Where are we going, where have we been?
At this pt, often find problems – redundancy
How to fix?
  Convert the tables to a special “normal” form
How to do this?
  First step is: check which FDs there are
  The reason we looked at FDs last time
  Will have to look at all true FDs of the table
  Then we'll do decompositions


Next
Lab 2 online
Readings posted for tomorrow
Tomorrow:
  Briefly discuss normalization
  Install Oracle


E/R to relational model
[E/R diagram: courses, Depts; other labels: Computer-allocation, room, number, givenBy, name, chair]