INFO 340 Lecture 4 Relational Algebra and Calculus SQL Syntax.

Post on 11-Jan-2016

INFO 340

Lecture 4

Relational Algebra and Calculus

SQL Syntax

Relational Algebra and Calculus

• Relational Algebra is procedural, whereas Relational Calculus is declarative.

• SQL is based on Relational Calculus. You tell the server WHAT you want, not HOW to get it.

Tuple Relational Calculus

• { T | P(T) }
– T is a tuple variable
– P(T) is a formula defining T
– Result is the set of all tuples T where P(T) is true

• Give example..
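One way to read { T | P(T) } is as a set comprehension. The sketch below is only an illustration in Python, not part of the original slides: the Employee tuples and the predicate p (employees in department 1) are assumed sample data chosen for the example.

```python
from collections import namedtuple

# Hypothetical relation: each element is one tuple T.
Employee = namedtuple("Employee", ["last_name", "dept_id"])

employees = {
    Employee("Smith", 1),
    Employee("Johnson", 1),
    Employee("Miller", 2),
    Employee("Lee", 3),
}

def p(t):
    """Formula P(T): true when tuple t belongs to department 1."""
    return t.dept_id == 1

# { T | T is in employees and P(T) is true }
result = {t for t in employees if p(t)}
print(sorted(t.last_name for t in result))
```

The comprehension states which tuples qualify and says nothing about how to find them, which is the declarative flavor of the calculus.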

Domain Relational Calculus

• First Order Predicate Logic
– Looks at predicates on one side and individuals on the other.

Oh CRUD

• Create - Using INSERT

• Retrieve - Using SELECT

• Update - Using UPDATE

• Delete - Using DELETE

• How we manipulate the data. Called the Data Manipulation Language (DML).

Remember SELECT from Lab?

select whatever-attributes-you-want
from the-table-that-you-want
where x-attribute = what-you-want;

select last_name
from myTable
where first_name = 'Suzie';
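To try the query without a database server, a minimal sketch using Python's built-in sqlite3 module is shown here. The in-memory database and the two sample rows are assumptions made for illustration; only the SELECT itself comes from the slide.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("Suzie", "Smith"), ("Bob", "Jones")],  # made-up sample rows
)

# The query from the slide: which last names belong to a 'Suzie'?
rows = conn.execute(
    "SELECT last_name FROM myTable WHERE first_name = 'Suzie'"
).fetchall()
print(rows)
```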

SQL and Relational Calculus

• SQL is grounded in Relational Calculus. You tell the DBMS WHAT you want and it figures out how best to retrieve it.

• Well, that’s not entirely true.. SELECT does break Relational Theory but that’s for another day…

Two tables for today’s examples

DEPARTMENT TABLE

ID  Name
1   HR
2   Sales
3   Engineering
4   Marketing

EMPLOYEE TABLE

LastName  DeptID
Smith     1
Johnson   1
Miller    2
Lee       3

Cross Join

Cross joins are the Cartesian product of two tables. There are two ways to express cross joins.

SELECT * FROM Employee E, Department D

SELECT * FROM Employee E CROSS JOIN Department D

E.LastName  E.DeptID  D.ID  D.Name
Smith       1         1     HR
Smith       1         2     Sales
Smith       1         3     Engineering
Smith       1         4     Marketing
Johnson     1         1     HR
Johnson     1         2     Sales
Johnson     1         3     Engineering
Johnson     1         4     Marketing
Miller      2         1     HR
Miller      2         2     Sales
Miller      2         3     Engineering
Miller      2         4     Marketing
Lee         3         1     HR
Lee         3         2     Sales
Lee         3         3     Engineering
Lee         3         4     Marketing
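The cross join can be checked with a small sketch. It assumes Python's sqlite3 module and an in-memory copy of the two example tables (the course itself uses MySQL, so treat this as an approximation). It runs both spellings and counts the rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# Both spellings of the cross join from the slide.
comma_rows = conn.execute(
    "SELECT * FROM Employee E, Department D").fetchall()
cross_rows = conn.execute(
    "SELECT * FROM Employee E CROSS JOIN Department D").fetchall()

# 4 employees x 4 departments = 16 combinations, whether or not they match.
print(len(cross_rows))
print(sorted(comma_rows) == sorted(cross_rows))
```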

Inner Join

Inner joins are the most common type of join performed in SQL. There are two ways to express an inner join. Conceptually, an inner join takes the Cartesian product of the two tables, then returns only the rows that satisfy the join condition.

SELECT * FROM Employee E, Department D
WHERE D.ID=E.DeptID

SELECT * FROM Employee E
JOIN Department D ON D.ID=E.DeptID

E.LastName  E.DeptID  D.ID  D.Name
Smith       1         1     HR
Johnson     1         1     HR
Miller      2         2     Sales
Lee         3         3     Engineering
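As a rough check that the two inner join spellings agree, the sketch below runs both against an in-memory SQLite copy of the example tables. Using sqlite3 rather than MySQL is an assumption for the sake of a self-contained example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# Implicit form: Cartesian product filtered by WHERE.
where_rows = conn.execute(
    "SELECT * FROM Employee E, Department D WHERE D.ID = E.DeptID"
).fetchall()

# Explicit form: JOIN ... ON.
join_rows = conn.execute(
    "SELECT * FROM Employee E JOIN Department D ON D.ID = E.DeptID"
).fetchall()

# Row order is not guaranteed without ORDER BY, so compare sorted copies.
print(sorted(where_rows) == sorted(join_rows))
print(len(join_rows))
```

Marketing has no employees, so it does not appear in either result.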

Outer Join

Outer joins are used to return all rows from one table regardless of a match in the other table. If no match is found in the other table, NULLs are returned for that table's columns. Three types: LEFT, RIGHT, FULL

Show all the departments and the employees in them, if any:

SELECT * FROM Department D
LEFT JOIN Employee E ON D.ID=E.DeptID

D.ID  D.Name       E.LastName  E.DeptID
1     HR           Smith       1
1     HR           Johnson     1
2     Sales        Miller      2
3     Engineering  Lee         3
4     Marketing    NULL        NULL
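The left join can be reproduced with the sketch below, again assuming an in-memory SQLite database in place of MySQL. An ORDER BY is added here (it is not on the slide) only to make the printed order predictable. SQL NULL comes back to Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

rows = conn.execute("""
    SELECT * FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    ORDER BY D.ID, E.LastName      -- added for a stable print order
""").fetchall()

for row in rows:
    print(row)
```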

Self Join

• A self-join is a query in which a table is joined (compared) to itself.
• Self-joins are used to compare values in a column with other values in the same column in the same table.
• One practical use for self-joins: obtaining running counts and running totals in an SQL query.
• To write the query, select from the same table listed twice with different aliases, set up the comparison, and eliminate cases where a particular value would be equal to itself.

Example: Which customers are located in the same state (column name is Region)?

SELECT DISTINCT c1.ContactName, c1.Address, c1.City, c1.Region
FROM Customers AS c1, Customers AS c2
WHERE c1.Region = c2.Region AND c1.ContactName <> c2.ContactName
ORDER BY c1.Region, c1.ContactName;

Exercise: Which customers are located in the same city? (32 rows)

http://www.udel.edu/evelyn/SQL-Class3/SQL3_self.html
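A small sketch of the self-join follows. The Customers rows are invented for illustration (the real table behind the linked exercise is not reproduced here), and SQLite stands in for whatever DBMS the exercise assumes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (ContactName TEXT, Address TEXT, City TEXT, Region TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [   # hypothetical customers: two share a Region, one does not
        ("Ann", "1 Main St", "Seattle", "WA"),
        ("Bo", "2 Pine St", "Spokane", "WA"),
        ("Cy", "3 Oak St", "Portland", "OR"),
    ],
)

rows = conn.execute("""
    SELECT DISTINCT c1.ContactName, c1.Address, c1.City, c1.Region
    FROM Customers AS c1, Customers AS c2
    WHERE c1.Region = c2.Region
      AND c1.ContactName <> c2.ContactName   -- do not match a row with itself
    ORDER BY c1.Region, c1.ContactName
""").fetchall()

print([row[0] for row in rows])
```

Cy is the only customer in OR, so the inequality leaves no partner row and Cy drops out of the result.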

Aggregate Functions

• While returning rows is nice, you often want to return a value computed from a set of rows.
– Count
– Sum
– Min
– Max
– Avg

An example of Aggregates

Name   Grade
Steve  2.5
John   3.5
Wendy  3.8
Niki   4.0
Kevin  1.4

SELECT count(*), max(grade), min(grade), avg(grade), sum(grade) FROM student_grades

Count(*)  Max(grade)  Min(grade)  Avg(grade)  Sum(grade)
5         4           1.4         3.04        15.2
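The aggregate query can be run as a quick sketch with Python's sqlite3 module. The column names name and grade are assumptions (the slide only shows the headings), and the floating-point results are rounded before printing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_grades (name TEXT, grade REAL)")
conn.executemany(
    "INSERT INTO student_grades VALUES (?, ?)",
    [("Steve", 2.5), ("John", 3.5), ("Wendy", 3.8), ("Niki", 4.0), ("Kevin", 1.4)],
)

# One row comes back: each aggregate collapses the five rows to one value.
count, highest, lowest, average, total = conn.execute(
    "SELECT count(*), max(grade), min(grade), avg(grade), sum(grade) "
    "FROM student_grades"
).fetchone()

print(count, highest, lowest, round(average, 2), round(total, 2))
```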

SELECT Statement - Grouping

• Now that you have aggregate functions, they become useful in grouping results.

• Back to the example Join tables, maybe you want a count of the number of employees in each department.

• The GROUP BY clause is added to the end of the SELECT statement.

SELECT Statement - Grouping

• All column names in SELECT list must appear in GROUP BY clause unless name is used only in an aggregate function.

• If WHERE is used with GROUP BY, WHERE is applied first, then groups are formed from remaining rows satisfying predicate.

• ISO considers two nulls to be equal for purposes of GROUP BY.

Group By Example

How many employees are in each department?

SELECT D.Name, COUNT(E.DeptID) FROM Department D
LEFT JOIN Employee E ON D.ID=E.DeptID
GROUP BY D.Name

D.Name       COUNT(E.DeptID)
HR           2
Sales        1
Engineering  1
Marketing    0
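The grouping query can be tried with the sketch below, which assumes an in-memory SQLite copy of the example tables. It also runs a COUNT(*) variant, which is not on the slide, to show why the slide counts E.DeptID: COUNT(column) skips NULLs, while COUNT(*) counts the NULL-padded Marketing row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

query = """
    SELECT D.Name, COUNT({what}) FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
"""

# Query from the slide: count only non-NULL E.DeptID values per group.
counts = dict(conn.execute(query.format(what="E.DeptID")).fetchall())

# Variant for comparison: COUNT(*) counts every row in the group.
star_counts = dict(conn.execute(query.format(what="*")).fetchall())

print(counts["Marketing"], star_counts["Marketing"])
```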

HAVING clause

• But what if we want to return results based upon a GROUP BY? Enter the HAVING clause.

• Let’s only see the departments with people in them:

SELECT D.Name, COUNT(E.DeptID)
FROM Department D
LEFT JOIN Employee E ON D.ID=E.DeptID
GROUP BY D.Name
HAVING COUNT(E.DeptID) > 0

D.Name       COUNT(E.DeptID)
HR           2
Sales        1
Engineering  1
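A brief sketch of HAVING, under the same assumption of an in-memory SQLite copy of the example tables: the filter runs after the groups and their counts exist, which is why it can refer to COUNT(...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

rows = conn.execute("""
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0     -- filter on the per-group count
""").fetchall()

staffed = dict(rows)
print(staffed)
```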

ORDER BY clause

• Finally, what if we want some order imposed on our results?
• ORDER BY can contain any field or value specified in the selection criteria.

SELECT D.Name, COUNT(E.DeptID)
FROM Department D
LEFT JOIN Employee E ON D.ID=E.DeptID
GROUP BY D.Name
HAVING COUNT(E.DeptID) > 0
ORDER BY D.Name

D.Name       COUNT(E.DeptID)
Engineering  1
HR           2
Sales        1
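The ordered query can be sketched the same way (in-memory SQLite is assumed). The second query, ordering by the count with the name as a tie-breaker, is an added variation and does not appear on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

base = """
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0
"""

# Slide version: alphabetical by department name.
by_name = conn.execute(base + " ORDER BY D.Name").fetchall()

# Added variation: largest departments first, names break ties.
by_size = conn.execute(
    base + " ORDER BY COUNT(E.DeptID) DESC, D.Name").fetchall()

print(by_name)
print(by_size)
```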

Set Theory Review

• Intersection of 2 Sets

R = {1,2,3,4} S = {4,5,6,7} R ∩ S = { 4 }

R = { Joe, Suzie } S = { Jane, Bob, Sam } R ∩ S = Ø

R = { ‘big stereo’, ‘blue’, ‘safe’ } S = { ‘blue’, ‘fire trap’, ‘AM radio’ } R ∩ S = { ‘blue’ }

The Intersection

Set Theory Review

• Union of 2 Sets

R = {1,2,3,4} S = {4,5,6,7} R ∪ S = {1,2,3,4,5,6,7}

R = { Joe, Suzie } S = { Jane, Bob, Sam } R ∪ S = { Joe, Suzie, Jane, Bob, Sam }

R = { ‘big stereo’, ‘blue’, ‘safe’ } S = { ‘AM radio’, ‘blue’, ‘fire trap’ } R ∪ S = { ‘big stereo’, ‘blue’, ‘safe’, ‘AM radio’, ‘fire trap’ }

Set Theory Review

• Difference of 2 Sets

R = {1,2,3,4} S = {4,5,6,7} R \ S = {1,2,3}

R = { Joe, Suzie } S = { Jane, Bob, Sam } R \ S = { Joe, Suzie }

R = { ‘big stereo’, ‘blue’, ‘safe’ } S = { ‘AM radio’, ‘blue’, ‘fire trap’ } R \ S = { ‘big stereo’, ‘safe’ }

The Difference

Union, Intersect, and Difference (Except)

• Use between select clauses
– Keyword for union is union
– Keyword for intersection is intersect
– Keyword for difference is except
• Each query must return the same number of columns, with compatible data types in corresponding positions.
• Example:

(select Name from Staff) union (select Name from Faculty)
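The three set operators can be tried with the sketch below. The Staff and Faculty rows are made up for illustration, and SQLite is assumed as the engine. SQLite does not accept parentheses around the individual selects of a compound query, so they are dropped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (Name TEXT);
    CREATE TABLE Faculty (Name TEXT);
    INSERT INTO Staff VALUES ('Joe'), ('Suzie'), ('Sam');
    INSERT INTO Faculty VALUES ('Sam'), ('Jane');
""")

def names(sql):
    """Run a one-column query and return its values as a Python set."""
    return {row[0] for row in conn.execute(sql)}

union = names("SELECT Name FROM Staff UNION SELECT Name FROM Faculty")
both = names("SELECT Name FROM Staff INTERSECT SELECT Name FROM Faculty")
staff_only = names("SELECT Name FROM Staff EXCEPT SELECT Name FROM Faculty")

print(sorted(union), sorted(both), sorted(staff_only))
```

Sam is in both tables but appears only once in the union, because UNION removes duplicate rows.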

INSERT

INSERT INTO TableName [ (columnList) ]
VALUES (dataValueList)

• columnList is optional; if omitted, SQL assumes a list of all columns in their original CREATE TABLE order.

• Any columns omitted must have been declared as NULL when table was created, unless DEFAULT was specified when creating column.

© Pearson Education Limited 1995, 2005

INSERT

• dataValueList must match columnList as follows:
– number of items in each list must be same;
– must be direct correspondence in position of items in two lists;
– data type of each item in dataValueList must be compatible with data type of corresponding column.

INSERT … VALUES

• Insert a new row into Employee table supplying data for all columns.
– Let’s finally put someone in the marketing department!
• Full table, so can omit the column names:
INSERT INTO Employee VALUES ('Brown', 4);
• Or we can explicitly list the column names:
INSERT INTO Employee (LastName, DeptID) VALUES ('Brown', 4);
• Perhaps the DeptID field allows NULLs or has a default:
INSERT INTO Employee (LastName) VALUES ('Brown');
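The three INSERT forms can be run as a sketch against an in-memory SQLite table. The table definition here (LastName NOT NULL, DeptID nullable with no default) is an assumption, since the slides never show the CREATE TABLE statement. The final statement supplies too few values and is expected to be rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: DeptID is nullable, so it may be omitted on insert.
conn.execute("CREATE TABLE Employee (LastName TEXT NOT NULL, DeptID INTEGER)")

conn.execute("INSERT INTO Employee VALUES ('Brown', 4)")                     # all columns, no list
conn.execute("INSERT INTO Employee (LastName, DeptID) VALUES ('Brown', 4)")  # explicit list
conn.execute("INSERT INTO Employee (LastName) VALUES ('Brown')")             # DeptID becomes NULL

rows = conn.execute(
    "SELECT LastName, DeptID FROM Employee ORDER BY rowid").fetchall()
print(rows)

# Without a column list, the number of values must match the number of columns.
try:
    conn.execute("INSERT INTO Employee VALUES ('Brown')")
    rejected = False
except sqlite3.Error:
    rejected = True
print(rejected)
```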

UPDATE

UPDATE TableName
SET columnName1 = dataValue1
[, columnName2 = dataValue2...]
[WHERE searchCondition]

• TableName can be name of a base table or an updatable view.

• SET clause specifies names of one or more columns that are to be updated.

• WHERE clause is optional; if omitted, all rows are updated.

UPDATE example

• Ms. Johnson gets married and wants to change her name to Anderson.
UPDATE EMPLOYEE SET LastName='Anderson' WHERE LastName='Johnson'
• Better way to find Ms. Johnson:
UPDATE EMPLOYEE SET LastName='Anderson' WHERE LastName='Johnson' AND DeptID=1
• The Marketing department is being merged with Sales and as such all the employees in that department are being moved into Sales.
UPDATE EMPLOYEE SET DeptID=2 WHERE DeptID=4
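A short sketch of the UPDATE examples, assuming an in-memory SQLite copy of the Employee table plus one hypothetical Marketing employee (Brown) so that the merge has something to move. The cursor's rowcount reports how many rows each statement changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3),
        ('Brown', 4);   -- hypothetical Marketing employee
""")

# Narrow WHERE clause: only the Johnson in department 1 is renamed.
renamed = conn.execute(
    "UPDATE Employee SET LastName='Anderson' "
    "WHERE LastName='Johnson' AND DeptID=1"
).rowcount

# Move everyone in Marketing (4) to Sales (2).
moved = conn.execute(
    "UPDATE Employee SET DeptID=2 WHERE DeptID=4").rowcount

rows = conn.execute(
    "SELECT LastName, DeptID FROM Employee ORDER BY LastName").fetchall()
print(renamed, moved)
print(rows)
```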

DELETE

DELETE FROM TableName
[WHERE searchCondition]

• TableName can be name of a base table or an updatable view.

• searchCondition is optional; if omitted, all rows are deleted from the table. This does not delete the table itself. If searchCondition is specified, only those rows that satisfy the condition are deleted.

DELETE example

• Mr. Smith decides to take another job and quits:
DELETE FROM EMPLOYEE WHERE LastName='Smith' AND DeptID=1
• Remember the Marketing department? Well, rather than merge with Sales we are going to eliminate it and all the employees in that department.
DELETE FROM EMPLOYEE WHERE DeptID=4
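The DELETE examples can be sketched as follows, assuming in-memory SQLite tables and one hypothetical Marketing employee (Brown). The sketch also checks two things worth noticing: the statements above remove rows from Employee only, so the Marketing row in Department is still there, and a DELETE with no WHERE empties the table without dropping it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3),
        ('Brown', 4);   -- hypothetical Marketing employee
""")

smith_gone = conn.execute(
    "DELETE FROM Employee WHERE LastName='Smith' AND DeptID=1").rowcount
marketing_gone = conn.execute(
    "DELETE FROM Employee WHERE DeptID=4").rowcount

remaining = [r[0] for r in conn.execute(
    "SELECT LastName FROM Employee ORDER BY LastName")]
departments = conn.execute("SELECT COUNT(*) FROM Department").fetchone()[0]

# No WHERE clause: every row goes, but the (now empty) table remains.
conn.execute("DELETE FROM Employee")
left_after_full_delete = conn.execute(
    "SELECT COUNT(*) FROM Employee").fetchone()[0]

print(smith_gone, marketing_gone, remaining, departments, left_after_full_delete)
```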

Variants & Like

• There is a rich set of functions that can be used in SQL. Of course, most of them depend heavily on the SQL dialect.
• LIKE. Allows searching a text field for a pattern.
SELECT * FROM students WHERE name LIKE 'R%'
– % matches any sequence of zero or more characters, whereas _ matches exactly one character
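The two wildcards can be compared with a small sketch. The student names are invented, and SQLite is assumed as the engine; note that SQLite's LIKE ignores ASCII case by default, which may differ from other systems depending on configuration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?)",
    [("Rachel",), ("Rob",), ("Ray",), ("Ro",), ("Sam",)],  # made-up names
)

def matching(pattern):
    """Return the names matching a LIKE pattern, in alphabetical order."""
    sql = "SELECT name FROM students WHERE name LIKE ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]

starts_with_r = matching("R%")    # R followed by anything, including nothing
three_letters = matching("R__")   # R followed by exactly two characters

print(starts_with_r)
print(three_letters)
```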

CASE statements

• SELECT CASE Sex
WHEN 'M' THEN 'Male'
WHEN 'F' THEN 'Female'
END
FROM Students
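A sketch of the CASE expression inside a SELECT, run against an in-memory SQLite table. The Students rows and the Name column are assumptions for illustration. The expression is closed with END, and a value that matches no WHEN branch yields NULL because there is no ELSE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Name TEXT, Sex TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Ann", "F"), ("Bob", "M"), ("Cam", None)],  # made-up rows
)

rows = conn.execute("""
    SELECT Name,
           CASE Sex
               WHEN 'M' THEN 'Male'
               WHEN 'F' THEN 'Female'
           END AS SexLabel          -- no ELSE: unmatched values give NULL
    FROM Students
    ORDER BY Name
""").fetchall()

print(rows)
```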

Mini-Project

• Due Feb 4, 2009
– Build on your iSchool MySQL account
• Choose between the following:
– UW OnTech Archive
– UW Privacy Policy Set

Mini-Project

UW OnTech Archive -- http://www.washington.edu/computing/ontech/archive.php

UW Privacy Policy – http://depts.washington.edu/comply/privacy.shtml

http://security.uwmedicine.org/policies/sec_policies.asp

UW OnTech

[Screenshots: archive contents; contents by issue]

UW OnTech

[Screenshots: issue contents; article example]

• Sample questions:
– How many contents by issue pages list topics that are not the same as the topics on the corresponding issue contents pages? Or are missing entirely?
– How many pages list an ‘exposed e-mail address’ on their readable page?
– How many pages have an e-mail address that is visible in the page source?

UW OnTech

UW OnTech

• More sample questions:
– What is the average number of clickable links per article in the archive?
– What are the min & max number of clickable links in the archive?
    • Which articles were they?
– More to come

UW Privacy & Security Policies

• Sample questions:
– Which policies have the greatest distance between Effective Date & Review Date?
    • Which ones are they?
– How many policies have the same Effective Date & Review Date?
    • Which ones are they?
– Which policies have more than 5 attachments?

UW Privacy & Security Policies

• More sample questions:
– Which policies have greater than 5 references?
    • Which policies are they?
• What is the most often cited reference in the reference section?
• Do any of the policies in these two sets reference each other?