CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th...

42
CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga [email protected]

Transcript of CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th...

Page 1: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

CM2020: Introduction to Database Systems

Queries In Relational Databases

Database Systems 4th edition

Connolly and Begg Chapter 5

Dr Nirmalie [email protected]

Page 2: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Tutorials and Labs• Jointly organised with Dr Robin Boswell ([email protected]) and

Hatem Ahriz ([email protected])

• Tutorials: Tuesdays 2-3pm– A (Robin): C39/39a SAS (CS2, CIM2)– B (Nirmalie): C411 Schoolhill (BIS2, MMD3, CIMD3)– C (Hatem):B40 SAS (ISTD2, ISTD3, CGA2)

• Labs: Tuesday 3-5pm– A (Nirmalie): C18 SAS (CS2, CIM2)– B (Hatem): C24 SAS (BIS2, MMD3, CIMD3)– C (Robin): C26 St Andrew street (ISTD2, ISTD3, CGA2)

Page 3: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Last week …

student# name address age course100 Bobby Dundee 17 C100200 Helen Aberdeen 21 C100300 Helen Stirling 18 C200400 Freddy Perth 30 C300

course# course_name location C100 Computing St Andrews St C200 Pharmacy Schoolhill C300 Architecture Garthdee

• Relational Databases (DB)s Consists of relations or tables

• Tables consists of rows and columns

• Primary Key uniquely identifies each row

• Composite primary keys are allowed and formed by combining multiple columns

• Foreign key is an attribute / set of attributes, within one relation that matches the primary key of another relation

Table: Students

Table: Courses

Page 4: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SQL

• Structured Query Language (SQL)• Data Manipulation Language

– SELECT • Single table• Multi table

– MS Access (Query by Example)– GROUP BY with SELECT– INSERT, UPDATE, DELETE

• Data Definition Language– CREATE TABLE, DROP TABLE, ALTER TABLE

Page 5: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

• Querying allows retrieval of data in response to a question about the data in the DB

• Relational DBMSs translates queries into SQL (a.k.a Structured Query Language or SEQUEL)

• SQL’s SELECT construct

– extracts data from the tables of the database

– allows specification of conditions

– does not facilitate modification to the tables or the data contained in tables

Querying with SELECT

Page 6: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT

SELECT {* | [column1 [AS newName]] [,column2, …]} [ALL | DISTINCT]

FROM table1 [alias] [, table2,…][WHERE "conditions"][GROUP BY "column-list"][HAVING "conditions"][ORDER BY "column-list" [ASC | DESC] ];

Everything within [ ] is optional

Standard syntax is:

; denotes end of SELECT statement

Page 7: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Single-Table Select Queries

Driversdriver# name address pointsD010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

The result of a query is another table - this table is not permanent - this table is called a DYNASET

(a.k.a Dynamic Data Set) in Microsoft Access

Page 8: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT

Drivers

driver# name address pointsD010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

EXAMPLE: List all the ids (driver#) from the Drivers table.

SELECT driver#FROM Drivers;

driver#D010 D020 D030 D040D050

SELECT driver#

Page 9: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT cont

Drivers

driver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

EXAMPLE: List all the ids (driver#) and points from the Drivers table

SELECT driver#, pointsFROM Drivers;

driver# pointsD010 7D020 4D030 2D040 3D050 12

SELECT driver#, points

We use commas to

separate fields

Page 10: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT with Wildcards

SELECT *FROM Drivers;

Drivers

driver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

driver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

EXAMPLE: List all the details (i.e. all the fields) from the Drivers table

SELECT *

We use * for all fields

Page 11: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT with WHERE

Drivers

driver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

EXAMPLE: List all the ids (driver#) of the drivers from Dundee from the Drivers table

SELECT driver#FROM DriversWHERE address = “Dundee”;

driver#

D020 D030

SELECT driver#

WHERE address = “Dundee”;

Page 12: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Extending SELECT• We can extend SELECT in a number of ways:

1. Can pattern match within certain fields of a table

- can use the LIKE clause

2. Can check if a field has a value from a set of values

- can use the IN operator

3. Can perform calculations on values in a table

- can use COLUMN FUNCTIONS

4. Can make your queries have a choice or be more specific

- can use the OR and AND clauses

5. Can sort output of a query into ascending/descending order

- can use the ORDER BY clause

Page 13: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

The LIKE clause• Sometimes we may want to pattern

match within certain fields of a table

? matches a single character Like ‘Sm?th’

* matches any number of characters

Like St*

# matches a single digit [0-9] Like ‘D0#0’

[list] matches any character in list Like ‘[A-F]*’

[!list] matches any character not in list

Like ‘[!A-F]*’

Page 14: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT LIKE

Drivers

driver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

EXAMPLE: List all the names and addresses of the drivers whose address begins with the letters ‘Du’ from the Drivers table

SELECT name, addressFROM DriversWHERE address Like ‘Du*’;

name address

jimmy DundeeDavid Dundee

SELECT name, addressWHERE address Like ‘Du*’;

Page 15: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

The IN Operator

Driversdriver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

The IN operator is used to check if an attribute(s) has a value from a set of values

[NOT] IN (<value-list>)

name

JimmyDavidBobby

Syntax is:

a) List the names of drivers from Dundee or AberdeenSELECT nameFROM DriversWHERE address IN (‘Dundee’, ‘Aberdeen’);

b) List the names of drivers not from Dundee or AberdeenSELECT nameFROM DriversWHERE address NOT IN (‘Dundee’, ‘Aberdeen’);

name

FredJohn

Page 16: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Column / Aggregate FunctionsColumn functions perform calculations on the values in the column of a table

COUNT - calculates the number of values in a column

SUM - calculates the sum (total) of all values in a column

AVG - calculates the average of all values in a column

MAX - calculates the maximum value in a column

MIN - calculates the minimum value in a column

Page 17: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT with Aggregates

Driversdriver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

EXAMPLE: Calculate the number of drivers, the total number of points, the average number of points, the maximum points value and the range of points from the Driver

SELECT COUNT(driver#), SUM(points), AVG(points), MAX(points), MIN(points), MAX(points) - MIN(points)FROM Drivers;

COUNT(driver#) SUM(points) AVG(points) MAX(points) MIN(points) MAX(points) - MIN(points)

5 28 5.6 12 2 10

Page 18: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT with ORDER BY

Driversdriver# name address points

D010 Fred Perth 7D020 Bobby Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Jimmy Aberdeen 12

List in alphabetical order, names of drivers with a Dundee or Aberdeen address

SELECT nameFROM DriversWHERE address IN (‘Dundee’, ‘Aberdeen’)ORDER BY name;

name

BobbyDavidJimmy

NOTE: Default is ascending

SELECT nameIN (‘Dundee’, ‘Aberdeen’);

Page 19: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SELECT with ORDER BYList in reverse alphabetical order, names of drivers with a Dundee or Aberdeen address

SELECT nameFROM DriversWHERE address IN (‘Dundee’, ‘Aberdeen’)ORDER BY name DESC;

name

JimmyDavidBobby

Driversdriver# name address points

D010 Fred Perth 7D020 Bobby Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Jimmy Aberdeen 12

IN (‘Dundee’, ‘Aberdeen’);SELECT name

DESC means descending

Page 20: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Complex Conditions with WHERE

• Mathematical operators– =, <, >, <>, <=, >=

• Logical operatorsAND, OR, NOT

Page 21: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Combining Operators AND

Driversdriver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

EXAMPLE: List the names of drivers whose licence points are between 4 and 15 from the Drivers table and not from Dundee

SELECT nameFROM DriversWHERE points >= 4 AND points <= 15AND (address <> ‘Dundee');

name

FredBobby

SELECT name

WHERE points >= 4 AND points <= 15AND (address <> ‘Dundee’);

Page 22: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Driversdriver# name address points

D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12

EXAMPLE: List the names of drivers with addresses from either Aberdeen or Dundee

SELECT nameFROM DriversWHERE (address LIKE ‘D*') OR (address LIKE ‘A*');

name

JimmyDavidBobby

SELECT nameWHERE (address LIKE ‘D*’)OR (address LIKE ‘A*’)

Combining Operators OR

Page 23: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Mathematical Operators• supports the following

– +, - , * , / (division), % (modulo)

item# qty price

Shoes 2 20Raft 1 60Skateboard 10 20Life Vest 1 5

SELECT item, qty*price AS TotalCost FROM Items_ordered;

item# TotalCost

Shoes 40Raft 60Skateboard 200Life Vest 5

Items_ordered

Use “AS” to specify new

column name

Page 24: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Summary: Single Table Queries• Queries may be created in a DBMS using SQL SELECT statement

• The basic ANSI syntax is:

SELECT <list of attributes>FROM <table-name>[ WHERE <condition(s)> ] ;

• Can pattern match within certain fields of a table

- LIKE clause

• Can check if a field has a value from a set of values

- IN operator

• Can perform calculations on values in a table

- COLUMN FUNCTIONS

• Can make your queries have a choice or be more specific

- OR and AND clauses

• Can sort output of a query into ascending/descending order

- ORDER BY clause

Page 25: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Multi-Table Queries

• Commonly require data from more than one table

• Involves JOINING tables to form a SUPERTABLE

• Table JOINS– PRODUCT or CARTESIAN– INNER JOIN or EQUI JOIN– NATURAL JOIN– OUTER JOIN (LEFT and RIGHT)

Page 26: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Multi-Table Select Queries

• List data on all orders including customer names and addresses

Customerscust# name address

C100 Allan AberdeenC101 John DundeeC102 Betty StirlingC200 Dean Dundee

Ordersorder# cust# date

2000 C100 20/11/013000 C101 27/11/014000 C100 30/11/01

Page 27: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Product Join• forms every possible combination of

rows combining all columnsE.g Customer PRODUCT Order

C.cust# name address order# O.cust# date

C100 Allan Aberdeen 2000 C100 20/11/01C101 John Dundee 2000 C100 20/11/01C102 Betty Stirling 2000 C100 20/11/01C200 Dean Dundee 2000 C100 20/11/01C100 Allan Aberdeen 3000 C101 27/11/95

etc

SELECT *FROM Customer, Order;

Page 28: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Inner Join (or Equi-Join)

1. Apply a product join2. Retrieve rows with matching linked attribute

values

E.g Customer INNER JOIN Order

C.cust# name address order# O.cust# date

C100 Allan Aberdeen 2000 C100 20/11/01C101 John Dundee 3000 C101 27/11/01C100 Allan Aberdeen 4000 C100 30/11/01

SELECT *FROM Customer C, Order OWHERE C.cust# = O.cust#;

Inner join is a more useful kind of join for data retrieval

Use of alias. Can also use “AS”

here. E.g. Order AS O

Page 29: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Natural Join

C.cust# name address order# date

C100 Allan Aberdeen 2000 20/11/01C101 John Dundee 3000 27/11/01C100 Allan Aberdeen 4000 30/11/01

SELECT C.cust, name, address, order#, dateFROM Customer C, Order OWHERE C.cust# = O.cust#;

Similar to Inner join but with a single matching attributee.g. no O.cust#

Page 30: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Queries on joined tables

cust# name address order# date

C101 John Dundee 3000 27/11/01C100 Allan Aberdeen 4000 30/11/01

SELECT C.cust#, name, order#, dateFROM Customer AS C, Order AS OWHERE Customer.cust# = Orders.cust#AND date > 25/11/01;

Page 31: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

OUTER JOIN

• Use when we want to see the details from unmatched rows as well– columns of the output corresponding

to unavailable data are set to null• Right Outer JOIN

– E.g. Not all customers have placed orders in the Orders table

• Left Outer Join

Page 32: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Left Outer Join• rows in the 1st table with unmatched rows in the 2nd

table are still included with NULL values

E.g Customer LEFT OUTER JOIN Order

cust# name address order# date

C100 Allan Aberdeen 2000 20/11/01C101 John Dundee 3000 27/11/01C100 Allan Aberdeen 4000 30/11/01C102 Betty Stirling NULL NULLC200 Dean Dundee NULL NULL

SELECT Customer.cust#, name, order#, dateFROM CustomerLEFT JOIN orderON Customer.cust# = Orders.cust#;

Page 33: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Summary Multi-Table Joins

• It is possible to join as many tables as you want– generally for n tables n-1 joins

• Inner Join (and Natural join) is most common– outer joins will display unmatched

rows

Page 34: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Query Processor

• Extracts required information from database

• How?– Structured Query Language (SQL)– Query by Example (QBE)

• Querying without having to know SQL

Page 35: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

QBE with MS AccessE.g. Inner Join

Page 36: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

MS Access SQL View

Fu

ncti

on

ally S

imilar

Page 37: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

SQL - Next

• Structured Query Language (SQL)• Data Manipulation Language

– SELECT • Single table• Multi table

– MS Access (Query by Example)– GROUP BY with SELECT– INSERT, UPDATE, DELETE

• Data Definition Language– CREATE TABLE, DROP TABLE, ALTER TABLE

Page 38: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Grouping Results

• Useful to summarise results

• Closely integrated with SELECT– Specify column names– Aggregate functions e.g. count, avg, sum, max, min– WHERE conditions– Any combination of the above

• All columns names in the SELECT must appear in the GROUP BY– HAVING clause used to specify conditions on groups

of rows

Page 39: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

GROUP BYANSI syntax is:

All select columnsMust appear here

Specify conditions applicable to groups

SELECT {* | [column1 [AS newName]] [,column2, …]} [ALL | DISTINCT]

FROM table1 [alias] [, table2,…][WHERE "conditions"][GROUP BY "column-list"][HAVING "conditions"][ORDER BY "column-list" [ASC | DESC] ];

Page 40: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Group By Clause ExampleGiven the following table:

Employees

NI_no name address salaryNS 111111 Fred Aberdeen 15,000NS 222222 Bobby Dundee 20,000NS 333333 Dave Aberdeen 12,000NS 444444 Steve Stirling 10,000NS 555555 Betty Dundee 25,000

Show, for each address, the number of employees that live there

SELECT address, count(NI_no)FROM EmployeesGROUP BY address;

Address count(NI_no)Aberdeen 2Dundee 2Stirling 1

Page 41: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Group By Clause Example

In order to do the previous query, an intermediate table is created that is group by (sorted on) address i.e.:

Employees

NI_no name address salaryNS 111111 Fred Aberdeen 15,000NS 333333 Dave Aberdeen 12,000NS 222222 Bobby Dundee 20,000NS 555555 Betty Dundee 25,000NS 444444 Steve Stirling 10,000

We perform count(NI_no) for each of the 3 groupings

grouping 1

grouping 2

grouping 3

Page 42: CM2020: Introduction to Database Systems Queries In Relational Databases Database Systems 4 th edition Connolly and Begg Chapter 5 Dr Nirmalie Wiratunga.

Given the following table:

EmployeesNI_no name address salaryNS 111111 Fred Aberdeen 15,000NS 222222 Bobby Dundee 20,000NS 333333 Dave Aberdeen 12,000NS 444444 Steve Stirling 10,000NS 555555 Betty Dundee 25,000

Show the addresses whose employees’ average salary is greater than £21,000. Also show the average salary.

SELECT address, avg(salary)FROM EmployeesGROUP BY addressHAVING avg(salary) > 12000;

address avg(salary)Aberdeen 13500Dundee 22500

NOTE: compare with intermediate table from previous slide

GROUP BY with Having