1 Lecture 11: Basic SQL, Integrity constraints

24
1 Lecture 11: Basic SQL, Integrity constraints www.cl.cam.ac.uk/Teaching/current/ Databases/

Transcript of 1 Lecture 11: Basic SQL, Integrity constraints

1

Lecture 11:Basic SQL,

Integrity constraints

www.cl.cam.ac.uk/Teaching/current/Databases/

2

SQL – A language for relational databases

• Originates from System R project at IBM

• So widespread it has been standardised– SQL-86 (ANSI)– SQL-89 (minor revisions)– SQL-92 (major revision ANSI/ISO: current

standard)– SQL:1999 (major extension - objects)– SQL:2003 (more features and XML support)

3

SQL components

• SQL/92 is a huge standard (600+ pages)

• SQL is a comprehensive database language, containing– Data definition language (DDL): to define

schemas– Data manipulation language (DML): to insert,

delete, modify and retrieve data– Lots of other features: e.g. views, schema

evolution, access control, triggers, assertions, …

4

Relation instances

sid bid day

22 101 101001

55 103 111201

sid sname rating age

11 Sue 7 26

22 Tim 7 26

33 Bob 8 28

55 Kim 10 28

Reserves

Sailors

bid bname colour

101 Clipper Red

102 Hatter Blue

103 Interlake green

Boats

5

Simple queries

• Contents of entire relation SELECT * FROM Sailors;

• Qualifications SELECT * FROM Sailors WHERE age>22;

6

Simple queries cont.

• Projection of attributes SELECT rating,age FROM Sailors;

• Note that SQL does not remove duplicates. We need to request this explicitly: SELECT DISTINCT rating,age FROM Sailors;

7

Simple queries cont.

• Can query across multiple relations: SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid=R.sid AND R.bid=103;

• SQL allows the range variables to be dropped, e.g. SELECT sname FROM Sailors,Reserves WHERE Sailors.sid=Reserves.sid AND Reserves.bid=103;

Range variables

8

Basic SQL query syntax

SELECT [DISTINCT] target-list FROM relation-list WHERE qualification

• relation-list is a list of relation names (possibly with range variables)

• target-list is a list of attributes of relations in relation-list

• qualification is a Boolean expression

9

Semantics of SQL queries

• The semantics of SQL query evaluation can be conceptually defined as:– Compute the product of relation-list– Discard those tuples that fail the qualification– Delete attributes that are not in target-list– If DISTINCT is specified, eliminate duplicate

rows

• NOTE: This is probably the least efficient way to evaluate a query!

10

Compiling to relational algebra

• The query SELECT a1, …, an

FROM R1, …, Rm

WHERE b

• Can be compiled to the RA expression:

a1,…,an(b(R1…Rm))

(Assuming no duplicates)

11

Example

Find sailors who’ve reserved at least one boat

SELECT S.sid FROM Sailors S, Reserves R WHERE S.sid=R.sid

• Would adding DISTINCT to this query make any difference?

12

String matching

• SQL provides powerful string matching facilities

• These can be used in the WHERE clause– _ denotes any one character– % denotes zero or more characters

• For example: SELECT * FROM Sailors WHERE sname LIKE ‘_ic%’;

13

Arithmetic

• SQL provides arithmetic operators, which can be used in the SELECT clause

• Use AS to name columns SELECT sname, age*5 AS OldAge FROM Sailors;

• Can also be used in the WHERE clause SELECT S1.sname FROM Sailors S1, Sailors S2 WHERE S1.rating=2*S2.rating –4;

14

Set theoretic operations

• SQL provides the familiar set theoretic operators: UNION, INTERSECT, IN (set membership)

• SQL also allows a WHERE clause to contain another query, e.g. SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid=103);

Nested Query!

15

Example

Find names of sailors who’ve reserved a red or a green boat

SELECT S.snameFROM Sailors S, Reserves R, Boats BWHERE S.sid=R.sid AND R.bid=B.bid AND (B.colour=“red” OR B.colour=“green”);

16

Example

Find names of sailors who’ve reserved a red and a green boat

SELECT S.snameFROM Sailors S, Reserves R1, Reserves R2, Boats B1, Boats B2WHERE S.sid=R1.sid AND R1.bid=B1.bid AND S.sid=R2.sid AND R2.bid=B2.bid AND B1.colour=“red” AND B2.colour=“green”;

17

Integrity constraints

• An integrity constraint (IC) is a condition that must be true for any instance of the database – E.g. domain constraints– ICs are specified when schema is defined– ICs are checked when relations are modified

• A legal instance of a relation is one that satisfies all the specified ICs

• We’d like the DBMS to check for ICs

18

Primary key constraints

• A set of fields is a (candidate) key for a relation if1. No two distinct tuples can have same values in all

the key fields; and

2. This is not true for any subset of the key

• If part 2 is false, then it’s a superkey• If there’s more than one key for a relation, then

one is chosen by the DBA to be the primary key

19

Primary and candidate keys in SQL

• Can define possibly many candidate keys, one of which is chosen as the primary key CREATE TABLE Enrolled ( sid CHAR(20), name CHAR(20), age INTEGER, grade CHAR(2), PRIMARY KEY (sid), UNIQUE (name,age));

Candidatekey

20

Foreign key

• A foreign key is a set of fields in one relation that is used to refer to a tuple in another relation (must correspond to a primary key of the second relation)– A bit like a pointer

• For example, sid in Reserves is a foreign key referring to Sailors

21

Foreign keys in SQL

• Only sailors listed in the Sailors relation should be allowed to make reservations

CREATE TABLE Reserves (sid INTEGER, bid INTEGER, date DATE, PRIMARY KEY (sid,bid,date), FOREIGN KEY (sid) REFERENCES Sailors);

22

Referential integrity

• If all foreign key constraints are enforced then referential integrity is achieved. – Similar to `no dangling pointers’

• Can you name a data model without referential integrity?

23

Semantic integrity constraints

• ICs come from the semantics of the real-world enterprise that is being modelled

• Another class of ICs is semantic integrity constraints. These apply to the actual data values that can be stored, e.g. “Sailor rating is between 0 and 10”

24

Summary

You should now understand:• Simple SQL queries

– SELECT-FROM-WHERE

• Compilation to relational algebra• Nested queries• Integrity constraints• Primary and candidate keys