Relational Algebra CS302 Week 6 Slides
description
Transcript of Relational Algebra CS302 Week 6 Slides
Relational AlgebraCS302 Week 6 Slides
SQL is the language used for talking to a relational database management system Allows changing the schema (“data definition”),
changing the data (“data modification”), and querying
But the querying part of SQL isn't very good Confusing syntax that is hard to read Major differences from one DBMS to the next
We teach you relational algebra (RA) for querying We use a graphical representation of RA It's easy to convert RA into SQL by hand We'll use GRAPE to automatically convert RA into SQL
too
A relational database is a set of relations A relation is a set of rows and a set of columns A relation has an identifier which is a set of
columns We consistently underline columns in the identifier
Rules: A row has a value for each column in the identifier
(A row may or may not have values for columns that are not in the identifier)
No two rows have the same values in all the identifying columns
Is this a relation?
First name Last name Award Movie
Jeff Bridges Best Actor Crazy Heart
Sandra Bullock Best Actress The Blind Side
Kathryn Bigelow Director The Hurt Locker
Is this a relation?
First name Last name Award Movie
Jeff Bridges Best Actor Crazy Heart
Sandra Bullock Best Actress The Blind Side
Kathryn Bigelow Director The Hurt Locker
Is this a relation?
First name Last name Award Movie
Jeff Bridges Best Actor Crazy Heart
Sandra Bullock Best Actress The Blind Side
Kathryn Bigelow Director The Hurt Locker
Paul Ottosson Sound Editing The Hurt Locker
Paul Ottosson Sound Mixing The Hurt Locker
Is this a relation?
First name Last name Award Movie
Jeff Bridges Best Actor Crazy Heart
Sandra Bullock Best Actress The Blind Side
Kathryn Bigelow Director The Hurt Locker
Paul Ottosson Sound Mixing The Hurt Locker
Paul Ottosson Sound Mixing Spider Man
Are these two relations the same?
Last name Award Movie First name
Bigelow Director The Hurt Locker
Kathryn
Bridges Best Actor Crazy Heart Jeff
Bullock Best Actress The Blind Side Sandra
First name Last name Award Movie
Jeff Bridges Best Actor Crazy Heart
Sandra Bullock Best Actress The Blind Side
Kathryn Bigelow Director The Hurt Locker
RA Operators We'll look at unary operators first
These take a relation and return a relation
project filter
reduce group
(Unary Operators, p. 72, examples 1-8 work on one relation)
Project Project removes and adds non-
identifying columns Adds computed columns –
arithmetic, string manipulation, etc.
result: as wide as number of columns named,
which must include the identifying columns exactly as tall as original relation identifier is same
see helper table, p. 74, Unary Ops : note where above info is
SQL: Select columns from relation;
Project exampleFirst name Last name Award Movie Birth year
Jeff Bridges Best Actor Crazy Heart 1949
Sandra Bullock Best Actress The Blind Side 1964
Kathryn Bigelow Director The Hurt Locker 1951
Filter Filter removes rows based on a condition result:
as wide as the original relation shorter or exactly as tall as the original relation identifier is same
see helper table, p. 74, Unary Ops : note where above info is
SQL: Select * from relation where condition;
Filter example
First name Last name Award Movie Birth year
Jeff Bridges Best Actor Crazy Heart 1949
Sandra Bullock Best Actress The Blind Side 1964
Kathryn Bigelow Director The Hurt Locker 1951
What algorithm do you think Filter uses?
What algorithm do you think Filter uses? Trick question: depends on the DBMS and the
configuration!
Reduce changes the identifier, result:
narrower or the same width as the original relation possibly shorter than the original relation unsafe!
see helper table, p. 74, Unary Ops : note where above info is
SQL: Select distinct new_identifier from old_identifier
Note: Our book also allows reduce with carried non-identifying columns, but this is trickier to do in SQL
Reduce
Reduce example
First name Last name Award Movie
Jeff Bridges Best Actor Crazy Heart
Sandra Bullock Best Actress The Blind Side
Kathryn Bigelow Director The Hurt Locker
Paul Ottosson Sound Editing The Hurt Locker
Paul Ottosson Sound Mixing The Hurt Locker
Group Group changes the identifier, combining duplicate
rows using an aggregate function result:
narrower, the same width, or wider than the original relation possibly shorter than the original relation unsafe!
SQL: select new_identifier, aggregate_function from relation
group by new_identifier Select distinct columns from relation;
Group example
First name Last name Award Movie Birth year
Jeff Bridges Best Actor Crazy Heart 1949
Sandra Bullock Best Actress The Blind Side 1964
Kathryn Bigelow Director The Hurt Locker 1951
Group example 2
First name Last name Award Birth year
Jeff Bridges Best Actor 1949
Sean Penn Best Actor 1960
Sandra Bullock Best Actress 1964
Kate Winslet Best Actress 1975
Getting started with GRAPE Start “X Server” ssh -YC [email protected] grape Log in:
Usernames s10carlis1 – s10carlis15 Passwords are the same Pick an account and change the password please
Start the SQL Shell and load in some data! Web site about GRAPE: csrg.cs.umn.edu/grape Carlis's Tiny DB is on that site as well
Your Homework Go to sqlzoo.net
If you have not used relational databases before, do Tutorials 1-4
If you have used relational databases before: start where you're comfortable and try to finish all the tutorials
Install GRAPE on your personal machine if possible