Databases - Brown University (cs.brown.edu/courses/cs008/2014/files/lectures/L06databases.pdf)

CS8: A First Byte of Computer Science Prof. Michael Littman Databases



Background on Databases

๏ Major importance in
  - banking
  - information retrieval
  - online commerce
  - government organizations
  - etc.


Querying a Database

Our good friends and/or/not come back.

1. Start with a question we want to answer
2. Decide on a strategy
3. Express it in a computer-readable form


Query Example 1

In what years have the Bee Gees had a #1 hit in Week 33?

SQL:

select YEAR from TOPSONGS where ARTIST = "Bee Gees" and WEEK = 33;

Scratch: (image not preserved)
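The SQL query above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not the course's own setup: the column types and the helper names `make_db` and `years_with_hit` are assumptions, and the table holds only the six sample rows that appear in the lecture's 60s70s.txt excerpt, so the Bee Gees question returns nothing here.

```python
import sqlite3

# Sample rows copied from the lecture's 60s70s.txt excerpt: (YEAR, WEEK, TITLE, ARTIST).
ROWS = [
    (1960, 1, "El Paso", "Marty Robbins"),
    (1960, 2, "El Paso", "Marty Robbins"),
    (1960, 3, "Running Bear", "Johnny Preston"),
    (1973, 27, "Will It Go Round in Circles", "Billy Preston"),
    (1973, 28, "Will It Go Round in Circles", "Billy Preston"),
    (1974, 42, "Nothing From Nothing", "Billy Preston"),
]


def make_db():
    """Build an in-memory TOPSONGS table (column types are an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table TOPSONGS (YEAR integer, WEEK integer, TITLE text, ARTIST text)"
    )
    conn.executemany("insert into TOPSONGS values (?, ?, ?, ?)", ROWS)
    return conn


def years_with_hit(conn, artist, week):
    """Years in which `artist` held #1 during the given week number."""
    cur = conn.execute(
        "select YEAR from TOPSONGS where ARTIST = ? and WEEK = ? order by YEAR",
        (artist, week),
    )
    return [year for (year,) in cur]


conn = make_db()
print(years_with_hit(conn, "Billy Preston", 27))  # [1973]
print(years_with_hit(conn, "Bee Gees", 33))       # [] (no Bee Gees rows in this excerpt)
```

The `?` placeholders keep the and-condition identical to the slide's query while letting the artist and week vary.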


Query Example 1

In what years have the Bee Gees had a #1 hit in Week 33?

UNIX:

cat 60s70s.txt | egrep '^[^ ]* 33 [^ ]* "Bee Gees"' | cut -f1

Excel:

1. Sort by artist, go to the Bee Gees (via Control-F)
2. Sort by week within Bee Gees range, go to Week 33 (via Control-F)
3. Read off the answer.


Query Example 2

What Billy Preston song had the shortest number of weeks as #1?

SQL:

select first(TITLE) from (select TITLE, count(TITLE) as WEEK_COUNT from TOPSONGS where ARTIST = "Billy Preston" group by TITLE order by WEEK_COUNT asc);
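A runnable variant of this query, sketched with Python's built-in sqlite3 module. SQLite has no first() aggregate, so this version swaps in `order by ... limit 1` to pick the top row instead; that substitution, the column types, and the helper name `shortest_run` are assumptions of the sketch. The rows are the three Billy Preston records shown in the lecture's pipeline walk-through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table TOPSONGS (YEAR integer, WEEK integer, TITLE text, ARTIST text)"
)
conn.executemany(
    "insert into TOPSONGS values (?, ?, ?, ?)",
    [
        (1973, 27, "Will It Go Round in Circles", "Billy Preston"),
        (1973, 28, "Will It Go Round in Circles", "Billy Preston"),
        (1974, 42, "Nothing From Nothing", "Billy Preston"),
    ],
)


def shortest_run(conn, artist):
    """Return (TITLE, WEEK_COUNT) for the artist's song with the fewest #1 weeks."""
    return conn.execute(
        """
        select TITLE, count(*) as WEEK_COUNT
        from TOPSONGS
        where ARTIST = ?
        group by TITLE                       -- one output row per song
        order by WEEK_COUNT asc, TITLE asc   -- fewest weeks first, ties by title
        limit 1                              -- stands in for first(TITLE)
        """,
        (artist,),
    ).fetchone()


print(shortest_run(conn, "Billy Preston"))  # ('Nothing From Nothing', 1)
```

For an artist with no rows, `fetchone()` returns `None` rather than raising an error.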


Query Example 2

What Billy Preston song had the shortest number of weeks as #1?

Scratch: (image not preserved)


Query Example 2

What Billy Preston song had the shortest number of weeks as #1?

UNIX:

cat 60s70s.txt | egrep '[^ ]* [^ ]* [^ ]* "Billy Preston"' | cut -f3 | sort | uniq -c | sort -n | head -1

Excel:

1. Sort by artist, go to Billy Preston (via Control-F)
2. ??


UNIX Pipeline

cat 60s70s.txt (output entire database)

1960 1 "El Paso" "Marty Robbins"
1960 2 "El Paso" "Marty Robbins"
1960 3 "Running Bear" "Johnny Preston"
. . .

egrep '[^ ]* [^ ]* [^ ]* "Billy Preston"' (keep only Billy)

1973 27 "Will It Go Round in Circles" "Billy Preston"
1973 28 "Will It Go Round in Circles" "Billy Preston"
1974 42 "Nothing From Nothing" "Billy Preston"

cut -f3 (keep just the titles)

"Will It Go Round in Circles"
"Will It Go Round in Circles"
"Nothing From Nothing"

sort (sort)

"Nothing From Nothing"
"Will It Go Round in Circles"
"Will It Go Round in Circles"

uniq -c (count (adjacent) duplicates)

1 "Nothing From Nothing"
2 "Will It Go Round in Circles"

sort -n (sort, smallest first)

1 "Nothing From Nothing"
2 "Will It Go Round in Circles"

head -1 (keep only the smallest)

1 "Nothing From Nothing"
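The same pipeline can be mirrored stage by stage in Python, which makes each step's intermediate result easy to inspect. This is a sketch under assumptions: the function name `fewest_weeks` is made up, the input is the six sample lines shown in this walk-through rather than the full file, and `shlex.split` is used to honor the quoted fields instead of a regular expression.

```python
import shlex
from collections import Counter

# The sample lines shown in the pipeline walk-through.
LINES = [
    '1960 1 "El Paso" "Marty Robbins"',
    '1960 2 "El Paso" "Marty Robbins"',
    '1960 3 "Running Bear" "Johnny Preston"',
    '1973 27 "Will It Go Round in Circles" "Billy Preston"',
    '1973 28 "Will It Go Round in Circles" "Billy Preston"',
    '1974 42 "Nothing From Nothing" "Billy Preston"',
]


def fewest_weeks(lines, artist):
    """Return (week_count, title) for the artist's shortest run, or None."""
    records = [shlex.split(line) for line in lines]       # cat: read every record
    kept = [r for r in records if r[3] == artist]         # egrep: keep only the artist
    titles = [r[2] for r in kept]                         # cut -f3: keep just the titles
    counts = Counter(titles)                              # sort | uniq -c: count duplicates
    ranked = sorted((n, t) for t, n in counts.items())    # sort -n: smallest count first
    return ranked[0] if ranked else None                  # head -1: keep only the smallest


print(fewest_weeks(LINES, "Billy Preston"))  # (1, 'Nothing From Nothing')
```

`Counter` does not need sorted input, which is why the separate `sort` before `uniq -c` has no counterpart here: `uniq` only merges adjacent duplicates, while `Counter` tallies them wherever they occur.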


What Does This One Do?

A. Returns the name of the artist who recorded “You Light Up My Life”.
B. Returns the number of weeks “You Light Up My Life” was the #1 hit song.
C. Returns “1” if “You Light Up My Life” was ever a #1 song, “0” otherwise.
D. Returns the position in the database where “You Light Up My Life” first appears.
E. Returns all the positions in the database where “You Light Up My Life” appears.


Tables

๏ Most databases store information in tables, like our number1hits database.
  - A row describes a single data record in terms of all its attributes.
  - A column holds the value of one attribute for all the data records.
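To make the row/column distinction concrete, here is a small sketch that treats a table as a list of tuples. The column names follow the lecture's SQL examples; the helper names `row` and `column` are made up for illustration.

```python
# Column order follows the lecture's data file: YEAR, WEEK, TITLE, ARTIST.
COLUMNS = ("YEAR", "WEEK", "TITLE", "ARTIST")

TABLE = [
    (1973, 27, "Will It Go Round in Circles", "Billy Preston"),
    (1973, 28, "Will It Go Round in Circles", "Billy Preston"),
    (1974, 42, "Nothing From Nothing", "Billy Preston"),
]


def row(table, index):
    """One record, with every attribute labeled by its column name."""
    return dict(zip(COLUMNS, table[index]))


def column(table, name):
    """One attribute, collected across every record."""
    position = COLUMNS.index(name)
    return [record[position] for record in table]


print(row(TABLE, 2))
# {'YEAR': 1974, 'WEEK': 42, 'TITLE': 'Nothing From Nothing', 'ARTIST': 'Billy Preston'}
print(column(TABLE, "WEEK"))
# [27, 28, 42]
```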


Tables of Tables

๏ Attributes?
  - picture
  - sale or not?
  - list price
  - sale price
  - shipping info
  - ratings
  - more options
  - name
  - description


Changing Tables

๏ We’ve looked at querying the database, but what about modifying it? That’s done through transactions, like
  - add a row
    • new item is available for sale
  - delete a row
    • an item is no longer available
  - change an element in a row
    • an attribute changes (new price)
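The three kinds of change listed above correspond to SQL's insert, delete, and update statements. The sketch below uses Python's built-in sqlite3 module; the table name ITEMS, its columns, and the helper `prices` are hypothetical, and the "butler" item with its $419 and $399 prices is borrowed from the lecture's shipping example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table ITEMS (NAME text primary key, PRICE integer)")


def prices(conn):
    """Snapshot of the table as a {name: price} dictionary."""
    return dict(conn.execute("select NAME, PRICE from ITEMS"))


# Add a row: a new item is available for sale.
conn.execute("insert into ITEMS values (?, ?)", ("butler", 419))
after_add = prices(conn)

# Change an element in a row: the price attribute changes.
conn.execute("update ITEMS set PRICE = ? where NAME = ?", (399, "butler"))
after_change = prices(conn)

# Delete a row: the item is no longer available.
conn.execute("delete from ITEMS where NAME = ?", ("butler",))
after_delete = prices(conn)

print(after_add, after_change, after_delete)
# {'butler': 419} {'butler': 399} {}
```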


Messed Up Table

๏ How do we make sure a transaction doesn’t mess up the database permanently?
  - Free shipping on items over $400.
  - Change price of item
  - change shipping information.

butler $419 free shipping
butler $399 free shipping
butler $399 shipping costs


Write-Ahead Logging

๏ Before messing with the database, write down what you are going to do in a safe place (a “log” stored on disk).
๏ Make sure each operation is “idempotent” (can be repeated without changing the result).
๏ If a crash happens, rerun all the operations in the log. Can’t hurt!
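A toy illustration of why idempotent log entries make recovery safe. This is only a sketch of the idea, not how a real database implements write-ahead logging: the log is an in-memory list of JSON strings rather than a file on disk, and the names `apply_op` and `recover` are made up. Each entry overwrites a field with a fixed value ("set price to 399"), which is idempotent, unlike "subtract 20 from price".

```python
import copy
import json


def apply_op(db, op):
    """Apply one idempotent operation: overwrite a field with a fixed value."""
    db.setdefault(op["item"], {})[op["field"]] = op["value"]


def recover(db, log_lines):
    """After a crash, rerun every logged operation in order."""
    for line in log_lines:
        apply_op(db, json.loads(line))
    return db


# Step 1: write down the planned changes before touching the database.
log = [
    json.dumps({"item": "butler", "field": "price", "value": 399}),
    json.dumps({"item": "butler", "field": "shipping", "value": "shipping costs"}),
]

# Step 2: start applying them, then "crash" after the first operation.
db = {"butler": {"price": 419, "shipping": "free shipping"}}
apply_op(db, json.loads(log[0]))  # price changed, shipping still stale

# Step 3: recovery reruns the whole log, including the part already applied.
recover(db, log)
after_first_recovery = copy.deepcopy(db)

# Rerunning the log again changes nothing: "Can't hurt!"
recover(db, log)
print(db == after_first_recovery)  # True
print(db)  # {'butler': {'price': 399, 'shipping': 'shipping costs'}}
```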


Fixing Up The Table

๏ Transaction details:
  - change butler price from $419 to $399
  - change butler shipping from free
  - end transaction

butler $419 free shipping
butler $399 free shipping
butler $399 shipping costs
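The butler transaction can be sketched with Python's built-in sqlite3 module, where a `with conn:` block commits if every step succeeds and rolls back if an exception escapes. The table name ITEMS, its columns, the function `reprice`, and the `fail_midway` switch that simulates a crash are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table ITEMS (NAME text primary key, PRICE integer, SHIPPING text)")
conn.execute("insert into ITEMS values ('butler', 419, 'free shipping')")
conn.commit()


def reprice(conn, name, price, shipping, fail_midway=False):
    """Change price and shipping together; return the row as it ends up."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("update ITEMS set PRICE = ? where NAME = ?", (price, name))
            if fail_midway:
                raise RuntimeError("simulated crash between the two changes")
            conn.execute("update ITEMS set SHIPPING = ? where NAME = ?", (shipping, name))
    except RuntimeError:
        pass  # the transaction was rolled back; nothing was changed
    return conn.execute(
        "select NAME, PRICE, SHIPPING from ITEMS where NAME = ?", (name,)
    ).fetchone()


# A failure halfway through leaves the original row, not the mixed-up middle state.
print(reprice(conn, "butler", 399, "shipping costs", fail_midway=True))
# ('butler', 419, 'free shipping')

# Without the failure, both changes land together.
print(reprice(conn, "butler", 399, "shipping costs"))
# ('butler', 399, 'shipping costs')
```

The inconsistent middle row (butler, $399, free shipping) is never what a reader of the table ends up with after the transaction finishes or fails.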


Prepare Then Commit

๏ What happens if we can’t complete a transaction?
๏ Roll-back/undo: make sure each step in a transaction can be undone.
๏ If we have multiple copies of our database and one can’t complete, others need to know so the transaction fails as opposed to leaving the replicas unsynchronized.
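The prepare-then-commit idea (commonly called two-phase commit) can be sketched in a few lines. Everything here is hypothetical and simplified: the `Replica` class, its `healthy` flag standing in for "can't complete", and `run_transaction` as the coordinator. Real systems also have to handle crashes of the coordinator itself, which this sketch ignores.

```python
class Replica:
    """One copy of the database, holding a single item's attributes."""

    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = dict(data)
        self.healthy = healthy
        self.staged = None

    def prepare(self, changes):
        """Phase 1: stage the changes and vote; an unhealthy replica votes no."""
        if not self.healthy:
            return False
        self.staged = dict(changes)
        return True

    def commit(self):
        """Phase 2 (all voted yes): make the staged changes permanent."""
        self.data.update(self.staged)
        self.staged = None

    def rollback(self):
        """Phase 2 (someone voted no): discard the staged changes."""
        self.staged = None


def run_transaction(replicas, changes):
    """Commit on every replica or on none of them."""
    votes = [replica.prepare(changes) for replica in replicas]
    if all(votes):
        for replica in replicas:
            replica.commit()
        return True
    for replica in replicas:
        replica.rollback()
    return False


start = {"price": 419, "shipping": "free shipping"}
change = {"price": 399, "shipping": "shipping costs"}

good = [Replica("a", start), Replica("b", start)]
print(run_transaction(good, change), [r.data["price"] for r in good])
# True [399, 399]

mixed = [Replica("a", start), Replica("b", start, healthy=False)]
print(run_transaction(mixed, change), [r.data["price"] for r in mixed])
# False [419, 419]
```

In the second run the healthy replica had already staged the change, but because its peer could not prepare, it discards the change and the copies stay in sync.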


The ACID Test

A set of properties that guarantee that database transactions are processed reliably.

๏ Atomicity: Transactions are “all or nothing” (no partial changes).
๏ Consistency: All transactions bring the database from one valid state to another.
๏ Isolation: Concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially.
๏ Durability: Completed transactions remain so, even in the event of power loss, crashes, or errors.