LIS512 lecture 2 relational databases
![Page 1: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/1.jpg)
LIS512 lecture 2
relational databases
Thomas Krichel
2010-02-03
![Page 2: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/2.jpg)
relational databases
• A relational database is a set of tables. There may be relations between the tables.
• Each table has a number of records. Each record has a number of fields.
• Let us look at this bottom up.
![Page 3: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/3.jpg)
the entity
• We saw last time that an entity is something we want to gather information about.
• An entity may be a prime focus of interest, like group 1 entities in FRBR.
• Or an entity may be of interest only because it has a relationship to entities that are the prime focus, like FRBR group 2 and 3 entities.
![Page 4: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/4.jpg)
entity set and entity
• In fact the talk in FRBR is a bit loose.
• When FRBR talks about entities, it really means entity sets.
• FRBR says that work is an entity. It really means that there are works, and each work is something of interest.
• “work” is a conceptual entity set, and each work is an entity.
![Page 5: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/5.jpg)
attributes
• An entity has attributes.
• These are things we want to know about.
• Say a person has a name, birthday, height and weight.
• We can use that data to find out if the person is overweight.
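As a rough illustration, a person entity and its attributes can be sketched as a Python dataclass. The class name `Person`, the units (metres and kilograms), the sample values, and the use of the body mass index with a threshold of 25 are all assumptions made for this sketch; the slide only says that the data can be used to decide whether a person is overweight.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    """One entity; each field is an attribute we want to know about."""
    name: str
    birthday: date
    height_m: float   # assumed unit: metres
    weight_kg: float  # assumed unit: kilograms


def is_overweight(person: Person, threshold: float = 25.0) -> bool:
    """Assumed rule: body mass index (kg / m^2) at or above the threshold."""
    bmi = person.weight_kg / person.height_m ** 2
    return bmi >= threshold


# Hypothetical sample values, for illustration only.
sample = Person("Sample Person", date(1970, 1, 1), height_m=1.80, weight_kg=90.0)
sample_is_overweight = is_overweight(sample)
```

The point is only that an answer we care about is derived from stored attributes; the rule itself could be swapped for any other.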
![Page 6: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/6.jpg)
records
• A record is a collection of data elements and their values that pertain to one entity that is being described.
  – Name Thomas Krichel
  – Birthday 1965-06-05
• Data element names can also be called “attribute names” or “field names”. These terms are pretty much synonymous.
![Page 7: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/7.jpg)
record syntax
• The way to write out a record (its syntax) may vary.
• Attribute: value syntax
  – Name: Thomas Krichel
  – Birthday: 1965-06-05
• XML syntax
  – <name>Thomas Krichel</name>
  – <birthday>1965-06-05</birthday>
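The idea that one record can be written in several syntaxes can be sketched in Python. The function names `to_attribute_value` and `to_xml` are hypothetical, and lowercasing the field names to form XML tags is an assumption that happens to match the slide's example.

```python
from xml.sax.saxutils import escape

# One record: data element names mapped to values (taken from the slide).
record = {"Name": "Thomas Krichel", "Birthday": "1965-06-05"}


def to_attribute_value(rec: dict) -> str:
    """Write the record in 'Attribute: value' syntax, one line per field."""
    return "\n".join(f"{name}: {value}" for name, value in rec.items())


def to_xml(rec: dict) -> str:
    """Write the record as XML elements, one per field (lowercased tags)."""
    return "\n".join(
        f"<{name.lower()}>{escape(value)}</{name.lower()}>"
        for name, value in rec.items()
    )


attribute_value_text = to_attribute_value(record)
xml_text = to_xml(record)
```

Both outputs carry the same data; only the way it is written out differs.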
![Page 8: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/8.jpg)
tables
• In a database, records are represented as lines in a table. The first line of the table lists the field names. Here is a sample table with two records:

| name | birthday |
|---|---|
| Thomas Krichel | 1965-06-05 |
| Karl Marx | 1818-05-13 |
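A minimal sketch of the same table in an actual database, using Python's built-in `sqlite3` module. The choice of SQLite, the table name `person`, and storing the birthday as text are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE person (name TEXT, birthday TEXT)")
conn.executemany(
    "INSERT INTO person (name, birthday) VALUES (?, ?)",
    [("Thomas Krichel", "1965-06-05"), ("Karl Marx", "1818-05-13")],
)

cursor = conn.execute("SELECT name, birthday FROM person ORDER BY rowid")
# The field names play the role of the first line of the table.
field_names = [column[0] for column in cursor.description]
# Each remaining line of the table is one record.
rows = cursor.fetchall()
```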
![Page 9: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/9.jpg)
identity
• Consider the following table

| name | birthday |
|---|---|
| Thomas Krichel | 1965-06-05 |
| Thomas Krichel | 1965-06-05 |

• Is the record a duplicate?
• Without further information, we cannot tell.
![Page 10: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/10.jpg)
key
• To be able to find what records describe the same entities, databases use a construct called the key.
• The key for each record must have a different value.
• The name of the key field is not important, but it is important that the field is a key.
![Page 11: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/11.jpg)
examples
• Here is a correct example

| key | name | birthday |
|---|---|---|
| 2 | Thomas Krichel | 1965-06-05 |
| 1 | Thomas Krichel | 1965-06-05 |

• Here is an incorrect example

| key | name | birthday |
|---|---|---|
| 1 | Thomas Krichel | 1965-06-05 |
| 1 | Thomas Krichel | 1965-06-05 |
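The correct and incorrect examples can be tried out with Python's `sqlite3` module. This is a sketch under assumptions: SQLite as the database, `person` as the table name, and `PRIMARY KEY` as the way the key is declared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "key" is quoted because it is also an SQL keyword; the name itself is unimportant.
conn.execute(
    'CREATE TABLE person ("key" INTEGER PRIMARY KEY, name TEXT, birthday TEXT)'
)

# Correct example: two records with different key values.
conn.execute("INSERT INTO person VALUES (2, 'Thomas Krichel', '1965-06-05')")
conn.execute("INSERT INTO person VALUES (1, 'Thomas Krichel', '1965-06-05')")

# Incorrect example: reusing key value 1 is rejected by the database.
try:
    conn.execute("INSERT INTO person VALUES (1, 'Thomas Krichel', '1965-06-05')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

record_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

The two stored records have the same name and birthday, but the distinct key values tell the database that they are to be treated as two different records.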
![Page 12: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/12.jpg)
identity
• Identity is simple when it comes to persons.
• But it is harder when it comes to the other entities we are interested in in the FRBR world.
• Identifiers and the handling of them is one of the most difficult issues of digital librarianship.
![Page 13: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/13.jpg)
identifier schemes
• There are many identifier schemes for group 1 entities
  – ISBN
  – ISSN
  – DOI
  – URL
• Some of them may cover different manifestations / expressions.
• For example, is the ISBN for a print book the same as for the ebook?
![Page 14: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/14.jpg)
relations
• Not all database systems are relational.
• The most widely used ones are relational.
• That means you can build relationships between tables and enforce them.
• I will come to this through an old example.
![Page 15: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/15.jpg)
example: Movie database

| ID | title | director | date |
|---|---|---|---|
| M1 | Gone with the wind | F. Ford Coppola | 1963 |
| M2 | Room with a view | Coppola, F Ford | 1985 |
| M3 | High Noon | Woody Allan | 1974 |
| M4 | Star Wars | Steve Spielberg | 1993 |
| M5 | Alien | Allen, Woody | 1987 |
| M6 | Blowing in the Wind | Spielberg, Steven | 1962 |

• Single table
• No relations between tables, of course
![Page 16: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/16.jpg)
problem with this database
• All the data is wrong, but this is just for illustration.
• Names are recorded inconsistently. There is no way to find films by Woody Allan without having to go through all spelling variations.
• Mistakes are difficult to correct. We have to wade through all records, a masochist’s pleasure.
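The spelling problem can be made concrete with a small sketch using Python's `sqlite3` module. The choice of SQLite and the table name `movie` are assumptions; the data is the deliberately wrong data from the single-table example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie (id TEXT PRIMARY KEY, title TEXT, director TEXT, date INTEGER)"
)
conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?, ?)",
    [
        ("M1", "Gone with the wind", "F. Ford Coppola", 1963),
        ("M2", "Room with a view", "Coppola, F Ford", 1985),
        ("M3", "High Noon", "Woody Allan", 1974),
        ("M4", "Star Wars", "Steve Spielberg", 1993),
        ("M5", "Alien", "Allen, Woody", 1987),
        ("M6", "Blowing in the Wind", "Spielberg, Steven", 1962),
    ],
)

# Searching for one spelling finds only the records that use that spelling.
exact_match = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM movie WHERE director = ? ORDER BY id", ("Woody Allan",)
    )
]

# Finding all of them means listing every spelling variation by hand.
all_spellings = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM movie WHERE director IN (?, ?) ORDER BY id",
        ("Woody Allan", "Allen, Woody"),
    )
]
```

Here `exact_match` misses M5, and the second query only works because somebody already went through the table to collect the variants.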
![Page 17: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/17.jpg)
Better movie database

| ID | title | director | year |
|---|---|---|---|
| M1 | Gone with the wind | D1 | 1963 |
| M2 | Room with a view | D1 | 1985 |
| M3 | High Noon | D2 | 1974 |
| M4 | Star Wars | D3 | 1993 |
| M5 | Alien | D2 | 1987 |
| M6 | Blowing in the Wind | D3 | 1962 |

| ID | director name | birth year |
|---|---|---|
| D1 | Ford Coppola, Francis | 1942 |
| D2 | Allan, Woody | 1957 |
| D3 | Spielberg, Steven | 1942 |
![Page 18: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/18.jpg)
Relational database
• We have a one to many relationship between directors and films
  – Each film has one director
  – Each director may have directed many films
• Here it becomes possible for the computer
  – To know which films have been directed by Woody Allen
  – To find which films have been directed by a director born in 1942
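Both questions can be answered with a join across the two tables of the better movie database. The sketch below uses Python's `sqlite3` module; the choice of SQLite and the table and column names (`movie`, `director`, `birth_year`) are assumptions, and the data is the illustrative data from the slides, including its spelling of the director's name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE director (id TEXT PRIMARY KEY, name TEXT, birth_year INTEGER);
CREATE TABLE movie (
    id TEXT PRIMARY KEY,
    title TEXT,
    director TEXT REFERENCES director(id),  -- one director per film
    year INTEGER
);
INSERT INTO director VALUES
    ('D1', 'Ford Coppola, Francis', 1942),
    ('D2', 'Allan, Woody', 1957),
    ('D3', 'Spielberg, Steven', 1942);
INSERT INTO movie VALUES
    ('M1', 'Gone with the wind', 'D1', 1963),
    ('M2', 'Room with a view', 'D1', 1985),
    ('M3', 'High Noon', 'D2', 1974),
    ('M4', 'Star Wars', 'D3', 1993),
    ('M5', 'Alien', 'D2', 1987),
    ('M6', 'Blowing in the Wind', 'D3', 1962);
""")

JOIN = "SELECT movie.title FROM movie JOIN director ON movie.director = director.id "

# Films by one director: the name is stored once, so one spelling is enough.
films_by_d2 = [
    row[0]
    for row in conn.execute(
        JOIN + "WHERE director.name = ? ORDER BY movie.id", ("Allan, Woody",)
    )
]

# Films by any director born in 1942.
films_by_1942_directors = [
    row[0]
    for row in conn.execute(
        JOIN + "WHERE director.birth_year = ? ORDER BY movie.id", (1942,)
    )
]
```

Correcting the spelling of a director's name is now a change to a single record in the `director` table.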
![Page 19: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/19.jpg)
enforcing relationships
• Relational database software has ways to enforce relationships.
• So when you change the record of M5 to say it was directed by director D9, it will complain that no such director has been defined.
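The complaint about director D9 can be reproduced in a sketch with Python's `sqlite3` module. SQLite is an assumed choice of software here; it only enforces foreign keys after `PRAGMA foreign_keys = ON`, and other database systems differ in the details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE director (id TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE movie ("
    "id TEXT PRIMARY KEY, title TEXT, "
    "director TEXT NOT NULL REFERENCES director(id))"
)
conn.execute("INSERT INTO director VALUES ('D2', 'Allan, Woody')")
conn.execute("INSERT INTO movie VALUES ('M5', 'Alien', 'D2')")

# Try to say that M5 was directed by D9, a director that has not been defined.
try:
    conn.execute("UPDATE movie SET director = 'D9' WHERE id = 'M5'")
    update_rejected = False
except sqlite3.IntegrityError:
    update_rejected = True

# The record still points at the original director.
director_of_m5 = conn.execute(
    "SELECT director FROM movie WHERE id = 'M5'"
).fetchone()[0]
```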
![Page 20: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/20.jpg)
Many-to-many relationships
• Each film has one director, but many actors star in it. The relationship between actors and films is a many to many relationship.
• Here are a few actors

| ID | sex | actor name | birth year |
|---|---|---|---|
| A1 | f | Brigitte Bardot | 1972 |
| A2 | m | George Clooney | 1927 |
| A3 | f | Marilyn Monroe | 1934 |
![Page 21: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/21.jpg)
Actor/Movie table
| actor id | movie id |
|---|---|
| A1 | M4 |
| A2 | M3 |
| A3 | M2 |
| A1 | M5 |
| A1 | M3 |
| A2 | M6 |
| A3 | M4 |
… as many lines as required
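A sketch of how the actor/movie table is used, again with Python's `sqlite3` module. The choice of SQLite and the table and column names (`actor`, `movie`, `actor_movie`, `birth_year`) are assumptions; the data is the illustrative data from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actor (id TEXT PRIMARY KEY, sex TEXT, name TEXT, birth_year INTEGER);
CREATE TABLE movie (id TEXT PRIMARY KEY, title TEXT);
-- One line per (actor, movie) pair; the pair itself is the key.
CREATE TABLE actor_movie (
    actor_id TEXT REFERENCES actor(id),
    movie_id TEXT REFERENCES movie(id),
    PRIMARY KEY (actor_id, movie_id)
);
INSERT INTO actor VALUES
    ('A1', 'f', 'Brigitte Bardot', 1972),
    ('A2', 'm', 'George Clooney', 1927),
    ('A3', 'f', 'Marilyn Monroe', 1934);
INSERT INTO movie VALUES
    ('M1', 'Gone with the wind'), ('M2', 'Room with a view'),
    ('M3', 'High Noon'), ('M4', 'Star Wars'),
    ('M5', 'Alien'), ('M6', 'Blowing in the Wind');
INSERT INTO actor_movie VALUES
    ('A1', 'M4'), ('A2', 'M3'), ('A3', 'M2'), ('A1', 'M5'),
    ('A1', 'M3'), ('A2', 'M6'), ('A3', 'M4');
""")

# Many actors per film: who stars in M4?
actors_in_m4 = [
    row[0]
    for row in conn.execute(
        "SELECT actor.name FROM actor "
        "JOIN actor_movie ON actor_movie.actor_id = actor.id "
        "WHERE actor_movie.movie_id = ? ORDER BY actor.id",
        ("M4",),
    )
]

# Many films per actor: which films does A1 star in?
films_of_a1 = [
    row[0]
    for row in conn.execute(
        "SELECT movie.title FROM movie "
        "JOIN actor_movie ON actor_movie.movie_id = movie.id "
        "WHERE actor_movie.actor_id = ? ORDER BY movie.id",
        ("A1",),
    )
]
```

The linking table holds nothing but pairs of keys, so adding another actor to a film is just one more line.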
![Page 22: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/22.jpg)
relational databases
• Relational databases are powerful within organizations that have a relatively centralized command and control structure, e.g. within a company and / or a government department.
• The relational database model has problems when we are working in a coordinated fashion bet
• Imagine the web working on a relational database model!
![Page 23: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/23.jpg)
FRBR
• FRBR really brings the way relational databases work to the library world.
• But here is the problem: how to implement this.
• Imagine what a totally FRBRized, relational database library would look like.
• It would be fine if there were one Big Library Agency (BLA).
• What would BLA do?
![Page 24: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/24.jpg)
BLA job
• At the start, BLA would first have to register all the works, and say what a work is.
• That would include maintaining catalogs of composite works.
• This job is problematic because of
  – the vague nature of the work
  – the scale of the job (who will pay)
![Page 25: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/25.jpg)
currently
• Libraries purchase items.
• They can look up manifestation data from providers such as OCLC.
• Finding out what expression this represents is more difficult.
• Linking it to a work is almost impossible. One would need BLA for that.
![Page 26: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/26.jpg)
a web decentralized digital library
• Something that computer scientists dream of: http://www.youtube.com/watch?v=OM6XIICm_qo
• He completely avoids the idea of the meaning of data.
• In a centralized organization, the meaning can be dictated.
• In a decentralized world, the meaning has to be agreed upon.
![Page 27: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/27.jpg)
outlook
• In the next few weeks, we are looking at emerging ways to try to find some agreement
  – character set
  – record format
• Then we are looking at library related standards doing similar things.
![Page 28: LIS512 lecture 2 relational databases](https://reader035.fdocuments.us/reader035/viewer/2022081520/5681692e550346895de07259/html5/thumbnails/28.jpg)
http://openlib.org/home/krichel
Thank you for your attention!
Please switch off machines b4 leaving!