Post on 16-Dec-2015
Slides adapted from Rao (ASU) & Franklin (Berkeley)
Functionality of a DBMS
• Data Dictionary Management
• Storage management
– Data Definition Language (DDL)
• High level query and data manipulation language
– SQL/XQuery etc.
– May tell us what we are missing in text-based search
• Efficient query processing
– May change in the internet scenario
• Transaction processing
• Resiliency: recovery from crashes
• Different views of the data, security
– May be useful to model a collection of databases together
• Interface with programming languages
Building an Application with a Database System
• Requirements modeling (conceptual, pictures)
– Decide what entities should be part of the application and how they should be linked.
• Schema design and implementation
– Decide on a set of tables, attributes.
– Define the tables in the database system.
– Populate database (insert tuples).
• Write application programs using the DBMS
– Now much easier, with data management API
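The schema design and implementation steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the choice of SQLite is an assumption, and the table and rows are borrowed from the Students example used elsewhere in these slides.

```python
import sqlite3

# In-memory database, used here purely for illustration.
conn = sqlite3.connect(":memory:")

# Step 1: define the table in the database system.
conn.execute("CREATE TABLE Students (student TEXT, course TEXT, quarter TEXT)")

# Step 2: populate the database (insert tuples).
rows = [("Charles", "CS 444", "Fall, 1997"), ("Dan", "CS 142", "Winter, 1998")]
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", rows)

# Step 3: the application program talks to the DBMS through its API.
count = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(count)
```

The application never touches files directly; it only issues statements through the connection object.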
Conceptual Modeling
[Figure: entity-relationship diagram with entities Professor, Student, and Course; relationships Advises, Takes, and Teaches; attribute labels ssn, address, name, field, category, quarter.]
Data Models
• A data model is a collection of concepts for describing data.
• A schema is a description of a particular collection of data, using a given data model.
• The relational model of data is the most widely used model today.
– Main concept: relation, basically a table with rows and columns.
– Every relation has a schema, which describes the columns, or fields.
Levels of Abstraction
• Views describe how users see the data.
• Conceptual schema defines logical structure.
• Physical schema describes the files and indexes used.
[Figure: layered diagram, top to bottom: View 1, View 2, View 3; Conceptual Schema; Physical Schema; DB.]
Example: University Database
• Conceptual schema:
– Students(sid: string, name: string, login: string, age: integer, gpa: real)
– Courses(cid: string, cname: string, credits: integer)
• External Schema (View):
– Course_info(cid: string, enrollment: integer)
• Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.
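A minimal sketch of the three levels in SQLite, assuming a hypothetical Enrolled(sid, cid) relation that the slide does not list but that Course_info needs in order to count enrollments. The index name and sample rows are also made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema (from the slide).
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses (cid TEXT, cname TEXT, credits INTEGER);
-- Assumption: a relation linking students to courses (not on the slide).
CREATE TABLE Enrolled (sid TEXT, cid TEXT);
-- Physical schema choice: index on the first column of Students.
CREATE INDEX idx_students_sid ON Students (sid);
-- External schema: a view exposing only per-course enrollment counts.
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

conn.executemany("INSERT INTO Enrolled VALUES (?, ?)",
                 [("s1", "c1"), ("s2", "c1"), ("s1", "c2")])

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(course_info)
```

Users of Course_info see enrollment counts without knowing which tables or indexes sit underneath.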
If five people are asked to come up with a schema for the data, what are the odds that they will come up with the same schema?
Data Independence
• Applications insulated from how data is structured and stored.
• Logical data independence: Protection from changes in logical structure of data.
• Physical data independence: Protection from changes in physical structure of data.
• Q: Why are these particularly important for DBMS?
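Physical data independence can be illustrated with a small experiment: the same query text returns the same answer before and after the physical schema changes. This is a sketch using SQLite; the table contents and index name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)",
                 [("s1", "Ann", 3.9), ("s2", "Bob", 3.1), ("s3", "Cy", 3.6)])

# The application's query; it never mentions files or indexes.
query = "SELECT name FROM Students WHERE gpa > 3.5 ORDER BY name"
before = conn.execute(query).fetchall()

# Change only the physical schema: add an index on gpa.
conn.execute("CREATE INDEX idx_students_gpa ON Students (gpa)")
after = conn.execute(query).fetchall()

print(before == after)
```

The application code is untouched by the new index; only the way the DBMS may evaluate the query changes.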
Schema Design & Implementation
• Table Students
• Separates the logical view from the physical view of the data.
Student   Course   Quarter
Charles   CS 444   Fall, 1997
Dan       CS 142   Winter, 1998
…         …        …
Terminology
• Relation: Students (Arity=3)
• Attribute names: the column headers (Student, Course, Quarter)
• Tuples: the rows
Student   Course   Quarter
Charles   CS 444   Fall, 1997
Dan       CS 142   Winter, 1998
…         …        …
Querying a Database
• Find all the students taking CS490i in Winter, 2000
• S(tructured) Q(uery) L(anguage)
select E.name
from Enroll E
where E.course = 'CS490i' and E.quarter = 'Winter, 2000'
• Query processor figures out how to answer the query efficiently.
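To try the query, here is a small sketch in SQLite. The Enroll(name, course, quarter) layout is inferred from the query itself, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Enroll (name TEXT, course TEXT, quarter TEXT)")
conn.executemany("INSERT INTO Enroll VALUES (?, ?, ?)", [
    ("Charles", "CS490i", "Winter, 2000"),
    ("Dan", "CS490i", "Fall, 1999"),
    ("Eve", "CS142", "Winter, 2000"),
])

# String constants are passed as parameters, so no quoting issues arise.
names = conn.execute(
    "SELECT E.name FROM Enroll E "
    "WHERE E.course = ? AND E.quarter = ?",
    ("CS490i", "Winter, 2000"),
).fetchall()
print(names)
```

The query only states what is wanted; how the rows are located is left to the query processor.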
Example: Projection Onto SSN, Name
Employee
SSN        Name   DepartmentID  Salary
999999999  John   1             30,000
777777777  Tony   1             32,000
888888888  Alice  2             45,000
Result of the projection:
SSN        Name
999999999  John
777777777  Tony
888888888  Alice
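In SQL, projection corresponds to listing the wanted attributes in the SELECT clause. A minimal sketch in SQLite using the Employee rows from the example (storing SSN as text is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (SSN TEXT, Name TEXT, DepartmentID INTEGER, Salary INTEGER)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    ("999999999", "John", 1, 30000),
    ("777777777", "Tony", 1, 32000),
    ("888888888", "Alice", 2, 45000),
])

# Projection onto SSN, Name: keep two columns, drop the rest.
projected = conn.execute("SELECT SSN, Name FROM Employee").fetchall()
print(projected)
```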
Cartesian Product ×
• Binary Operation
• Result is set of tuples combining all elements of R1 with all elements of R2, for R1 × R2
• Schema is union of Schema(R1) & Schema(R2)
• Notice we could do selection on result to get meaningful info!
Copyright © 2000 D.S. Weld (modified by Rao)
Cartesian Product Example
Employee
Name  SSN
John  999999999
Tony  777777777
Dependents
EmployeeSSN  Dname
999999999    Emily
777777777    Joe
Employee_Dependents
Name  SSN        EmployeeSSN  Dname
John  999999999  999999999    Emily
John  999999999  777777777    Joe
Tony  777777777  999999999    Emily
Tony  777777777  777777777    Joe
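The example can be reproduced in SQLite, along with the follow-up selection that keeps only the meaningful pairs. This is a sketch; storing SSNs as text is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Name TEXT, SSN TEXT);
CREATE TABLE Dependents (EmployeeSSN TEXT, Dname TEXT);
INSERT INTO Employee VALUES ('John', '999999999');
INSERT INTO Employee VALUES ('Tony', '777777777');
INSERT INTO Dependents VALUES ('999999999', 'Emily');
INSERT INTO Dependents VALUES ('777777777', 'Joe');
""")

# Cartesian product: every Employee tuple paired with every Dependents tuple.
product = conn.execute("SELECT * FROM Employee CROSS JOIN Dependents").fetchall()

# Selection on the product keeps only the pairs that belong together.
related = conn.execute(
    "SELECT Name, Dname FROM Employee, Dependents WHERE SSN = EmployeeSSN").fetchall()

print(len(product), related)
```

Two tuples times two tuples gives four combinations, of which only two survive the selection.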
Join
• Most common (and exciting!) operator…
• Combines 2 relations
– Selecting only related tuples
• Result has all attributes of the two relations
• Equivalent to
– Cross product followed by selection followed by projection
• Equijoin
– Join condition is equality between two attributes
• Natural join
– Equijoin on attributes of same name
– Result has only one copy of join condition attribute
Example: Natural Join
Employee
Name  SSN
John  999999999
Tony  777777777
Dependents
SSN        Dname
999999999  Emily
777777777  Joe
Employee ⋈ Dependents
Employee_Dependents
Name  SSN        Dname
John  999999999  Emily
Tony  777777777  Joe
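SQLite supports NATURAL JOIN directly, so the example can be checked with a short sketch (SSNs stored as text by assumption). Note that the shared SSN column appears only once in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Name TEXT, SSN TEXT);
CREATE TABLE Dependents (SSN TEXT, Dname TEXT);
INSERT INTO Employee VALUES ('John', '999999999');
INSERT INTO Employee VALUES ('Tony', '777777777');
INSERT INTO Dependents VALUES ('999999999', 'Emily');
INSERT INTO Dependents VALUES ('777777777', 'Joe');
""")

# Natural join: equijoin on the attribute both relations share (SSN).
cur = conn.execute("SELECT * FROM Employee NATURAL JOIN Dependents")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()

print(columns)
print(rows)
```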
Complex Queries
Product (pname, price, category, maker)
Purchase (buyer, seller, store, prodname)
Company (cname, stock price, country)
Person (per-name, phone number, city)
Find phone numbers of people who bought gizmos from Fred.
Find telephony products that somebody bought.
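One possible way to write the two queries, sketched in SQLite. Several things here are assumptions: attribute names are rewritten with underscores (per_name, phone_number) because hyphens and spaces are not valid in unquoted SQL identifiers, "gizmos" is taken to mean purchases whose prodname is 'gizmo', and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (pname TEXT, price REAL, category TEXT, maker TEXT);
CREATE TABLE Purchase (buyer TEXT, seller TEXT, store TEXT, prodname TEXT);
CREATE TABLE Person (per_name TEXT, phone_number TEXT, city TEXT);
INSERT INTO Product VALUES ('gizmo', 10.0, 'gadgets', 'GizmoWorks');
INSERT INTO Product VALUES ('phone', 50.0, 'telephony', 'TelCo');
INSERT INTO Product VALUES ('fax', 80.0, 'telephony', 'TelCo');
INSERT INTO Purchase VALUES ('Ann', 'Fred', 'StoreA', 'gizmo');
INSERT INTO Purchase VALUES ('Bob', 'Joe', 'StoreB', 'gizmo');
INSERT INTO Purchase VALUES ('Bob', 'Fred', 'StoreA', 'phone');
INSERT INTO Person VALUES ('Ann', '555-1000', 'Seattle');
INSERT INTO Person VALUES ('Bob', '555-2000', 'Portland');
""")

# Phone numbers of people who bought gizmos from Fred.
phones = conn.execute("""
    SELECT DISTINCT Person.phone_number
    FROM Person, Purchase
    WHERE Person.per_name = Purchase.buyer
      AND Purchase.seller = 'Fred'
      AND Purchase.prodname = 'gizmo'
""").fetchall()

# Telephony products that somebody bought.
bought_telephony = conn.execute("""
    SELECT DISTINCT Product.pname
    FROM Product, Purchase
    WHERE Product.pname = Purchase.prodname
      AND Product.category = 'telephony'
""").fetchall()

print(phones, bought_telephony)
```

Both queries follow the same pattern: a join to connect two relations, then selections to narrow the result.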
Exercises
Product (pname, price, category, maker)
Purchase (buyer, seller, store, prodname)
Company (cname, stock price, country)
Person (per-name, phone number, city)
Ex #1: Find people who bought telephony products.
Ex #2: Find names of people who bought American products.
Ex #3: Find names of people who bought American products and did not buy French products.
Ex #4: Find names of people who bought American products and live in Seattle.
Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock price is more than $50.
SQL Introduction
Standard language for querying and manipulating data
Structured Query Language
Many standards out there: SQL92 (also called SQL2), SQL99 (also called SQL3)
Vendors support various subsets of these (but we'll only discuss a subset of what they support)
Basic form = syntax on top of relational algebra (but many other features too)
Select attributes
From relations (possibly multiple, joined)
Where conditions (selections)
Selections σ
SELECT *
FROM Company
WHERE country='USA' AND stockPrice > 50
You can use:
• Attribute names of the relation(s) used in the FROM.
• Comparison operators: =, <>, <, >, <=, >=
• Arithmetic operations: stockPrice*2
• Operations on strings (e.g., "||" for concatenation).
• Lexicographic order on strings.
• Pattern matching: s LIKE p
• Special stuff for comparing dates and times.
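A few of these condition types can be tried out with a short SQLite sketch. The Company(name, stockPrice, country) layout follows the query on this slide; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Company (name TEXT, stockPrice REAL, country TEXT)")
conn.executemany("INSERT INTO Company VALUES (?, ?, ?)",
                 [("Acme", 60.0, "USA"), ("Apex", 40.0, "USA"), ("Bolt", 70.0, "Japan")])

# Comparison operators combined with AND.
us_high = conn.execute(
    "SELECT name FROM Company WHERE country = 'USA' AND stockPrice > 50").fetchall()

# Pattern matching: names starting with 'A'.
a_names = conn.execute(
    "SELECT name FROM Company WHERE name LIKE 'A%' ORDER BY name").fetchall()

# Arithmetic and string concatenation in the SELECT list.
derived = conn.execute(
    "SELECT name || ' (' || country || ')', stockPrice * 2 "
    "FROM Company WHERE name = 'Bolt'").fetchall()

print(us_high, a_names, derived)
```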
Projection π
Select only a subset of the attributes:
SELECT name, stockPrice
FROM Company
WHERE country='USA' AND stockPrice > 50
Rename the attributes in the resulting table:
SELECT name AS company, stockPrice AS price
FROM Company
WHERE country='USA' AND stockPrice > 50
Ordering the Results
SELECT name, stockPrice
FROM Company
WHERE country='USA' AND stockPrice > 50
ORDER BY country, name
Ordering is ascending, unless you specify the DESC keyword.
Ties are broken by the second attribute on the ORDER BY list, etc.
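A small sketch in SQLite showing tie-breaking and the DESC keyword. The rows are invented; two companies share a country so that the second sort attribute matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Company (name TEXT, stockPrice REAL, country TEXT)")
conn.executemany("INSERT INTO Company VALUES (?, ?, ?)",
                 [("Zeta", 60.0, "USA"), ("Acme", 55.0, "USA"), ("Bolt", 70.0, "Japan")])

# Ascending by country; the tie between the two USA rows is broken by name.
by_country_name = [r[0] for r in conn.execute(
    "SELECT name FROM Company ORDER BY country, name")]

# Descending order requires the DESC keyword.
by_price_desc = [r[0] for r in conn.execute(
    "SELECT name FROM Company ORDER BY stockPrice DESC")]

print(by_country_name, by_price_desc)
```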
Join
SELECT per-name, store
FROM Person, Purchase
WHERE per-name=buyer AND city='Seattle' AND product='gizmo'
Product (pname, price, category, maker)
Purchase (buyer, seller, store, product)
Company (cname, stock price, country)
Person (per-name, phone number, city)
Tuple Variables
Find pairs of companies making products in the same category
Product (name, price, category, maker)
SELECT product1.maker, product2.maker
FROM Product AS product1, Product AS product2
WHERE product1.category = product2.category
  AND product1.maker <> product2.maker
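The self-join with tuple variables can be tried out in SQLite as follows. The product rows are invented. The second query is a variation, not from the slide, that uses < instead of <> so each pair is listed only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (name TEXT, price REAL, category TEXT, maker TEXT)")
conn.executemany("INSERT INTO Product VALUES (?, ?, ?, ?)", [
    ("gizmo", 10.0, "gadgets", "GizmoWorks"),
    ("widget", 12.0, "gadgets", "WidgetCo"),
    ("phone", 50.0, "telephony", "TelCo"),
])

# Two tuple variables range over the same relation.
pairs = conn.execute("""
    SELECT product1.maker, product2.maker
    FROM Product AS product1, Product AS product2
    WHERE product1.category = product2.category
      AND product1.maker <> product2.maker
""").fetchall()

# Variation: '<' keeps one ordering of each pair.
unique_pairs = conn.execute("""
    SELECT DISTINCT product1.maker, product2.maker
    FROM Product AS product1, Product AS product2
    WHERE product1.category = product2.category
      AND product1.maker < product2.maker
""").fetchall()

print(pairs, unique_pairs)
```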
Defining Views
(Virtual) Views are relations, except that they are not physically stored.
They are used mostly in order to simplify complex queries and to define conceptually different views of the database to different classes of users.
View: purchases of telephony products:
CREATE VIEW telephony-purchases AS
SELECT product, buyer, seller, store
FROM Purchase, Product
WHERE Purchase.product = Product.name
  AND Product.category = 'telephony'
A Different View
CREATE VIEW Seattle-view AS
SELECT buyer, seller, product, store
FROM Person, Purchase
WHERE Person.city = 'Seattle'
  AND Person.name = Purchase.buyer
We can later use the views:
SELECT name, store
FROM Seattle-view, Product
WHERE Seattle-view.product = Product.name
  AND Product.category = 'shoes'
What's really happening when we query a view?
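One way to see what happens when a view is queried is to compare the view query against the same query written directly on the base tables. The sketch below uses SQLite; the view is named seattle_view (an assumption, since a hyphen is not valid in an unquoted SQL identifier), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (name TEXT, phone TEXT, city TEXT);
CREATE TABLE Purchase (buyer TEXT, seller TEXT, store TEXT, product TEXT);
CREATE TABLE Product (name TEXT, price REAL, category TEXT, maker TEXT);
INSERT INTO Person VALUES ('Ann', '555-1000', 'Seattle');
INSERT INTO Person VALUES ('Bob', '555-2000', 'Portland');
INSERT INTO Purchase VALUES ('Ann', 'Joe', 'Nine West', 'pump');
INSERT INTO Purchase VALUES ('Bob', 'Joe', 'Nine West', 'pump');
INSERT INTO Purchase VALUES ('Ann', 'Fred', 'StoreA', 'gizmo');
INSERT INTO Product VALUES ('pump', 80.0, 'shoes', 'ShoeCo');
INSERT INTO Product VALUES ('gizmo', 10.0, 'gadgets', 'GizmoWorks');
CREATE VIEW seattle_view AS
    SELECT buyer, seller, product, store
    FROM Person, Purchase
    WHERE Person.city = 'Seattle' AND Person.name = Purchase.buyer;
""")

# Query written against the view.
via_view = conn.execute("""
    SELECT Product.name, store
    FROM seattle_view, Product
    WHERE seattle_view.product = Product.name AND Product.category = 'shoes'
""").fetchall()

# The same query with the view definition substituted by hand.
expanded = conn.execute("""
    SELECT Product.name, Purchase.store
    FROM Person, Purchase, Product
    WHERE Person.city = 'Seattle' AND Person.name = Purchase.buyer
      AND Purchase.product = Product.name AND Product.category = 'shoes'
""").fetchall()

print(via_view, expanded)
```

Since a virtual view stores no data, the system effectively answers the first query by rewriting it into something like the second.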
Updating Views
How can I insert a tuple into a table that doesn't exist?
CREATE VIEW bon-purchase AS
SELECT store, seller, product
FROM Purchase
WHERE store = 'The Bon Marche'
If we make the following insertion:
INSERT INTO bon-purchase VALUES ('The Bon Marche', 'Joe', 'Denby Mug')
We can simply add a tuple (NULL, 'Joe', 'The Bon Marche', 'Denby Mug') to relation Purchase (buyer, seller, store, product), leaving the unknown buyer as NULL.
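How a view insertion is translated depends on the system. In SQLite, views are read-only unless an INSTEAD OF trigger spells out the translation, so the sketch below writes that rule by hand. The names bon_purchase and bon_purchase_insert are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Purchase (buyer TEXT, seller TEXT, store TEXT, product TEXT);
CREATE VIEW bon_purchase AS
    SELECT store, seller, product FROM Purchase WHERE store = 'The Bon Marche';
-- Translate an insert on the view into an insert on the base table,
-- filling the attribute the view does not expose (buyer) with NULL.
CREATE TRIGGER bon_purchase_insert INSTEAD OF INSERT ON bon_purchase
BEGIN
    INSERT INTO Purchase (buyer, seller, store, product)
    VALUES (NULL, NEW.seller, NEW.store, NEW.product);
END;
""")

conn.execute("INSERT INTO bon_purchase VALUES ('The Bon Marche', 'Joe', 'Denby Mug')")

base_rows = conn.execute("SELECT * FROM Purchase").fetchall()
view_rows = conn.execute("SELECT * FROM bon_purchase").fetchall()
print(base_rows, view_rows)
```

The store value must match the view's condition exactly, otherwise the new tuple lands in Purchase but never shows up in the view.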
Non-Updatable Views
Given
Purchase (buyer, seller, store, product)
Person (name, phone-num, city)
CREATE VIEW Seattle-view AS
SELECT seller, product, store
FROM Person, Purchase
WHERE Person.city = 'Seattle'
  AND Person.name = Purchase.buyer
How can we add the following tuple to the view?
('Joe', 'Shoe Model 12345', 'Nine West')
Materialized Views
• Views whose corresponding queries have been executed and the data is stored in a separate database
– Uses: Caching
• Issues
– Using views in answering queries
  • Normally, the views are available in addition to database (so, views are local caches)
  • In information integration, views may be the only things we have access to.
    – An internet source that specializes in Woody Allen movies can be seen as a view on a database of all movies. Except, there is no database out there which contains all movies.
– Maintaining consistency of materialized views
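The consistency issue can be demonstrated with a small sketch. SQLite has no built-in materialized views, so this emulates one by storing a query result in an ordinary table; the table names Movies and allen_movies and the refresh-by-recompute strategy are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movies (title TEXT, director TEXT);
INSERT INTO Movies VALUES ('Annie Hall', 'Woody Allen');
INSERT INTO Movies VALUES ('Jaws', 'Steven Spielberg');
-- "Materialize" the view: run its query once and store the result.
CREATE TABLE allen_movies AS
    SELECT title FROM Movies WHERE director = 'Woody Allen';
""")

# The base data changes after materialization.
conn.execute("INSERT INTO Movies VALUES ('Manhattan', 'Woody Allen')")

# The stored copy is now stale compared with a fresh evaluation.
cached = conn.execute("SELECT title FROM allen_movies").fetchall()
fresh = conn.execute(
    "SELECT title FROM Movies WHERE director = 'Woody Allen'").fetchall()

# Simplest maintenance strategy: recompute the stored copy from scratch.
conn.execute("DELETE FROM allen_movies")
conn.execute(
    "INSERT INTO allen_movies SELECT title FROM Movies WHERE director = 'Woody Allen'")
refreshed = conn.execute("SELECT title FROM allen_movies").fetchall()

print(cached, fresh, refreshed)
```

Full recomputation is the bluntest option; systems that support materialized views typically offer cheaper incremental maintenance.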
Query Optimization
Goal: Declarative SQL query → Imperative query execution plan
Declarative SQL query:
SELECT P.buyer
FROM Purchase P, Person Q
WHERE P.buyer=Q.name AND Q.city='seattle' AND Q.phone > '5430000'
Imperative query execution plan (operators from top to bottom):
• Projection on buyer
• Selection: City='seattle' AND phone>'5430000'
• Join on Buyer=name (Simple Nested Loops)
• Leaves: Purchase (Table scan), Person (Index scan)
Inputs:
• the query
• statistics about the data (indexes, cardinalities, selectivity factors)
• available memory
Ideally: Want to find best plan. Practically: Avoid worst plans!
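Most systems let you inspect the plan the optimizer picked. As a sketch, SQLite exposes this through EXPLAIN QUERY PLAN. The index name below is an assumption, and the exact plan text varies between SQLite versions, so no particular output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Purchase (buyer TEXT, seller TEXT, store TEXT, product TEXT);
CREATE TABLE Person (name TEXT, phone TEXT, city TEXT);
-- An index the optimizer may choose to use for the join on name.
CREATE INDEX idx_person_name ON Person (name);
""")

query = (
    "SELECT P.buyer FROM Purchase P, Person Q "
    "WHERE P.buyer = Q.name AND Q.city = 'seattle' AND Q.phone > '5430000'"
)

# Ask for the chosen plan instead of running the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan:
    # The last column holds a human-readable description of one plan step.
    print(row[-1])
```

Adding or dropping the index and re-running the sketch is a quick way to watch the plan change while the query text stays the same.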