01 Introduction to Databases (Sept 10)

CS 338: Computer Applications in Business: Databases (Fall 2014) 1 ©1992-2014 by Addison Wesley & Pearson Education, Inc., McGraw-Hill. Some material adapted and modified from Fundamentals of Database Systems (Elmasri et al.) CS 338: Computer Applications in Business: Databases Fall 2014 1 welcome to Eyhab Al-Masri Course Information 2 Term Class No. Section Lectures Location Lectures Time Fall 2014 5712 LEC 001 DWE 1501 Wednesdays 10:00 am - 11:20 am Fridays 10:00 am - 11:20 am Instructor Name Office Email Office Hours Eyhab Al-Masri, PhD DC 2555B [email protected] Wednesday 11:40 am – 1:00 pm TA Name Office Email Office Hours Mina Farid TBA [email protected] TBA Abhishek Singhi TBA [email protected] TBA

Transcript of 01 Introduction to Databases (Sept 10)

  • CS 338: Computer Applications in Business: Databases (Fall 2014)

    ©1992-2014 by Addison Wesley & Pearson Education, Inc., McGraw-Hill. Some material adapted and modified from Fundamentals of Database Systems (Elmasri et al.)

    Welcome to CS 338: Computer Applications in Business: Databases, Fall 2014

    Eyhab Al-Masri

    Course Information

    Term: Fall 2014
    Class No.: 5712
    Section: LEC 001
    Lectures Location: DWE 1501
    Lectures Time: Wednesdays 10:00 am - 11:20 am; Fridays 10:00 am - 11:20 am

    Instructor: Eyhab Al-Masri, PhD
      Office: DC 2555B
      Email: [email protected]
      Office Hours: Wednesday 11:40 am - 1:00 pm

    TA: Mina Farid
      Office: TBA
      Email: [email protected]
      Office Hours: TBA

    TA: Abhishek Singhi
      Office: TBA
      Email: [email protected]
      Office Hours: TBA

    About This Course

    Course Website: UW LEARN Management System, http://learn.uwaterloo.ca

    Textbook: Fundamentals of Database Systems
      Authors: Ramez Elmasri & Shamkant Navathe
      Publisher: Addison-Wesley (2010)
      Edition: 6th (5th edition may also work)
      ISBN: 0136086209

    Course Evaluation

    First midterm exam (Oct 10, in class): 25%
    Second midterm exam (Nov 12, in class): 25%
    Final exam (TBA): 50%
    Clicker correctness: bonus of up to 4%

    Course Content

    Relational database principles
      Introduction to database systems
      Relational data model
      SQL (ad hoc queries)
      Relational algebra
      Views and view management

    Data modeling
      Entity-Relationship (ER) model
      Extended ER model
      Mapping ER models to relational

    DBMS functionality
      DBMS architecture
      Transactions
      Database security and privacy

    Related topics
      Distributed databases
      Data warehouses
      Data analytics

    Assignments

    Three or four assignments throughout the term
      Sample solutions released on due date

    Goal is to give you practice with the material in order to provide self-assessment and guidance

    Assignment performance is not part of evaluation
      You need not polish a submission
      You can work alone or with others
      You can seek help from TAs

    You will have more trouble learning the material (and passing the course) if you do not attempt the assignments

    Clickers

    Wireless student response system

    Active learning lasts longer than passive listening

    Clicker questions allow you to show me what you understand without having to raise your hand and identify yourself

    You can earn up to a 4% bonus for correct answers (based on your best 80%)

    Bookstore: ~$42 (return for ~$20)

    Used: $30 when available

    Be sure to register your clicker here: http://www.student.cs.uwaterloo.ca/~pkates/uw-clicker.html

    Image Source: http://www1.iclicker.com/purchase-response-devices

    Using Your Clicker

    Turn it on: Press the ON/OFF button. A solid blue light should appear next to the top Power button. If your clicker came wrapped in packaging, pull out the small plastic tab on the back to activate the batteries.

    Change the frequency to AA: The instructions on the back of your clicker say "Press and hold the ON/OFF button until the top blue Power light flashes. Enter the 2 letter frequency code (AA)."

    The code for each classroom (AA for DWE 1501) is posted near the podium at the front of the class. When a clicker is turned off, it forgets any changes in frequency, and the clicker frequency is again AA when the clicker is turned on.

    How do I know if my vote has been received? When the receiver acknowledges a vote, the Vote Status light on the clicker (the third light) will flash green for a moment. If it flashes red instead, then either the voting period hasn't started, or the receiver didn't respond to your vote. In the latter case, change the frequency of your clicker if necessary and vote again. Raise your hand for assistance if you don't see a green response.

    Can I change my vote/choice? Yes. While the voting process is active, you can vote as often as you like. Only your last (most recent) choice/vote is recorded.

    Summary

    Be sure to check the course website regularly: http://learn.uwaterloo.ca

    Material builds on itself
      Like other courses in Math
      Initial lectures focus on terminology and background knowledge
      May be an overwhelming amount of detail

    Don't fall behind!
      If you have questions or concerns, please send me an email or talk to any of the TAs
      I am always open to suggestions and recommendations

    Do not cause a distraction

    Why Take a Database Course?

    Database systems are at the core of Computer Science, Scientific Computing, Business, IT, and Engineering, among many other disciplines

    They are important to our society

    Topic is intellectually rich

    Good to have it on your resume

    Easy to construct and use?

    This course is designed primarily to meet the needs of students who are interested in the business or public sector of the economy. The course presents methods used for the storage, selection, and presentation of data.

    Why Take a Database Course? At the core of many disciplines

    In recent years, database systems have become an essential component
      Daily life activities involve some interaction with a database
      Shift in corporate strategies (i.e., from standalone applications to Web applications)

    Need for database technology has increased significantly
      Web: vast pool of information
      Examples: search engines, wikis, Web services, electronic commerce, social networks, etc.

    Why Take a Database Course? They are important to our society

    "Knowledge is power" (Sir Francis Bacon)

    Why Take a Database Course? Topic is intellectually rich

    Information Representation: database modeling/design

    Languages and Data Queries: complex queries

    Concurrency Control: controlling concurrent access to information

    Data Mining: discovering patterns in data

    Data Storage: software & hardware that can fit large amounts of data

    Why Take a Database Course? Good to have it on your resume

    Database systems are applied across many areas
      Electrical Engineering
      Software Development
      Mechanical Engineering
      Computer Science
      Science
      Business
      among many others

    Valuable in situations such as
      Use database (DB) terminology knowledgeably
      Understand DB concepts that arise in the workplace
      Interact with (direct, understand) IT personnel
      Understand technical articles involving DB technology
      Discuss DB concepts in a job interview

    Why Take a Database Course? Easy to construct and use?

    Developing a database may involve a great amount of work

    Good news:
      This is an introductory course
      Course workload is balanced throughout the semester

    Chapter 1: Databases and Database Users

    [Picture: Rice University Data Center]

    What Is a Database?

    Introduction

    Database: collection of related data

    Data: known facts that can be recorded and that have implicit meaning

    A database has the following implicit properties
      Miniworld or universe of discourse (UoD)
        Represents some aspect of the real world
        Changes in the miniworld are reflected in the database
      Logically coherent collection of data with inherent meaning
        A random assortment of data cannot be referred to as a database
      Designed, built, and populated with data for a specific purpose

    The difference between a database and an arbitrary collection of data is that a database relates to the real world

    Introduction

    Examples of large commercial databases
      Amazon.com, eBay, Facebook, Twitter

    Database Management System (DBMS)
      Collection of interrelated data and programs that enable users to create and maintain a database

    DBMS is a general-purpose software system that facilitates the processes of defining, constructing, manipulating, and sharing databases among various users and applications
      Allows multiple users/programs to access and manipulate the DB concurrently
      Protects the DB against unauthorized access and manipulation
      Provides means to evolve the DB and program behaviour as requirements change over time

    Introduction: Basic Functions of DBMS

    Defining a database
      Specify the data types, structures, and constraints of the data to be stored
      Uses a Data Definition Language (DDL)
      Meta-data
        Database definition or descriptive information
        Stored by the DBMS in the form of a database catalog or dictionary

    Constructing a database
      Process of storing data on some storage medium controlled by the DBMS
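
    A minimal sketch of the defining and constructing steps, using Python's built-in sqlite3 module as a stand-in DBMS (an assumption; the slides do not name a product). The STUDENT columns, the constraints, and the sample rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Defining: DDL declares structure, data types, and constraints.
conn.execute("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        class          INTEGER CHECK (class BETWEEN 1 AND 5),
        major          TEXT
    )
""")

# Constructing: store data under the control of the DBMS.
conn.execute("INSERT INTO student VALUES (17, 'Smith', 1, 'CS')")

# The DBMS enforces the declared constraints: a row without a name is refused.
try:
    conn.execute("INSERT INTO student VALUES (8, NULL, 2, 'CS')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Meta-data: SQLite keeps the table definition in its catalog (sqlite_master).
stored_definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'student'"
).fetchone()[0]
student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, student_count)
```

    The constraint lives in the database definition rather than in each program, so every application that inserts students gets the same check.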

    Introduction: Basic Functions of DBMS

    Manipulating a database
      Query and update the database miniworld
      Generate reports
      Uses a Data Manipulation Language (DML)

    Sharing a database
      Allow multiple users and programs to access the database simultaneously

    Introduction: Basic Functions of DBMS

    Populating a database
      Inserting data to reflect the miniworld

    Application program
      Accesses the database by sending queries to the DBMS

    Query
      Causes some data to be retrieved
      e.g., retrieve bank account balance

    Transaction
      May cause some data to be read and some data to be written into the database
      e.g., buying a product, transferring funds
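
    The difference between a query and a transaction can be sketched as follows, again with Python's sqlite3 as an assumed stand-in DBMS. The account table, the helper names get_balance and transfer, and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def get_balance(conn, account_id):
    """Query: only retrieves data."""
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]

def transfer(conn, source, target, amount):
    """Transaction: reads and writes; committed as one unit by `with conn`."""
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
        )
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
        )

transfer(conn, 1, 2, 200)
print(get_balance(conn, 1), get_balance(conn, 2))
```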

    Introduction: Basic Functions of DBMS

    Protecting a database
      System protection against hardware or software malfunction (i.e., crashes)
      Security protection against unauthorized or malicious access

    Maintaining a database
      Allow the system to evolve as requirements change over time

    An Example

    UNIVERSITY database
      Information concerning students, courses, and grades in a university environment

    Data records
      STUDENT
      COURSE
      SECTION
      GRADE_REPORT
      PREREQUISITE

    Define the structure of each type of data record by specifying the data elements to include and the data type for each element
      String (sequence of alphabetic characters)
      Numeric (integer or real)
      Date (year or year-month-day)
      Monetary amount
      etc.

    Very important

    An Example: Simplified Database System Environment

    [Figure labels: Data Definition Language (DDL), Structured Query Language (SQL), Data Manipulation Language (DML)]

    DBMS must ensure that only authorized users access the database.

    To construct the UNIVERSITY database
      Store data to represent each student, course, section, grade report, and prerequisite as a record in the appropriate file

    Define relationships among the records
      Example: the record for Smith in the STUDENT file is related to two records in the GRADE_REPORT file
      These specify Smith's grades in two sections or courses

    Database manipulation involves querying and updating

    An Example

    Examples of queries:

      Retrieve the transcript (a list of all courses and grades) of Smith

      List the names of students who took the section of the Database course offered in fall 2008 and their grades in that section

      List the prerequisites of the Database course
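
    The three queries above might look as follows in SQL, run here through Python's sqlite3 module (an assumed stand-in DBMS). The column names, course numbers, section ids, and sample rows are illustrative assumptions chosen so that each query returns something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, class INTEGER);
    CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT);
    CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT,
                          semester TEXT, year INTEGER);
    CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);
    CREATE TABLE prerequisite (course_number TEXT, prerequisite_number TEXT);

    INSERT INTO student VALUES (17, 'Smith', 1), (8, 'Brown', 2);
    INSERT INTO course VALUES ('CS1310', 'Intro to Computer Science'),
                              ('MATH2410', 'Discrete Mathematics'),
                              ('CS3320', 'Data Structures'),
                              ('CS3380', 'Database');
    INSERT INTO section VALUES (112, 'MATH2410', 'Fall', 2008),
                               (119, 'CS1310', 'Fall', 2008),
                               (135, 'CS3380', 'Fall', 2008);
    INSERT INTO grade_report VALUES (17, 112, 'B'), (17, 119, 'C'), (8, 135, 'A');
    INSERT INTO prerequisite VALUES ('CS3380', 'CS3320'), ('CS3380', 'MATH2410');
""")

# 1. Transcript of Smith: every course taken and the grade received.
transcript = conn.execute("""
    SELECT sec.course_number, g.grade
    FROM student AS s
    JOIN grade_report AS g ON g.student_number = s.student_number
    JOIN section AS sec ON sec.section_id = g.section_id
    WHERE s.name = 'Smith'
    ORDER BY sec.course_number
""").fetchall()

# 2. Students in the fall 2008 section of the Database course, with grades.
database_grades = conn.execute("""
    SELECT s.name, g.grade
    FROM course AS c
    JOIN section AS sec ON sec.course_number = c.course_number
    JOIN grade_report AS g ON g.section_id = sec.section_id
    JOIN student AS s ON s.student_number = g.student_number
    WHERE c.course_name = 'Database' AND sec.semester = 'Fall' AND sec.year = 2008
""").fetchall()

# 3. Prerequisites of the Database course.
prerequisites = [row[0] for row in conn.execute("""
    SELECT p.prerequisite_number
    FROM prerequisite AS p
    JOIN course AS c ON c.course_number = p.course_number
    WHERE c.course_name = 'Database'
    ORDER BY p.prerequisite_number
""")]

print(transcript, database_grades, prerequisites)
```

    Each query follows the relationships among the records: the transcript query walks from STUDENT through GRADE_REPORT to SECTION.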

    An Example

    Examples of updates:

      Change the class of Smith to sophomore

      Create a new section for the Database course for this semester

      Enter a grade of A for Smith in the Database section of last semester

    Queries and updates must be specified precisely in the query language of the DBMS before they can be processed
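
    A sketch of the three updates stated precisely in SQL, using Python's sqlite3 as an assumed stand-in DBMS. The schema, the encoding of sophomore as class 2, and the section ids 135 and 145 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, class INTEGER);
    CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT,
                          semester TEXT, year INTEGER);
    CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);
    INSERT INTO student VALUES (17, 'Smith', 1);
    INSERT INTO section VALUES (135, 'CS3380', 'Fall', 2008);
""")

with conn:  # commit the three updates together
    # Change the class of Smith to sophomore (assumed to be class 2).
    conn.execute("UPDATE student SET class = 2 WHERE name = 'Smith'")
    # Create a new section for the Database course.
    conn.execute("INSERT INTO section VALUES (145, 'CS3380', 'Spring', 2009)")
    # Enter a grade of A for Smith in the earlier Database section.
    conn.execute("INSERT INTO grade_report VALUES (17, 135, 'A')")

smith_class = conn.execute(
    "SELECT class FROM student WHERE name = 'Smith'"
).fetchone()[0]
database_sections = conn.execute(
    "SELECT COUNT(*) FROM section WHERE course_number = 'CS3380'"
).fetchone()[0]
smith_grades = conn.execute(
    "SELECT section_id, grade FROM grade_report WHERE student_number = 17"
).fetchall()
print(smith_class, database_sections, smith_grades)
```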

    An Example: Phases for Designing a Database

    Phases for designing a database:

      Requirements specification and analysis
        Documented in detail

      Conceptual design
        Represented and manipulated using some computerized tools

      Logical design
        Expressed in a data model (e.g., the Relational Data Model)

      Physical design
        Further specifications are provided for storing and accessing the database

    Before Database Approach: Traditional File Approach

    Traditional file processing
      Each user defines and implements the files needed for a specific software application
      The system stores permanent records in various files; it needs different programs to extract records from, and add records to, the appropriate files

    Before Database Approach: Traditional File Approach

    Example
      [Diagram: Registrar Office Users -> Registrar Office Application -> Registrar Office Files]
      [Diagram: Accounting Office Users -> Accounting Office Application -> Accounting Office Files]
      ...

    Disadvantages

    1. Uncontrolled redundancy & data inconsistency
      No form of supervision that can coordinate data operations

    2. Poor enforcement of standards
      Data names, formats, constraints, etc. are not standardized across an organization

    Before Database Approach Traditional File Approach

    Disadvantages

    3. Limited data sharing
      Each application has access only to its own files (i.e., other applications do not have access to these files)

    4. Program-data dependency
      Any change to the structure of a file causes a change in all programs accessing that file
      Descriptions of files and data are embedded within an application

    Also any changes in structure must
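
    Program-data dependency can be illustrated with a small sketch in Python. The fixed-width record layout, the field sizes, and the function name read_record_old_program are hypothetical; the point is only that the file's description is embedded in the program.

```python
import struct

# The record layout is hard-coded in the "program":
# a 30-byte name followed by a 4-byte student number.
OLD_LAYOUT = struct.Struct("<30si")

# Later the file format gains a 10-byte birth date between the two fields.
NEW_LAYOUT = struct.Struct("<30s10si")

def read_record_old_program(raw):
    """Decode one record using the layout the program was written for."""
    name, number = OLD_LAYOUT.unpack(raw)
    return name.rstrip(b"\x00").decode(), number

old_file_record = OLD_LAYOUT.pack(b"Smith", 17)
new_file_record = NEW_LAYOUT.pack(b"Smith", b"1990-01-01", 17)

print(read_record_old_program(old_file_record))

# The unchanged program can no longer read the restructured file.
try:
    read_record_old_program(new_file_record)
    still_works = True
except struct.error:
    still_works = False
print(still_works)
```

    Every program that embeds the old layout would have to be edited and redeployed after such a change.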

    Database Approach

    Database approach
      Overcomes the limitations of the traditional file approach
      Single repository maintains data that is defined once and then accessed by various users

    Main characteristics of the database approach
      1. Self-describing nature of a database system
      2. Insulation between programs and data, and data abstraction
      3. Support of multiple views of the data
      4. Sharing of data and multiuser transaction processing

    Database Approach

    Example
      [Diagram: Registrar Office Users -> Registrar Office Application -> Common Shared Database]
      [Diagram: Accounting Office Users -> Accounting Office Application -> Common Shared Database]
      ...

    A DBMS in the Database Approach allows

      1. Persistence of data
      2. Transaction control
      3. Concurrency control
      4. Recovery control
      5. Querying
      6. Integrity control
      7. Data security
      8. Version control
      9. Performance tuning

    Characteristics of database approach: Self-Describing Nature of a Database System

    Database system contains a complete definition of structure and constraints
      This information (called meta-data) is stored in the DBMS catalog
      e.g., structure of each file, type and storage format of each data item, various constraints on data

    Database catalog used by:
      DBMS software
      Database users who need information about database structure

    DBMS software is not written for a specific database application
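
    As a rough illustration of the self-describing property, the sketch below reads table structure back from the catalog instead of hard-coding it. It assumes SQLite (via Python's sqlite3) as the DBMS; the helper name describe and the two tables are hypothetical.

```python
import sqlite3

def describe(conn, table):
    """Return (column name, declared type) pairs read from the catalog.

    `table` is interpolated into the statement, so pass trusted names only.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_number TEXT PRIMARY KEY, credit_hours INTEGER)")
conn.execute("CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT)")

# The same generic code handles any table because the structure
# comes from the stored meta-data, not from the program.
for table in ("course", "section"):
    print(table, describe(conn, table))
```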

    Characteristics of database approach: Insulation Between Programs and Data

    Database systems have

    Program-data independence
      Structure of data files is stored in the DBMS catalog separately from access programs
      Changes made to the structure of the database do not necessarily require changes to the programs (e.g., adding new columns)

    Program-operation independence
      Some types of database systems enable definitions of operations on data as part of database definitions
      Operations are specified in two parts:
        Interface (or signature) includes the operation name and the data types of its arguments
        Implementation (or method) can be changed without affecting the interface
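
    Program-data independence can be sketched with Python's sqlite3 (an assumed stand-in DBMS): an existing access program keeps working after a column is added. The table, the birth_date column, and the function student_names are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (17, 'Smith')")

def student_names(conn):
    """Existing 'program': refers to data by name, not by storage layout."""
    return [row[0] for row in conn.execute("SELECT name FROM student ORDER BY name")]

before = student_names(conn)

# The structure changes: only the catalog entry for STUDENT is updated.
conn.execute("ALTER TABLE student ADD COLUMN birth_date TEXT")

after = student_names(conn)  # same program, no modification needed
print(before, after)
```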

    The characteristic that allows program-data independence and program-operation independence is called data abstraction

    DBMS provides users with a conceptual representation of data
      Does not include many of the details of how data is stored or how operations are implemented

    Data model
      Type of data abstraction used to provide the conceptual representation
      Uses logical concepts such as objects, their properties, and their interrelationships
      Hides implementation and storage details

    Characteristics of database approach: Support of Multiple Views of the Data

    View
      Subset of the database, or
      Contains virtual data that is derived from the database files but is not explicitly stored

    Multiuser DBMS
      Users have a variety of distinct applications
      Must provide facilities for defining multiple views

    Can also limit the number of users allowed to access data

    Ex. APIs, such as eBay's API, which allows third-party plug-ins to use eBay's software without access to the code itself

    Create as many views as you wish
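
    Both kinds of view can be sketched in SQL through Python's sqlite3 (an assumed stand-in DBMS). The view names student_directory and sections_taken, the schema, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, class INTEGER);
    CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);
    INSERT INTO student VALUES (17, 'Smith', 1), (8, 'Brown', 2);
    INSERT INTO grade_report VALUES (17, 112, 'B'), (17, 119, 'C'), (8, 135, 'A');

    -- A view that is a subset of the database: no class, no grades.
    CREATE VIEW student_directory AS
        SELECT student_number, name FROM student;

    -- A view containing virtual data: the count is derived, never stored.
    CREATE VIEW sections_taken AS
        SELECT s.name, COUNT(*) AS section_count
        FROM student AS s
        JOIN grade_report AS g ON g.student_number = s.student_number
        GROUP BY s.student_number, s.name;
""")

directory_columns = [
    column[0] for column in conn.execute("SELECT * FROM student_directory").description
]
counts = dict(conn.execute("SELECT name, section_count FROM sections_taken"))

# Derived data follows the base tables automatically.
conn.execute("INSERT INTO grade_report VALUES (8, 102, 'B')")
updated_counts = dict(conn.execute("SELECT name, section_count FROM sections_taken"))
print(directory_columns, counts, updated_counts)
```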

    Characteristics of database approach: Sharing of Data & Multiuser Transaction Processing

    Allow multiple users to access the database at the same time
      Example: assigning seats for airline reservation systems

    Concurrency control software
      Ensures that several users trying to update the same data do so in a controlled manner
      Result of the updates is correct

    Online transaction processing (OLTP) application
      Ensures that concurrent transactions operate correctly and efficiently

    Transaction
      Definition: executing program or process that includes one or more database accesses (e.g., reading or updating records)

    DBMS must enforce some properties
      Isolation property
        Each transaction appears to execute in isolation from other transactions
      Atomicity property
        Either all the database operations in a transaction are executed or none are
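
    The atomicity property can be sketched with Python's sqlite3 (an assumed stand-in DBMS), borrowing the airline seat example. The flight and booking tables, the flight number, and the function book_seat are hypothetical. The sketch uses a single connection, so it does not demonstrate isolation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE flight (
        flight_no  TEXT PRIMARY KEY,
        seats_left INTEGER NOT NULL CHECK (seats_left >= 0)
    )
""")
conn.execute("CREATE TABLE booking (flight_no TEXT NOT NULL, passenger TEXT NOT NULL)")
conn.execute("INSERT INTO flight VALUES ('AC101', 1)")  # one seat remaining
conn.commit()

def book_seat(conn, flight_no, passenger):
    """Record a booking and take a seat: both happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO booking VALUES (?, ?)", (flight_no, passenger))
            conn.execute(
                "UPDATE flight SET seats_left = seats_left - 1 WHERE flight_no = ?",
                (flight_no,),
            )
        return True
    except sqlite3.IntegrityError:
        # seats_left would drop below 0; the booking row is rolled back too.
        return False

results = [book_seat(conn, "AC101", "Ann"), book_seat(conn, "AC101", "Bob")]
passengers = [row[0] for row in conn.execute("SELECT passenger FROM booking")]
print(results, passengers)
```

    The second booking fails on the seat update, and the rollback also removes the booking row inserted a moment earlier, so no passenger is recorded without a seat.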

    Data Centers

    [Picture from Rice University: Data Center]

    Data Centers

    [Picture from Microsoft: Data Center (Ireland)]

    [Picture from Google: Council Bluffs (Iowa)]

    Data Centers

    Interested in getting more information about current Google data centers? Check http://www.google.com/about/datacenters/