INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore...
-
Upload
jeremy-malone -
Category
Documents
-
view
216 -
download
0
Transcript of INFO 340 Lecture 2 Intro to Databases. Book –Need it –Order individually through UW Bookstore...
INFO 340
Lecture 2
Intro to Databases
• Book– Need it– Order individually through UW Bookstore– Or Amazon
Previous Class
• How many cards did your system require total?
• How difficult was it to answer the standard queries?– The ad-hoc queries?
What if -- iTunes Store via File Based Approach
• Imagine all details of a song encapsulated in a single file (or nest of files)• Problems
– Difficulty in accessing data• We need that particular application to access the data
– Duplication of data • do we want a separate copy of the album cover for every single song ?
– Data dependence • What if we want to get at the same data, but with a different application?
– File format incompatibilities• What if the applications use different file formats? Ie one uses text, another
XML, another binary, etc– Fixed queries
• Largely dependent on app developer– Results in fixed (non- ‘ad-hoc’) queries and/or– Multiple applications
Limitations of Files
• Separation and isolation of data• Duplication of data• Data Dependence• Incompatible file formats• Fixed queries / proliferation of apps• Two main problems emerge:
– Data definition is embedded in application– No control over access and data manipulation.
Introducing Databases
• Solves many problems introduced in a file system.
• Multi-user, simultaneous connections
• Security
• Separation of application and data
• Common query language
• One copy of data
Database Def (pg 15)
A shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization
Types of databases
• Relational (we’ll focus on this type in this class)
• Network and Hierarchical model
• Object Oriented
Database are “Self Describing”
• System catalog• Data dictionary• Meta-data
It is this local or in-place description of the data that allows “program-data independence”
(BTW good quiz item)
Data abstraction
• Similar to programming classes– You can change the internal coding to the
object (instantiated class) as long as the external behavior (behavior presented to the outside world) doesn’t change
• Similarly, in databases, we can separate the data from the application
Logically related
• Entities -- distinct object
• Attributes -- describes some aspect of the object
• Relationships -- associations between the entities
DBMSDatabase Management System
• Provides facility to define the database via a Data Definition Language -- DDL
• Provides facility to store, retrieve, & manipulate data through Data Manipulation Language– SQL (Structured Query Language) is the standard– Note SQL & MySQL are 2 different things
• Controlled access to database:– Authorization & possibly authentication– Control of integrity/consistency– Handle multiple simultaneous users (concurrency)– Handle recovery -- return to a known, consistent
state– User-accessible catalog
DBMS
Database Application Program
• Program that interacts with a database.• Database can expose Views to the application.
Views help to– Present the data to the user in a useful format– Choose who sees what
• ie not all should see payroll
– Customize data presentation• Same db can have many different looks/appearances• Generally better than having multiple databases
– Allows db to hide any evolutionary structural changes to the database ‘behind the scenes’
Three (or four) levels of the DBMS
Database Design
• Data structure first & application second– Data structure will be the basis– Data structure is also harder to change on
the fly– At the design phase, want to think it
through as much as possible– Data design decisions can make
application design easier (or harder)
• Data Administrator -- “Data Monk”• Database Administrator• Logical Database Designer
– Business rules• Ie only HR or finance person can see payroll
• Physical Database Designer• Application developers• User types
– Naïve– Sophisticated
Database Design
DBMS Advantages
• Control (though not elimination) of redundancy• Consistency - reduce points of
update/modification– Caused by ‘same’ data existing in multiple places
• Data sharing• Integrity - use of constraints• Security - users, roles, privileges• Standardization forced• Concurrency• Backup & recovery
DBMS Disadvantages
• You pay for robustness
• Complexity - Lots of forethought
• Cost (upfront, but generally not over time)
• Higher impact of failure - centralization
• Conversion cost
Schema & Instance
Schema -- Description of the database
Instance -- A functioning database at any point in time
-- sometimes refers to a particular running installation
DDL & DML
• DDL– Language to specify the db
• System uses sets of tables itself to support user tables/schema
– System catalog– Data dictionary
• DML– Language to update the db & retrieve data– Procedural
• How– Non-Procedural
• What• SQL
DBMS Functions
• Data storage, retrieval, update• User-accessible catalog
– Data about data -- metadata– System catalog
• Transaction support -- ie ‘rollback’– Failure of db, network, etc before operation is complete
• Concurrency support• Authorization services• Data comm support• Integrity services• Utility services - imports, exports, analysis, monitoring/alerts
Some Behind the Scenes Processes of DBMS
• Query Processor• File Manager• DML preprocessor• DDL compiler - creates metadata tables• Authorization control• Integrity checker• Query optimizer• Transaction processor• Recovery manager
Entity Relationship Modeling/Entity Relationship Diagram (ERD) Introduction
Example - Used car business with multiple lots, multiple cars, multiple sales people
car
salesperson
Sales lots
color make
year
Engine sizename
address
Phone number
location
size
Lighted?
has
M
1
has
M
1
In-class work
• Groups of 5• Create ERD for following 3 scenarios
– iTunes (w/customers, songs, artists)– Live inventory (w/computers, users, & ‘on-
net’ snapshots)– Hacker attack incident database
• Attacker, victim, locations, computers (addresses, OS’s, other pertinent info,etc)
• Present your results for each
Assignment
• Read Chapter 3-4 of Connolly & Begg