Database Management Infsy 540 Dr. R. Ocker. Importance of data & information n Data and information...

Database Management

Infsy 540

Dr. R. Ocker

Importance of data & information

Data and information are corporate resources which must be managed.

Data Bases

The management of the information resource is now handled through data bases

We will cover how information used to be managed (file environments) and how it has progressed to its current state

Hierarchy of data organization in computer storage: bit, byte, field, record, file, database

Hierarchy of Data Organization in Computer Storage

Figure: components of data organization, grouped into logical components and physical (storage) components, with an example of each.

Database: SUPPLIERS, PARTS, SHIPMENTS

File: SUPPLIERS

NO.   NAME            STREET ADDRESS  CITY     ST  ZIP
13    Gasket Co.      50 Oak          Tifflin  OH  44883
3251  Reliable Supp.  11 Cedar        Teaneck  NJ  07666

Record: 13 Gasket Co. 50 Oak Tifflin OH 44883

Field (attribute): Reliable Suppliers

Byte: 01000001 (represents “A” in the ASCII-8 character code)

Bit: 0
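The hierarchy can be sketched in a few lines of Python. This is only an illustration: the supplier values come from the example above, while the dictionary keys and the empty PARTS and SHIPMENTS files are placeholders assumed for the sketch.

```python
# Bit -> byte -> field -> record -> file -> database.

# A byte is eight bits; 01000001 is decimal 65, the code for "A".
bits = "01000001"
byte_value = int(bits, 2)
char = chr(byte_value)

# A field is a group of bytes (characters) describing one attribute.
name_field = "Gasket Co."
name_bytes = name_field.encode("ascii")

# A record groups the related fields for one supplier.
record = {"no": 13, "name": "Gasket Co.", "street": "50 Oak",
          "city": "Tifflin", "st": "OH", "zip": "44883"}

# A file is a collection of records of the same type.
suppliers_file = [
    record,
    {"no": 3251, "name": "Reliable Supp.", "street": "11 Cedar",
     "city": "Teaneck", "st": "NJ", "zip": "07666"},
]

# A database groups related files (PARTS and SHIPMENTS left empty here).
database = {"SUPPLIERS": suppliers_file, "PARTS": [], "SHIPMENTS": []}
```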

Traditional file environment

most orgs. began information processing on a small scale - automating one application at a time

systems grew independently - not according to a grand plan

typically, each division developed its own applications


within each division

– each functional area developed systems in isolation from other functional areas

– accounting, finance, manufacturing etc. all developed their own systems and data files

Traditional file processing

– encourages each functional area in a corporation to develop specialized applications

each application requires a unique data file

– probably a subset of a larger master file

subsets of the master file lead to data redundancy, processing inflexibility and wasted storage resources

these “islands of information systems” made it difficult to integrate information

Problems with traditional file environment

1. data redundancy

– presence of duplicate data in multiple files

– error prone

2. lack of flexibility

– traditional file system cannot deliver ad hoc reports

– information needed for ad hoc reports is somewhere in the system, but too difficult/expensive to retrieve easily

Problems with traditional file environment

3. poor security

– because there is little control or management of data, access to and dissemination of information is not controlled

4. lack of data sharing and availability

– due to lack of control over the data resource, it is not easy to share data; pieces of information are in different files in different parts of the organization
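The first problem, data redundancy, can be illustrated with a small sketch. The two departmental files, the customer data and the function names are all hypothetical; the point is only that duplicated data can drift apart when each file is maintained separately.

```python
# Hypothetical departmental files, each keeping its own copy of customer data.
accounting_file = [{"cust_no": 7, "name": "Acme", "city": "Tiffin"}]
sales_file = [{"cust_no": 7, "name": "Acme", "city": "Tiffin"}]

def update_city(file_records, cust_no, new_city):
    """Update the city for one customer in a single departmental file."""
    for rec in file_records:
        if rec["cust_no"] == cust_no:
            rec["city"] = new_city

# Accounting learns of a move; nothing forces sales to apply the same change.
update_city(accounting_file, 7, "Teaneck")

def inconsistent_customers(file_a, file_b):
    """Return customer numbers whose duplicated data differ between two files."""
    by_no = {rec["cust_no"]: rec for rec in file_b}
    return [rec["cust_no"] for rec in file_a
            if rec["cust_no"] in by_no and rec != by_no[rec["cust_no"]]]

conflicts = inconsistent_customers(accounting_file, sales_file)
```

After the update, customer 7 has two different cities depending on which file is read, which is the error-prone situation described above.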

File environment

Data and the programs that use them are highly interdependent

DATA BASE Environment

data base - consists of data elements and the relationships between them

it is a collection of data organized to

– service many applications at the same time

– by storing and managing data so that they appear to be in one location

DATA BASE Environment

DBMS - database management system

special software to create and maintain a database and allow individual business applications to extract the data they need without having to create separate files

DBMS promotes independence between data, programs, and the database

Logical and Physical views of Data

DBMS separates the logical and physical views of data

logical view - presents data as they would be perceived by end users or business specialists

physical view - shows how data are actually organized and structured on physical storage media (e.g. within the database)

DBMS and data definition

DBMS enables us to define a database on 3 levels

– subschema

– schema

– physical schema

Schema

since there are many different pieces of data, we need a map showing how the data are associated

map sometimes called a data model or schema

complete logical view of the database

Schema

logical description of an entire database

shows the relationships among the data

– chart of types of data that are used

– gives names of groups and relationships between them

– framework

Subschema

logical description of the part of a database required by a particular function or application program

application programmer/user does not need to know about the entire database schema

neither the schema nor the subschema reflects the way the data are stored physically

3 data descriptions

1. Subschema

– a chart of a portion of the data which is oriented to the needs of one or more programmers/users

2. Schema

– a chart of the entire logical data base; overall view as seen by database administrator

3. Physical schema

– chart of the physical layout of data on storage devices; view seen by systems programmers and systems designers
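One way to picture the schema and subschema is with Python's built-in sqlite3 module, treating the table definition as the schema and a view as a subschema. This is a sketch under assumptions: the employee table, its columns and the directory view are hypothetical, and a view is only one possible way to realize a subschema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the complete logical description (hypothetical table and columns).
# Subschema: the part of the database that one application needs; a
# phone-directory program sees names and departments, not salaries.
conn.executescript("""
CREATE TABLE employee (
    emp_no INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    dept   TEXT NOT NULL,
    salary REAL NOT NULL
);
CREATE VIEW directory AS SELECT emp_no, name, dept FROM employee;
""")
conn.execute("INSERT INTO employee VALUES (1, 'Lee', 'Sales', 52000.0)")

# Read through the subschema and list the columns it exposes.
cursor = conn.execute("SELECT * FROM directory")
directory_columns = [col[0] for col in cursor.description]
directory_rows = cursor.fetchall()
```

Neither the table definition nor the view says anything about how the rows are laid out on disk; that physical layout is left to the database engine.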

Advantages of DBMS

reduces complexity by central management of data, access, utilization and security

data redundancy and inconsistency can be reduced by eliminating files in which the same data elements are repeated

program-data independence -- programs can be written independently of the physical layout of the data; if physical layout of data changes, applications are unaffected

Advantages of DBMS

program development and maintenance costs can be reduced

flexibility of IS can be enhanced by permitting rapid and inexpensive ad hoc queries

access and availability of info. can be increased
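Two of these advantages, ad hoc queries and program-data independence, can be sketched with sqlite3. The sales table and its data are hypothetical, and adding an index stands in for "a change to the physical layout"; a real DBMS may offer many other kinds of physical change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("West", 250.0), ("East", 50.0)])

# An ad hoc question, written as a query instead of a new program and file.
query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
before = conn.execute(query).fetchall()

# A physical change: add an index. The application's query text is unchanged.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
after = conn.execute(query).fetchall()
```

The query returns the same totals before and after the index is created, so the application does not need to be rewritten when the storage structure changes.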

data base models

data model - method for organizing databases on the conceptual level

different types of database models - hierarchical, network, relational, object-oriented

Relational Model

data files are represented as tables

– rows and columns

– called relations

each relation is given a name

column

– called an attribute and given a name

row

– called a tuple

– contains data

Tables/Relations

table rows are records for individual entities

table columns are fields of the records

– describe the attributes of the entities

Relations

different subjects/topics are stored in separate tables

– e.g. employee table and sales table

instance of a relation

– the content of the relation at a particular instant in time

Properties of relations

There is one column in the relation for each attribute of the relation.

Each column is given a name that is unique in the relation.

The order of the columns or attributes in the relation has no significance

The order of the rows is not significant. There cannot be any duplicate rows.
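These properties can be checked with a small model in plain Python, where a relation is a set of rows and each row is a set of (attribute, value) pairs. The helper name relation is hypothetical and this is a teaching sketch, not how a DBMS stores relations.

```python
def relation(rows):
    """Model a relation as a set of rows; each row is a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)

r1 = relation([
    {"no": 13, "name": "Gasket Co."},
    {"no": 3251, "name": "Reliable Supp."},
])

# Same content with rows reordered, columns reordered, and one row repeated.
r2 = relation([
    {"name": "Reliable Supp.", "no": 3251},
    {"name": "Gasket Co.", "no": 13},
    {"name": "Gasket Co.", "no": 13},
])

same_relation = (r1 == r2)   # ordering of rows and columns does not matter
row_count = len(r2)          # the repeated row is stored only once
```

Because dictionary keys are unique, each attribute name can appear only once per row, which mirrors the rule that column names are unique within a relation.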

Relation Key

relation key

– an attribute or set of attributes that uniquely identifies tuples (rows) in a relation.

A relation key is formally defined as a set of one or more relation attributes joined together.

all records in a table must have a unique primary key - can be a group of attributes combined to form a unique identifier
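A key made of a group of attributes can be sketched with sqlite3. The shipments table and its columns are hypothetical; the example assumes that neither the supplier number nor the part number alone identifies a row, so the two together form the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical SHIPMENTS table keyed on the combination (supplier_no, part_no).
conn.execute("""
CREATE TABLE shipments (
    supplier_no INTEGER NOT NULL,
    part_no     INTEGER NOT NULL,
    quantity    INTEGER NOT NULL,
    PRIMARY KEY (supplier_no, part_no)
)""")
conn.execute("INSERT INTO shipments VALUES (13, 1, 100)")
conn.execute("INSERT INTO shipments VALUES (13, 2, 40)")  # same supplier, other part

try:
    conn.execute("INSERT INTO shipments VALUES (13, 1, 999)")  # repeats the key
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM shipments").fetchone()[0]
```

The third insert is refused because its key values already exist, so every row in the table remains uniquely identifiable.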

Designing a Relational Database

to design a database

– break down the information you want to keep as separate subjects

– then determine how the subjects are related to each other.

Steps

1. determine the purpose of the database

– this will determine what information you want from the database.

– From this, you can determine what subjects you need to store facts about (the tables) and what facts you need to store about each subject (the fields/attributes of the tables)

Steps

2. determine the tables

– can be the trickiest step in the database design process.

– The reports you want to print, the forms you want to use, and the questions you want answered don't necessarily provide clues about the structure of the tables that produce them.

– These things tell you what you want to know, but not how to categorize the information into tables.

Steps

3. determine the fields (attributes)

– decide what you need to know about the people, things, or events recorded in the table.

– Fields/attributes - describe characteristics of the subject of the table.

– Each record/row in the table contains the same set of fields

– Each field in the table should relate directly to the subject of the table.

Steps

4. Determine the primary key

– the power of a relational database management system comes from its ability to quickly search for, find and bring together information stored in separate tables.

– To do this, each table in the database should include a field or set of fields that uniquely identifies each individual row in the table.

– This is called a primary key.

Steps

5. determine the relationships

– Once information has been divided into tables, you need a way to bring it back together again in meaningful ways.

– You create relationships between the tables in your database.

– The DBMS uses the relationships to find associated information stored in the tables.

Steps

To set up a relationship between two tables - Table A and Table B

– you add one table's primary key to the other table, so that it appears in both tables.
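Setting up a relationship this way can be sketched with sqlite3. The suppliers and shipments tables are hypothetical; the supplier's primary key, supplier_no, is added to the shipments table so that it appears in both, and a join uses it to bring the information back together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE suppliers (
    supplier_no INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE shipments (
    shipment_no INTEGER PRIMARY KEY,
    supplier_no INTEGER NOT NULL REFERENCES suppliers (supplier_no),
    quantity    INTEGER NOT NULL
);
INSERT INTO suppliers VALUES (13, 'Gasket Co.'), (3251, 'Reliable Supp.');
INSERT INTO shipments VALUES (1, 13, 100), (2, 3251, 40), (3, 13, 25);
""")

# The DBMS follows the shared column to find associated rows in both tables.
joined = conn.execute("""
    SELECT s.name, h.quantity
    FROM shipments AS h
    JOIN suppliers AS s ON s.supplier_no = h.supplier_no
    ORDER BY h.shipment_no
""").fetchall()
```

Entering a few rows and running a join like this is also a simple way to test a design before refining it.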

Steps

6. refine your design

test your design by entering some records in each table - look at relationships.

Fix database where necessary.

Trends in Database Management

– Distributed databases

– data warehouses

– data mining

Distributed processing and distributed data bases

distributed processing

– the distribution of computer processing

– among multiple geographically or functionally separate locations

– linked by a communications network

distributed database - one that is stored in more than one physical location

Figure: A System with a Distributed Database. Three sites (Site 1, Site 2, Site 3) and three database fragments (Fragment 1, Fragment 2, Fragment 3) are linked by a telecommunications network. Users have access to the entire database over the network.

Distributed Data Bases

data are placed where they are used most often, but the entire database is available to each user

enables the structure of the database to mirror the structure of the organization

– traffic on the network is lessened because data are maintained where they are used the most
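The idea of keeping data where they are used most, while still making the whole database available, can be sketched in plain Python. The site names, customer rows and the lookup function are hypothetical, and a real distributed DBMS would use a catalog to locate fragments instead of asking each site in turn.

```python
# Hypothetical fragments: each site stores the customer rows it uses most.
fragments = {
    "site1": {101: "Acme"},
    "site2": {202: "Birch Ltd."},
    "site3": {303: "Cedar Inc."},
}

# Counts lookups that had to leave the local site.
stats = {"network_requests": 0}

def lookup(local_site, cust_no):
    """Find a customer from any site, trying the local fragment first."""
    if cust_no in fragments[local_site]:
        return fragments[local_site][cust_no]
    for site, rows in fragments.items():
        if site != local_site:
            stats["network_requests"] += 1  # one request per remote site consulted
            if cust_no in rows:
                return rows[cust_no]
    return None

local_hit = lookup("site1", 101)   # served locally, no network traffic
remote_hit = lookup("site1", 303)  # found at another site over the network
```

Only the second lookup generates network requests, which is why placing data at the site that uses them most reduces traffic.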

Data warehouses

Subject-oriented, integrated collection of data, both internal and external

data accumulated over time

maintained to support decision making

Data warehouses

Objective

– to continually select data from operational databases

– transform the data into a uniform format

– open warehouse to end users through an easy-to-use interface
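The select-and-transform part of this objective can be sketched in plain Python. The two operational systems, their field names and formats, and the warehouse row layout are all hypothetical; the sketch only shows differently formatted data being brought into one uniform format.

```python
from datetime import datetime

# Hypothetical extracts from two operational systems with different formats.
orders_system = [{"cust": "acme", "total": "1,200.50", "date": "03/15/2024"}]
billing_system = [{"customer_name": "Birch Ltd.", "amount_cents": 45000,
                   "billed_on": "2024-03-16"}]

def from_orders(row):
    """Convert an orders-system row to the uniform warehouse format."""
    return {
        "customer": row["cust"].title(),
        "amount": float(row["total"].replace(",", "")),
        "date": datetime.strptime(row["date"], "%m/%d/%Y").date().isoformat(),
        "source": "orders",
    }

def from_billing(row):
    """Convert a billing-system row to the uniform warehouse format."""
    return {
        "customer": row["customer_name"],
        "amount": row["amount_cents"] / 100,
        "date": row["billed_on"],
        "source": "billing",
    }

# The warehouse holds rows from both systems in one format.
warehouse = ([from_orders(r) for r in orders_system]
             + [from_billing(r) for r in billing_system])
```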

Power of data warehouses

Offer users analytical tools such as decision support systems and on-line analytical processing for data mining

Data mining

Automated discovery of potentially significant relationships among various categories of data

use specialized software

e.g. an insurance company discovers the best predictors of the frequency of a certain type of claim
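The insurance example can be sketched as a toy calculation in plain Python. The policy records, the two candidate predictors and the "spread in claim rate" measure are all assumptions made for illustration; specialized data mining software uses far larger data sets and stronger statistical methods.

```python
from collections import defaultdict

# Hypothetical policy records: two candidate predictors and a claim flag.
policies = [
    {"age_band": "under25", "region": "north", "claim": 1},
    {"age_band": "under25", "region": "south", "claim": 1},
    {"age_band": "under25", "region": "north", "claim": 0},
    {"age_band": "over25",  "region": "south", "claim": 0},
    {"age_band": "over25",  "region": "north", "claim": 0},
    {"age_band": "over25",  "region": "south", "claim": 0},
]

def claim_rates(records, attribute):
    """Return the claim rate for each value of the given attribute."""
    counts = defaultdict(lambda: [0, 0])  # value -> [claims, policies]
    for rec in records:
        counts[rec[attribute]][0] += rec["claim"]
        counts[rec[attribute]][1] += 1
    return {value: claims / total for value, (claims, total) in counts.items()}

def rate_spread(records, attribute):
    """Difference between the highest and lowest claim rate across the groups."""
    rates = claim_rates(records, attribute).values()
    return max(rates) - min(rates)

# Pick the attribute whose groups differ most in claim rate.
best_predictor = max(["age_band", "region"],
                     key=lambda a: rate_spread(policies, a))
```

In this made-up data the claim rate differs sharply by age band and not at all by region, so age band is reported as the better predictor.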
