HAP 709: Healthcare Databases Introduction to Database Structures Janusz Wojtusiak, Ph.D. Fall 2010...


HAP 709: Healthcare Databases

Introduction to Database Structures

Janusz Wojtusiak, Ph.D.

Fall 2010

Based on slides by:

Farrokh Alemi, Ph.D.

Francesco Loaiza, Ph.D., J.D.

What is a database?

• Is an Excel table with students’ grades a database?

• Is your notebook a database?

• Is a phonebook a database?

• Is the GMU schedule of classes a database?

• Is a medical record of a patient a database?

• Is a list of nurses working in a hospital a database?

What is a database?

• A database is a collection of data with a defined structure and purpose.

• Wikipedia: A Database is a structured collection of data which is managed to meet the needs of a community of users.

• WordNet: A database is an organized body of related information.

What is a computer database?

• A computer database is a database stored in a computer.

• It is usually managed by special software called a Database Management System (DBMS).

• Many DBMSs are available:

– Access, Oracle, MUMPS, dBASE, PostgreSQL, SQL Server, MySQL, DB2, …

Objectives of this lecture

• Learn about flat, hierarchical, relational, and object-oriented databases

• Learn about information-less databases

If checking a single information item takes a fraction of a second, how is it that we can search through billions of information items in a fraction of a second?

Types of Data Structures

• Flat data

• Hierarchical data

• Relational data

• Object-oriented data

Flat Models

Student ID  Name            Midterm grade  Final grade  Address            Zip code  ...

4561        Ali Safaie      B              A            1311 Manor Park    22101     ...

7878        Mike Smith      C              B            1619 Ozkan Street  44115     ...

8954        Mike Smith Jr.  A              C            2121 Euclid 563    22101     ...

Flat Data

• How do we keep two addresses for the same student?

• What if there are five addresses?

Flat Data

Advantages

• Most software includes free access to flat data files. For a small number of cases, flat databases do a reasonably fast job.

• Most analytical software uses flat data.

Disadvantages

• Flat databases waste computer storage by requiring it to keep information on items that logically cannot be available.

• It is almost impossible to design flat models for things with varying numbers of properties.

• Flat databases are not conducive to complicated search queries.
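The storage and design problems above can be sketched in a few lines of Python. This is only an illustration: the column names and the `address_1`/`address_2` layout are assumptions, not something the slides define. It shows that a flat layout must reserve a cell for every possible address in every row, and that supporting one more address means widening every row.

```python
# Hypothetical flat layout: every row carries the same fixed columns.
COLUMNS = ["student_id", "name", "midterm", "final", "address_1", "address_2"]

flat_rows = [
    (4561, "Ali Safaie", "B", "A", "1311 Manor Park", None),
    (7878, "Mike Smith", "C", "B", "1619 Ozkan Street", None),
    (8954, "Mike Smith Jr.", "A", "C", "2121 Euclid 563", None),
]


def count_empty_cells(rows):
    """Count cells that exist only because the fixed layout demands them."""
    return sum(1 for row in rows for cell in row if cell is None)


def add_address_column(columns, rows):
    """Supporting one more address widens the header and every single row."""
    next_number = sum(c.startswith("address_") for c in columns) + 1
    new_columns = columns + [f"address_{next_number}"]
    new_rows = [row + (None,) for row in rows]
    return new_columns, new_rows
```

With five addresses per student, most of the address cells would stay empty for most students, which is the wasted storage the slide refers to.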

Hierarchical models

Data models in which the relationships between higher- and lower-level items are inherited.

Example of Hierarchical Model

Person

Employee Patient

Contractor

Admin

Clinical

ICU

Clinic

Advantages and Disadvantages of Hierarchical Models

Advantages

• Operations on parents save time and affect all children.

Disadvantages

• Many relationships are not hierarchical
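A minimal sketch of why operations on a parent reach all of its children. The tree layout below is an assumption: the diagram itself does not survive in the text, so placing Contractor, Admin, and Clinical under Employee, and ICU and Clinic under Patient, is a guess made only for illustration.

```python
# Assumed layout of the example hierarchy (the original diagram is not
# recoverable from the text, so this nesting is hypothetical).
hierarchy = {
    "Person": {
        "Employee": {"Contractor": {}, "Admin": {}, "Clinical": {}},
        "Patient": {"ICU": {}, "Clinic": {}},
    }
}


def descendants(tree, name):
    """Return every node below `name`, in depth-first order."""

    def find(subtree):
        # Locate the children of the node called `name`, if it exists.
        for key, children in subtree.items():
            if key == name:
                return children
            found = find(children)
            if found is not None:
                return found
        return None

    def walk(children):
        # Yield each child, then everything below it.
        for key, grandchildren in children.items():
            yield key
            yield from walk(grandchildren)

    start = find(tree)
    return list(walk(start)) if start is not None else []
```

An operation addressed to "Employee" can then be applied to `descendants(hierarchy, "Employee")` in one pass, without naming each child.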

Relational Databases

In a relational database, tables do not need to be of the same size.

In a relational database, one stores a record with related fields as data.

Example

Table for "Students' grades"

Student ID (key column)  Name            Mid-term  Final

4561                     Ali Ghadiri     B         A

7878                     Mike Smith      C         B

8954                     Mike Smith Jr.  A         C

Table for "Students' contact information"

Student ID  Address            Zip

8954        2121 Euclid 563    22101

4561        1311 Manor Park    22101

7878        1619 Ozkan Street  44115
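The two tables can be sketched with Python's built-in `sqlite3` module. The table and column names (`grades`, `contacts`, and so on) are assumptions chosen for this illustration, and the second address for student 7878 is made up to show that a student can have any number of contact rows without changing the grades table.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (
    student_id INTEGER PRIMARY KEY,   -- the key column
    name       TEXT,
    midterm    TEXT,
    final      TEXT
);
CREATE TABLE contacts (
    student_id INTEGER REFERENCES grades(student_id),
    address    TEXT,
    zip        TEXT
);
""")

conn.executemany("INSERT INTO grades VALUES (?, ?, ?, ?)", [
    (4561, "Ali Ghadiri", "B", "A"),
    (7878, "Mike Smith", "C", "B"),
    (8954, "Mike Smith Jr.", "A", "C"),
])
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?)", [
    (8954, "2121 Euclid 563", "22101"),
    (4561, "1311 Manor Park", "22101"),
    (7878, "1619 Ozkan Street", "44115"),
    (7878, "100 Example Road", "00000"),  # made-up second address
])


def addresses_for(student_id):
    """Join the two tables on the key column and list a student's addresses."""
    rows = conn.execute(
        "SELECT c.address FROM grades g "
        "JOIN contacts c ON c.student_id = g.student_id "
        "WHERE g.student_id = ? ORDER BY c.rowid",
        (student_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the tables are linked only through the student ID, the contact table can hold more rows than the grades table, which is why the tables do not need to be the same size.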

Advantages of Relational Databases

• Data can be examined from many different perspectives. 

• No need to enter missing information for variables that are not logically possible.

• Easy to modify, because adding new concepts involves adding new tables, not altering old ones.

Object-oriented data models

Data are organized in the form of “objects” that represent real-world entities. Each object has its own properties, which can be regular values or other objects.
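A minimal sketch of this idea using Python dataclasses. The class and field names are assumptions for illustration; the point is only that a property of one object (`addresses`) can itself hold other objects rather than plain values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    """A real-world entity modeled as its own object (hypothetical fields)."""
    street: str
    zip_code: str


@dataclass
class Student:
    """A student whose properties mix regular values and other objects."""
    student_id: int                     # regular value
    name: str                           # regular value
    addresses: List[Address] = field(default_factory=list)  # other objects


student = Student(4561, "Ali Ghadiri")
student.addresses.append(Address("1311 Manor Park", "22101"))
```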

Advantages and Disadvantages of Object-oriented models

Advantages

• High efficiency

• Use of the actual “real life” entities as objects

• Integration with object-oriented programming languages (C++, Java, C# …)

Disadvantages

• Lack of one good standard

Distributed data models

Data are kept in different settings and on different computers. Distributed databases need not only addresses for where the data are stored but also an audit trail.

HAP 720

Advantages and Disadvantages of Distributed Databases

• Security of these databases is difficult to maintain.

• Many agreements must be made ahead of time.

• Data loss is limited to nodes affected.

• Decentralized databases are more flexible and allow different units to update and maintain their own data.

• Variation in quality of data

Data-less Information Systems

Distributed databases that hold no data centrally until the need arises, which means fewer problems with patient privacy.

Sometimes called federated databases.
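One way to picture such a system is sketched below, under loud assumptions: the site names, record fields, and function names are all hypothetical, and the slides do not specify how a real data-less system is built. Each site keeps its own records, a query is sent out only when needed, and only an aggregate count comes back, never the patient records themselves.

```python
# Hypothetical sites, each holding its own made-up records locally.
SITES = {
    "hospital_a": [
        {"patient": "a1", "diagnosis": "diabetes"},
        {"patient": "a2", "diagnosis": "asthma"},
    ],
    "hospital_b": [
        {"patient": "b1", "diagnosis": "diabetes"},
        {"patient": "b2", "diagnosis": "diabetes"},
    ],
}


def local_count(records, diagnosis):
    """Runs at the site: counts matching records without exporting them."""
    return sum(1 for record in records if record["diagnosis"] == diagnosis)


def federated_count(sites, diagnosis):
    """Runs centrally: collects one number per site, and no patient data."""
    return {name: local_count(records, diagnosis) for name, records in sites.items()}
```

The central system stores nothing between queries, so there is no duplicated copy of the data to protect.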

 Components of a Data-less System

• Decoder  

• Communicator  

• Analysis

Advantages of the Data-less Information Systems

• The system is substantially less expensive than centralized registries, as it requires no new equipment and few personnel.

• The system does not require duplication of data in different databases.

Inductive Databases

Researchers investigate databases that can answer questions about things that are not explicitly stored in those databases. These databases use artificial intelligence to give plausible answers.

Take Home Lesson

Structure makes it possible to process and analyze large amounts of data.
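One way to see why structure matters is to count how many items must be checked to find one ID. The sketch below is an illustration rather than how any particular DBMS works: it compares a scan of unstructured data with a binary search over sorted data, and the function names are made up for this example.

```python
def linear_steps(items, target):
    """Check items one by one, as with data that has no useful structure."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_steps(sorted_items, target):
    """Halve the search range each step, which requires sorted data."""
    lo, hi, steps = 0, len(sorted_items), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps


ids = list(range(1_000_000))  # one million sorted IDs
```

Finding the last ID takes a million checks with the scan but no more than twenty with the binary search, because each check discards half of the remaining items.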