Database Tables. Review of Definitions Database: A collection of interrelated data items that are...

37
Database Tables
  • date post

    15-Jan-2016
  • Category

    Documents

  • view

    233
  • download

    0

Transcript of Database Tables. Review of Definitions Database: A collection of interrelated data items that are...

Page 1: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Database Tables

Page 2: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Review of Definitions

• Database: A collection of interrelated data items that are managed as a single unit.

• Database Management System: The software application that organizes and retrieves data. Common DBMS’s are Oracle, Microsoft SQL Server, DB2, MySQL, and Access.

• Relational Database: A database consisting of tables which follow the rules specified by E.F. Codd.

Page 3: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

More Definitions

• Record: A horizontal row in a table. “Record” and “row” are used interchangeably.

• Field: A vertical column in a table. “Field” and “column” are used interchangeably.

Page 4: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Table: A technical Definition

• The term “table” is loosely used to refer to any spreadsheet-like grid of data.

• In a relational database, however, “table” refers to the place in memory where the data is actually stored.

• All DBMS’s offer different ways to look at the data stored in tables, and these frequently look like tables.

• But officially, a table is the place where the data is actually stored—not just a view of the data.

Page 5: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Table Example

• Here’s a table of the teams in this year’s Tour de France.• It contains all of the teams, and four fields of information about them.

Page 6: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

View Example

• A view looks like a table, but it is not the place where the data is stored:

• In this view, we are only showing three of the 22 teams, and only two of the four fields from the Teams table.

• It would be wasteful and confusing to store this information directly in the database along with the table it came from.

Page 7: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

To Repeat

• In a relational database, a table is where the data is actually stored.

• Other displays of data may look like tables, but if they are not actually the data containers, they are views (or queries), not tables.

Page 8: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Entities

• Definition: An entity is a person, place, thing, event, or concept about which data is collected.

• Examples:– Person: Student, Employee, Voter– Place: City, County, State, Country– Thing: Vehicle, Part, Computer– Event: Game, Concert, Class– Concept: Political System, Team,

Page 9: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Attributes

• Definition: An attribute is a unit fact that characterizes or describes an entity.

• Suggest attributes for some of the entities listed earlier.

Page 10: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Entities, Attributes, Tables, Fields

• In good database design, entities become tables, and attributes become the fields (columns) in those tables.

• However, deciding whether something is an entity or an attribute isn’t always easy.

Page 11: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Entity or attribute?

• Suppose a particular company assigns a company car to each employee.

• That car might be considered an attribute of the employee.

• However, a car is itself a thing about which data is collected: Make, model, license number, etc. This makes it an entity.

• So: is car an attribute or an entity?

Page 12: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

It depends

• If you are just recording one identifier about the cars, such as license plate number, you can consider it an attribute.

• If your database will include other information about the cars, you should treat “car” as an entity.

• In general, entities have attributes; attributes do not.

Page 13: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Review Definitions

• Entity: Person, place, thing, event, or concept about which data is collected.

• Attribute: A unit fact that characterizes or describes an entity.

Page 14: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Primary Keys

• I said before that a table in a database is where the data is actually stored.

• However, it takes more than just storing data to make a proper database table.

Page 15: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

It’s not easy being a table• Just containing data isn’t sufficient to be a table.• Consider the grid below. What might be wrong?

• There is no way to tell the first three rows apart.• In a proper database table, each row (record) is different from

all others.• The rows do not need to be unique in every column; rows can

have identical attributes.

Page 16: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Good tables have primary keys• Definition: A primary key is a column or set of

columns that uniquely identifies each row in a table.

• A key is a way to identify a particular record in a table. The primary key is a field, or collection of fields, that allows for quick access to a record.

Page 17: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Candidate Keys• For some entities, there may be more than

one field or collection of fields that can serve as the primary key.

• These are called candidate keys.

Page 18: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Candidate Keys• All candidate keys can open the lock; • that is, they can uniquely identify a record in a

table.• Consider a table containing U of M students. • What are some possible candidate keys?

Page 19: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Compound Keys• The primary key doesn’t have to be a single column. In many cases,

the combination of two or more fields can serve as the primary key. For example, consider a table of all of the courses offered at the University.

• The course number would not be sufficient to identify a single course; there are 101’s in many departments, and probably plenty of 373’s as well.

• Department won’t work either; there are many courses in IOE, as well as Psychology, German, and Economics.

• But when you combine department and course number, you have uniquely identified a course.

• Therefore, those two fields together can serve as the primary key for the Course table.

• Definition--Compound Key: A primary key consisting of two or more fields.

• Definition—Simple Key: A primary key consisting of one field.

Page 20: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Choosing a Key

• Picking/assigning a primary key can be tricky, and is a matter of some controversy.

• Some books recommend that you always create a surrogate or artificial key; by default, Access does this for (or to?) you.

• The Databases Demystified/Goodsell approach: 1. if the table already has a natural primary key use it; 2. if not, then use a combination of fields that guarantees

uniqueness; 3. only if you can’t find a natural simple or compound key

should you resort to creating a surrogate key.

Page 21: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

“Natural” Keys• Many entities already have widely accepted and enforced

keys that you can use in your database—these are termed “natural” keys.

• Definition: A “natural key” is a pre-existing or ready-made field which can serve as the primary key for a table.

• For employees, social security number is widely used. You can have two Tim Joneses working for you, but only one 123-45-6789.

• For vehicles, you can use Vehicle Identification Number (VIN), a unique 17-character code that every vehicle manufactured is required to have.

• Most “natural” keys aren’t really natural; they are simply artificial or “surrogate” keys that someone else created and that are now widely recognized.

Page 22: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Election Day

• Databases Demystified spells out how to choose a primary key:1. If there is only one candidate, choose it.2. Choose the candidate least likely to have its value change.

Changing primary key values once we store the data in tables is a complicated matter because the primary key can appear as a foreign key in many other tables. Incidentally, surrogate keys are almost always less likely to change compared with natural keys.

3. Choose the simplest candidate. The one that is composed of the fewest number of attributes is considered the simplest.

4. Choose the shortest candidate. This is purely an efficiency consideration. However, when a primary key can appear in many tables as a foreign key, it is often worth it to save some space with each one.

Page 23: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Surrogate Keys• Occasionally, you will come across a table that has NO candidate

keys.• In this case, you must resort to a surrogate key (Databases

Demystified refers to this as an “act of desperation”)• Definition: A “surrogate key” is a field (usually numeric) added to

a table as the primary key when no natural keys are available.• When you have to resort to a surrogate key, don’t try to put any

meaning into it. • The values in surrogate key fields frequently become important

identification numbers: Social Security Numbers, Student IDs, VINs, SKUs (in stores), etc. They are frequently associated with some sort of ID card or tag.

Page 24: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Surrogate Key Example

• Suppose you run a chicken farm with lots of chickens.• You want to be able to track the chickens—when they

were hatched, weights at certain ages, how many eggs they lay, etc.

• The chickens don’t have names, social security numbers, uniqnames, or any other simple way to tell them apart.

• Therefore, you put a numbered tag on each chicken. This number becomes the surrogate key in your Chickens table.

Page 25: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Surrogate KeysIt is tempting to make the primary key be a sort of code: a grocery store might say “let’s make a product key out of department, brand, name, and size,” yielding a key value something like

“CEREAL-KELL-FROO-15”.

These are keys that manage to violate normalization rules in one data cell! If Post buys

Kelloggs, do you change the key for Froot Loops?

If you must have a surrogate key, make it an integer with no extra meaning attached.

Page 26: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Natural vs. Surrogate Keys• Using a surrogate key when a natural key is available can defeat the

whole purpose of having a primary key. This example shows how using an AutoNumber primary key allows duplicate entries to be made (StaffID is the primary key):

If SocialSecurityNumber were the primary key, this duplicate entry would not be allowed by the DBMS.

Page 27: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Formatting Conventions• Tables should generally be named as plural nouns, with the

first letter capitalized: Customers, Orders, Products, etc.• Table names should not have any spaces or punctuation in

them. The following are bad table names: Bob’s Customers, Back Orders, hours/day. (Details later.)

• When designing a table, the primary key field or fields should be at the far left of the table.

• It is common to see the primary key field(s) highlighted in some way—underlined, bold, with an asterisk, etc.

• In online quiz 2, we highlight the primary key in orange, like this:

Midterm Material!

Page 28: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Rules for Table and Field Names

• No Spaces! Access will allow you to name a table “My Company’s Employees”, but this causes many problems down the road, especially when interfacing with VB.

• “Employees” would be a much better name • The whole database probably relates to “My

Company”, so there’s no need to include that in the table name. Even if there were—NO SPACES!!!!!

Page 29: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Rules for Table and Field Names

• No punctuation: Most DBMS’s (except for Access) won’t allow any punctuation in the name of a field or table except the underscore (_).

• I’m not a fan of underscores, either—they tend to be obscured by hyperlinks and such.

• For this class, if your table or field needs a multi-word name, use InteriorCapitals.

Page 30: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Access Keywords

• Access has many keywords with special meaning that cannot be used as field or table names.

• A complete list is here.

Page 31: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Basic Table Design

• “Designing a table” means defining the fields (columns) in the table.

• Adding rows (data) to a table isn’t designing it—the proper term for adding rows is “populating” the table.

Page 32: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Defining Fields

• In Excel, designing an column is easy—you just type a word or two in the cell at the top of a column.

• For a true relational database, designing columns takes more work. You must give a column a legal name (which varies somewhat between DBMS’s), a data type, and sometimes some constraints as well.

Page 33: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Designing a “Uniqname” Column

• If you were creating a table of UM students in Excel, you might just type the word “Uniqname” in cell A1 to get started.

Page 34: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Relational Database Column Design

• In a true relational database, you would define the column this way:– The name, “Uniqname”, is valid in every DBMS I

know about, so you can use that. (Valid field names are discussed later.)

– The data type would be Text in Access (some DBMS’s might call it something else—String or Varchar, for example)

– Constraints could include:• Minimum of 1 character• Maximum of 8 characters• Letters only; no numbers, spaces, or punctuation

Page 35: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Practice• Which one of the following is a good table name? – IOE 373 Students– University– Stores– Miles/Gallon

• Which one of the following is NOT a good table name?– Women– Order Details– OrderDetails– Locations

Page 36: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Foreign Keys

• Definition: A foreign key is a field (or fields) in a table that is not the primary key in that table, but IS the primary key in another table.

• Foreign keys are used for creating relationships (linking tables together), something we’ll look at in the next lecture.

Page 37: Database Tables. Review of Definitions Database: A collection of interrelated data items that are managed as a single unit. Database Management System:

Don’t Forget!• Good relational database design is about optimizing

how the data is STORED, not how it is DISPLAYED.• Most “tables” you have seen—in books, in lectures,

on the web—were probably optimized for display, not for storage.

• Relational database tables are designed for consistency and to reduce redundancy. They are not designed for appearance.

• When we learn SQL and Visual Basic, we will look at various ways to display the data stored in relational database tables.