Database Tables

37
Database Tables

description

Database Tables. Review of Definitions. Database: A collection of interrelated data items that are managed as a single unit. Database Management System: The software application that organizes and retrieves data. Common DBMS’s are Oracle, Microsoft SQL Server, DB2, MySQL , and Access. - PowerPoint PPT Presentation

Transcript of Database Tables

Page 1: Database Tables

Database Tables

Page 2: Database Tables

Review of Definitions

• Database: A collection of interrelated data items that are managed as a single unit.

• Database Management System: The software application that organizes and retrieves data. Common DBMS’s are Oracle, Microsoft SQL Server, DB2, MySQL, and Access.

• Relational Database: A database consisting of tables which follow the rules specified by E.F. Codd.

Page 3: Database Tables

More Definitions

• Record: A horizontal row in a table. “Record” and “row” are used interchangeably.

• Field: A vertical column in a table. “Field” and “column” are used interchangeably.

Page 4: Database Tables

Table: A technical Definition

• The term “table” is loosely used to refer to any spreadsheet-like grid of data.

• In a relational database, however, “table” refers to the place in memory where the data is actually stored.

• All DBMS’s offer different ways to look at the data stored in tables, and these frequently look like tables.

• But officially, a table is the place where the data is actually stored—not just a view of the data.

Page 5: Database Tables

Table Example

• Here’s a table of the teams in this year’s Tour de France.• It contains all of the teams, and four fields of information about them.

Page 6: Database Tables

View Example

• A view looks like a table, but it is not the place where the data is stored:

• In this view, we are only showing three of the 22 teams, and only two of the four fields from the Teams table.

• It would be wasteful and confusing to store this information directly in the database along with the table it came from.

Page 7: Database Tables

To Repeat

• In a relational database, a table is where the data is actually stored.

• Other displays of data may look like tables, but if they are not actually the data containers, they are views (or queries), not tables.

Page 8: Database Tables

Entities

• Definition: An entity is a person, place, thing, event, or concept about which data is collected.

• Examples:– Person: Student, Employee, Voter– Place: City, County, State, Country– Thing: Vehicle, Part, Computer– Event: Game, Concert, Class– Concept: Political System, Team,

Page 9: Database Tables

Attributes

• Definition: An attribute is a unit fact that characterizes or describes an entity.

• Suggest attributes for some of the entities listed earlier.

Page 10: Database Tables

Entities, Attributes, Tables, Fields

• In good database design, entities become tables, and attributes become the fields (columns) in those tables.

• However, deciding whether something is an entity or an attribute isn’t always easy.

Page 11: Database Tables

Entity or attribute?

• Suppose a particular company assigns a company car to each employee.

• That car might be considered an attribute of the employee.

• However, a car is itself a thing about which data is collected: Make, model, license number, etc. This makes it an entity.

• So: is car an attribute or an entity?

Page 12: Database Tables

It depends

• If you are just recording one identifier about the cars, such as license plate number, you can consider it an attribute.

• If your database will include other information about the cars, you should treat “car” as an entity.

• In general, entities have attributes; attributes do not.

Page 13: Database Tables

Review Definitions

• Entity: Person, place, thing, event, or concept about which data is collected.

• Attribute: A unit fact that characterizes or describes an entity.

Page 14: Database Tables

Primary Keys

• I said before that a table in a database is where the data is actually stored.

• However, it takes more than just storing data to make a proper database table.

Page 15: Database Tables

It’s not easy being a table• Just containing data isn’t sufficient to be a table.• Consider the grid below. What might be wrong?

• There is no way to tell the first three rows apart.• In a proper database table, each row (record) is different from all

others.• The rows do not need to be unique in every column; rows can

have identical attributes.

Page 16: Database Tables

Good tables have primary keys• Definition: A primary key is a column or set of

columns that uniquely identifies each row in a table.

• A key is a way to identify a particular record in a table. The primary key is a field, or collection of fields, that allows for quick access to a record.

Page 17: Database Tables

Candidate Keys• For some entities, there may be more than

one field or collection of fields that can serve as the primary key.

• These are called candidate keys.

Page 18: Database Tables

Candidate Keys• All candidate keys can open the lock; • that is, they can uniquely identify a record in a

table.• Consider a table containing U of M students. • What are some possible candidate keys?

Page 19: Database Tables

Compound Keys• The primary key doesn’t have to be a single column. In many cases, the

combination of two or more fields can serve as the primary key. For example, consider a table of all of the courses offered at the University.

• The course number would not be sufficient to identify a single course; there are 101’s in many departments, and probably plenty of 373’s as well.

• Department won’t work either; there are many courses in IOE, as well as Psychology, German, and Economics.

• But when you combine department and course number, you have uniquely identified a course.

• Therefore, those two fields together can serve as the primary key for the Course table.

• Definition--Compound Key: A primary key consisting of two or more fields.

• Definition—Simple Key: A primary key consisting of one field.

Page 20: Database Tables

Choosing a Key• Picking/assigning a primary key can be tricky, and is a

matter of some controversy. • Some books recommend that you always create a

surrogate or artificial key; by default, Access does this for (or to?) you.

• The Databases Demystified/Goodsell approach: 1. if the table already has a natural primary key use it; 2. if not, then use a combination of fields that guarantees

uniqueness; 3. only if you can’t find a natural simple or compound key should

you resort to creating a surrogate key.

Page 21: Database Tables

“Natural” Keys• Many entities already have widely accepted and enforced

keys that you can use in your database—these are termed “natural” keys.

• Definition: A “natural key” is a pre-existing or ready-made field which can serve as the primary key for a table.

• For employees, social security number is widely used. You can have two Tim Joneses working for you, but only one 123-45-6789.

• For vehicles, you can use Vehicle Identification Number (VIN), a unique 17-character code that every vehicle manufactured is required to have.

• Most “natural” keys aren’t really natural; they are simply artificial or “surrogate” keys that someone else created and that are now widely recognized.

Page 22: Database Tables

Election Day• Databases Demystified spells out how to choose a primary key:

1. If there is only one candidate, choose it.2. Choose the candidate least likely to have its value change.

Changing primary key values once we store the data in tables is a complicated matter because the primary key can appear as a foreign key in many other tables. Incidentally, surrogate keys are almost always less likely to change compared with natural keys.

3. Choose the simplest candidate. The one that is composed of the fewest number of attributes is considered the simplest.

4. Choose the shortest candidate. This is purely an efficiency consideration. However, when a primary key can appear in many tables as a foreign key, it is often worth it to save some space with each one.

Page 23: Database Tables

Surrogate Keys• Occasionally, you will come across a table that has NO candidate

keys.• In this case, you must resort to a surrogate key (Databases

Demystified refers to this as an “act of desperation”)• Definition: A “surrogate key” is a field (usually numeric) added to

a table as the primary key when no natural keys are available.• When you have to resort to a surrogate key, don’t try to put any

meaning into it. • The values in surrogate key fields frequently become important

identification numbers: Social Security Numbers, Student IDs, VINs, SKUs (in stores), etc. They are frequently associated with some sort of ID card or tag.

Page 24: Database Tables

Surrogate Key Example

• Suppose you run a chicken farm with lots of chickens.• You want to be able to track the chickens—when they

were hatched, weights at certain ages, how many eggs they lay, etc.

• The chickens don’t have names, social security numbers, uniqnames, or any other simple way to tell them apart.

• Therefore, you put a numbered tag on each chicken. This number becomes the surrogate key in your Chickens table.

Page 25: Database Tables

Surrogate KeysIt is tempting to make the primary key be a sort of code: a grocery store might say “let’s make a product key out of department, brand, name, and size,” yielding a key value something like

“CEREAL-KELL-FROO-15”.

These are keys that manage to violate normalization rules in one data cell! If Post buys

Kelloggs, do you change the key for Froot Loops?

If you must have a surrogate key, make it an integer with no extra meaning attached.

Page 26: Database Tables

Natural vs. Surrogate Keys• Using a surrogate key when a natural key is available can defeat the

whole purpose of having a primary key. This example shows how using an AutoNumber primary key allows duplicate entries to be made (StaffID is the primary key):

If SocialSecurityNumber were the primary key, this duplicate entry would not be allowed by the DBMS.

Page 27: Database Tables

Formatting Conventions• Tables should generally be named as plural nouns, with the

first letter capitalized: Customers, Orders, Products, etc.• Table names should not have any spaces or punctuation in

them. The following are bad table names: Bob’s Customers, Back Orders, hours/day. (Details later.)

• When designing a table, the primary key field or fields should be at the far left of the table.

• It is common to see the primary key field(s) highlighted in some way—underlined, bold, with an asterisk, etc.

• In online quiz 2, we highlight the primary key in orange, like this:

Midterm Material!

Page 28: Database Tables

Rules for Table and Field Names

• No Spaces! Access will allow you to name a table “My Company’s Employees”, but this causes many problems down the road, especially when interfacing with VB.

• “Employees” would be a much better name • The whole database probably relates to “My

Company”, so there’s no need to include that in the table name. Even if there were—NO SPACES!!!!!

Page 29: Database Tables

Rules for Table and Field Names

• No punctuation: Most DBMS’s (except for Access) won’t allow any punctuation in the name of a field or table except the underscore (_).

• I’m not a fan of underscores, either—they tend to be obscured by hyperlinks and such.

• For this class, if your table or field needs a multi-word name, use InteriorCapitals.

Page 30: Database Tables

Access Keywords

• Access has many keywords with special meaning that cannot be used as field or table names.

• A complete list is here.

Page 31: Database Tables

Basic Table Design

• “Designing a table” means defining the fields (columns) in the table.

• Adding rows (data) to a table isn’t designing it—the proper term for adding rows is “populating” the table.

Page 32: Database Tables

Defining Fields

• In Excel, designing an column is easy—you just type a word or two in the cell at the top of a column.

• For a true relational database, designing columns takes more work. You must give a column a legal name (which varies somewhat between DBMS’s), a data type, and sometimes some constraints as well.

Page 33: Database Tables

Designing a “Uniqname” Column

• If you were creating a table of UM students in Excel, you might just type the word “Uniqname” in cell A1 to get started.

Page 34: Database Tables

Relational Database Column Design

• In a true relational database, you would define the column this way:– The name, “Uniqname”, is valid in every DBMS I

know about, so you can use that. (Valid field names are discussed later.)

– The data type would be Text in Access (some DBMS’s might call it something else—String or Varchar, for example)

– Constraints could include:• Minimum of 1 character• Maximum of 8 characters• Letters only; no numbers, spaces, or punctuation

Page 35: Database Tables

Practice• Which one of the following is a good table name? – IOE 373 Students– University– Stores– Miles/Gallon

• Which one of the following is NOT a good table name?– Women– Order Details– OrderDetails– Locations

Page 36: Database Tables

Foreign Keys

• Definition: A foreign key is a field (or fields) in a table that is not the primary key in that table, but IS the primary key in another table.

• Foreign keys are used for creating relationships (linking tables together), something we’ll look at in the next lecture.

Page 37: Database Tables

Don’t Forget!• Good relational database design is about optimizing

how the data is STORED, not how it is DISPLAYED.• Most “tables” you have seen—in books, in lectures, on

the web—were probably optimized for display, not for storage.

• Relational database tables are designed for consistency and to reduce redundancy. They are not designed for appearance.

• When we learn SQL and Visual Basic, we will look at various ways to display the data stored in relational database tables.