Normalization - Normal Forms



8/7/2019 Normalization - Normal Forms

http://slidepdf.com/reader/full/normalization-normal-forms 1/2

1 Normalization

There is a concept of ‘Normal Forms’ that is often[1] used to design databases and to judge the quality of their design.

There are five normal forms. They’re numbered, conveniently enough, one through five. There are also a bunch of other intermediate forms named after something or other (usually whoever came up with it).

Higher-numbered normal forms have all the good qualities of the lower forms. For example, a database in third-normal form is also in second-normal form and first-normal form. Without putting too fine a point on it, we want our databases to be in as high a normal form as possible. Higher is better.

Generally, you’re only concerned with the first three normal forms, and in this class we’ll also be concerned with Boyce-Codd Normal Form.

1.1 First-Normal Form

This is the most basic normal form, and the only requirement is that data is stored in tables (more precisely, in tables where each field holds a single, atomic value). If your data is stored in tables like that, then you’ve achieved first-normal form.

1.2 Second Normal Form

Ok, here it goes:

“A database is in second-normal form if it is in first-normal form and every attribute is fully functionally dependent on the primary key.”

And now to explain it: primary keys are fields that uniquely identify a record. Attributes are everything else in the record.

Now, functionally dependent means that given a primary key, we can get the value of any attribute. For example, given your student id, we can find your first name and last name: your first and last name are functionally dependent on your student id.
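As a rough illustration of functional dependence, the sketch below checks whether one set of fields determines another in a handful of sample records. The function name `holds` and the student data are made up for this example. Note that sample data can only show that a dependency is violated; it cannot prove that the dependency holds in general.

```python
def holds(rows, determinant, dependent):
    """Return True if equal `determinant` values always come with equal `dependent` values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The first time we see a key we remember its value; afterwards it must match.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample records.
students = [
    {"student_id": 1, "first_name": "Ada", "last_name": "Lovelace"},
    {"student_id": 2, "first_name": "Alan", "last_name": "Turing"},
    {"student_id": 3, "first_name": "Ada", "last_name": "Byron"},
]

# The student id determines the name in this sample.
id_determines_name = holds(students, ["student_id"], ["first_name", "last_name"])
# The first name does not determine the last name: two different people are named Ada.
first_name_determines_last = holds(students, ["first_name"], ["last_name"])
print(id_determines_name, first_name_determines_last)
```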

The fully functionally dependent part mostly applies to composite primary keys. It basically means that the attribute needs to be functionally dependent on the whole primary key (not on one of its parts). For example, some applications use first name, last name, and date of birth as a composite primary key. Every attribute in that record needs to depend on all of first name, last name, and date of birth.
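To make "depends on the whole key" concrete, here is a small sketch that looks for attributes determined by only part of a composite key. The helper names, the `email` and `birth_year` fields, and the sample people are all assumptions made for illustration. As before, a sample can only suggest a partial dependency, not prove its absence.

```python
from itertools import combinations


def determines(rows, determinant, attribute):
    """Return True if `determinant` values never map to two different `attribute` values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


def partial_determinants(rows, key, attribute):
    """List the proper, non-empty subsets of `key` that already determine `attribute`."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if determines(rows, subset, attribute):
                found.append(subset)
    return found


KEY = ("first_name", "last_name", "date_of_birth")

# Hypothetical sample records.
people = [
    {"first_name": "John", "last_name": "Smith", "date_of_birth": "1990-01-01",
     "birth_year": 1990, "email": "a@example.com"},
    {"first_name": "John", "last_name": "Smith", "date_of_birth": "1985-05-05",
     "birth_year": 1985, "email": "b@example.com"},
    {"first_name": "John", "last_name": "Jones", "date_of_birth": "1990-01-01",
     "birth_year": 1990, "email": "c@example.com"},
    {"first_name": "Mary", "last_name": "Smith", "date_of_birth": "1990-01-01",
     "birth_year": 1990, "email": "d@example.com"},
]

# `email` needs the whole key in this sample, so no partial determinant is found.
email_partials = partial_determinants(people, KEY, "email")
# `birth_year` is determined by `date_of_birth` alone: a partial dependency.
birth_year_partials = partial_determinants(people, KEY, "birth_year")
print(email_partials)
print(birth_year_partials)
```

In this sample `birth_year` is the kind of attribute that would break second-normal form, because one part of the key is enough to determine it.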

1.2.1 Design

It is very easy to achieve second-normal form by simply choosing a unique abstract primary key that doesn’t depend on the data, i.e., adding an auto-increment primary key to your tables.
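A minimal sketch of the auto-increment idea, using Python's built-in `sqlite3` module and an in-memory database. The `person` table and its columns are hypothetical; other database systems spell auto-increment differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# person_id is an abstract key: it is generated by the database, not taken from the data.
conn.execute(
    """
    CREATE TABLE person (
        person_id     INTEGER PRIMARY KEY AUTOINCREMENT,
        first_name    TEXT NOT NULL,
        last_name     TEXT NOT NULL,
        date_of_birth TEXT NOT NULL
    )
    """
)

# Two people with identical data still get distinct primary keys.
conn.executemany(
    "INSERT INTO person (first_name, last_name, date_of_birth) VALUES (?, ?, ?)",
    [("John", "Smith", "1990-01-01"), ("John", "Smith", "1990-01-01")],
)

ids = [row[0] for row in conn.execute("SELECT person_id FROM person ORDER BY person_id")]
print(ids)
```

Because the key is a single generated column, no attribute can depend on just part of it.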

[1] Which is to say never.


1.3 Third-Normal Form

“A database is in third-normal form if it is in second-normal form and contains no transitive attribute dependencies.”

This one is a bit easier. What this means is that none of our attributes imply any other attributes. For example, in a “Person” table, we might have a person’s phone, address, and zip code. More often than not, the zip code will imply part of the street address, and possibly the phone’s area code. This is precisely what third-normal form must avoid.
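One way to remove such a dependency is to move the implied attribute into its own table. The sketch below uses Python's built-in `sqlite3` module; the table names, column names, and data are invented for illustration, and it assumes (for the sake of the example) that a zip code determines a city.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The city is stored once per zip code instead of once per person.
    CREATE TABLE zip_code (
        zip  TEXT PRIMARY KEY,
        city TEXT NOT NULL
    );
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        street    TEXT NOT NULL,
        zip       TEXT NOT NULL REFERENCES zip_code(zip)
    );
    INSERT INTO zip_code VALUES ('00001', 'Exampleville');
    INSERT INTO person VALUES (1, 'Ann', '1 Main St', '00001');
    INSERT INTO person VALUES (2, 'Bob', '2 Main St', '00001');
    """
)

# The city is looked up through the zip code when it is needed.
rows = conn.execute(
    """
    SELECT p.name, z.city
    FROM person AS p
    JOIN zip_code AS z ON z.zip = p.zip
    ORDER BY p.person_id
    """
).fetchall()
print(rows)
```

With this layout, correcting a city name means updating one row in `zip_code` rather than every matching person.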

You can’t generally avoid this problem in real life when you’re using someone else’s identification systems in your databases. Let’s just say that for ‘common’ things like addresses and zip codes, it is ‘ok’, while for your own data, no field should imply other fields (i.e., you shouldn’t have ‘date of birth’ and ‘age’ in your database, etc.).
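For the ‘date of birth’ and ‘age’ case, the usual alternative is to store only the date of birth and compute the age when it is needed. A minimal sketch, with a hypothetical helper name `age_on`:

```python
from datetime import date


def age_on(date_of_birth, today):
    """Return the age in whole years on the day `today`."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


print(age_on(date(2000, 8, 7), date(2019, 8, 6)))  # the day before the birthday
print(age_on(date(2000, 8, 7), date(2019, 8, 7)))  # on the birthday
```

Since the age is derived on demand, it can never disagree with the stored date of birth.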

1.4 Boyce-Codd Normal Form

“A relation R is in Boyce-Codd normal form if its primary key, K, implies all nonkey attributes (a, b, c, ...) and K is a superkey.”

Well, a key K is a superkey if it implies all other attributes. If, in addition, no portion of K less than the whole implies all other attributes, then K is a minimal superkey, usually called a candidate key.
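The "implies all other attributes" test can be sketched with the standard attribute-closure computation. The terminology here follows the common textbook usage: a superkey implies every attribute, and a superkey with no removable part is a candidate key. The function names, the enrollment relation, and its dependencies are assumptions made up for this example.

```python
def closure(attributes, dependencies):
    """Return every attribute implied by `attributes` under the given dependencies."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(candidate, all_attributes, dependencies):
    """A superkey implies every attribute of the relation."""
    return closure(candidate, dependencies) >= set(all_attributes)


def is_candidate_key(candidate, all_attributes, dependencies):
    """A candidate key is a superkey that stops being one if any attribute is removed."""
    if not is_superkey(candidate, all_attributes, dependencies):
        return False
    return not any(
        is_superkey(set(candidate) - {a}, all_attributes, dependencies)
        for a in candidate
    )


# Hypothetical relation and dependencies.
ATTRS = {"student_id", "course_id", "grade", "student_name"}
DEPS = [
    (("student_id", "course_id"), ("grade",)),
    (("student_id",), ("student_name",)),
]

print(is_superkey({"student_id"}, ATTRS, DEPS))
print(is_superkey({"student_id", "course_id", "grade"}, ATTRS, DEPS))
print(is_candidate_key({"student_id", "course_id"}, ATTRS, DEPS))
print(is_candidate_key({"student_id", "course_id", "grade"}, ATTRS, DEPS))
```

Checking the removal of one attribute at a time is enough for minimality, because any subset of a non-superkey is also not a superkey.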

Basically, if we pick a new unique id for each record (the auto-increment, etc.) and pick fields ‘appropriately’, then we’ll get Boyce-Codd normal form.

The Boyce-Codd normal form doesn’t exactly fit into the numbered normal form idea: it starts with first-normal form (it assumes that data is in tables). It has been proved[2] that a relation in Boyce-Codd normal form is also in third-normal form. So we can consider Boyce-Codd as a 3.5-normal form.

[2] By someone.
