Normalization is the process of breaking data down to reduce dependency and redundancy. There are six forms, but the original three are the most important:
First form is all about atomicity, so should be avoided:
- Series of columns with the same type of elements
- Columns containing lists of elements
This table wouldn't be normalized:
USER_ID USER EMAIL1 EMAIL2 EMAIL3 -------------------------------------------------------------------------------- 1 Mike mike@domain.com mike2@domain.com mike3@domain.com 2 Ann anni@domain.com anni2@domain.com anni3@domain.com
This neither:
USER_ID USER EMAILS -------------------------------------------------------------------------------- 1 Mike mike@domain.com, mike2@domain.com, mike3@domain.com 2 Ann anni@domain.com, anni2@domain.com, anni3@domain.com
The solution is to break down the table into the USER table and the EMAIL table.
USER_ID USER ---------------- 1 Mike 2 Ann USER_ID EMAIL ---------------- 1 mike@domain.com 1 mike2@domain.com
The second form is met when:
- The table is in 1NF
- There isn't any non-prime attribute dependent on any subset of any candidate key (where a candidate key is a minimal set of attributes with no row duplicates and a non-prime is any other attributes)
This table is not 2NF normalized, because BUILDING depends only on IP and not on the primary key composed by IP and LICENSE.
IP LICENSE BUILDING -------------------------------------------------------------------------------- 10.1.14.15 Office E11 10.1.14.15 Photoshop E11 10.1.14.16 Office E12 10.1.14.16 Photoshop E12
It's normalized by breaking down the table into the IP table and the LICENSE table.
IP BUILDING ---------------------------- 10.1.14.15 E11 10.1.14.16 E12 LICENSE IP ---------------------------- Office 10.1.14.15 Photoshop 10.1.14.15 Office 10.1.14.16 Photoshop 10.1.14.16
The third form is met when:
- The table is in 2NF
- Every non-prime attribute is non-transitively dependent on any candidate key.
This table is not 3NF normalized, because AGE depends on NAME which depends on the primary key ORDER_ID
ORDER_ID DATE NAME AGE ------------------------------------------------- 12453244 11-11-2010 Mike 22 12453245 12-11-2010 Clare 27
It's normalized by breaking down the table into the ORDER table and the USER table.
ORDER_ID DATE USER_ID ---------------------------------------- 12453244 11-11-2010 11 12453245 12-11-2010 12 USER_ID NAME AGE ------------------------------------------------- 11 Mike 22 12 Clare 27
The 3.5 form or Boyce-Codd Normal Form (BCNF) is met when:
A determinant is an attribute or group of attributes that determines an other attribute called functional dependency.
Determinant -> Functional dependency
For example an id is the determinant and the name the functional dependecy. This means each name can be associated with one id and not more. And this means once you know the id you can also know the name (there are not two names with the same id). For one determinant value there may be many functional dependencies.
This is a table not normalized in 3.5F:
ROOM TIME FLOOR --------------------- 1 14:00 1 2 15:00 1 3 16:00 2
This table is not normalized 3.5 because I could insert the same room in two different floors. But this is not possible!
Floor (determinant) -> Room (functional dependency).
Floor is not a candidate key but a subset of it (floor, room, time), so that's table is not compliant with the above rule.
It's normalized by breaking down the table:
ROOM TIME ------------- 1 14:00 2 15:00 3 16:00 ROOM FLOOR ------------- 1 1 2 1 3 2
Copyright © 2013 Welcome to the website of Davis Fiore. All Rights Reserved.