SPS Home    >   Dgreath    >   RDBMS    >   Database Normalization

DATABASE NORMALIZATION

The process of optimizing relational database (RDB) tables entails decomposition of the un-normalized [0NF] table into multiple tables to eliminate data redundancy.  Each relational table must be unique and consist of unique unordered columns of unique unordered rows containing consistent atomic data. Typically RDBs are only normalized to 3NF and denormalization is used to improve database performance. 

The degree of optimization is shown below:

[0NF] Zero Normal Form - Un-normalized

  • Primary key undefined
  • Non-atomic values present
  • Dependencies to non-key columns present
  • Candidate key rule violations present (duplicates, nulls, non-persistent)
  • Independent multi-value relations present
  • Join dependency issues
  • Temporal support absent
  • Business rules not expressed as database rules

[1NF] First Normal Form - Atomic Data

  • Must have a primary key defined
  • Every column's data is atomic (data within cannot be subdivided)
  • No column repeats similar data within the table

[2NF] Second Normal Form - Functionally Dependent

  • Must be in first normal form
  • Every non-key column is functionally dependent on the primary key (a definite relationship must exist between the data in the column and it's primary key)

[3NF] Third Normal Form - Solely Dependent

  • Must be in second normal form
  • Every non-key column is functionally dependent only on the primary key (no non-key column can have a definite relationship between the data in the column and another non-key column)

[BCNF] Boyce Codd Normal Form - Candidate Keys

  • Must be in third normal form
  • The only determinates (the means to select rows) are candidate keys

[4NF] Fourth Normal Form - Independent Multi-valued Relations

  • Must be in Boyce Codd normal form
  • All independent multiple relationships must be isolated out into different tables

[5NF] Fifth Normal Form -- Join Dependencies

  • Must be in fourth normal form
  • Every join dependency is implied by the candidate keys 
  • Joining the decomposed tables must produce the original table
  • The table will consist of only two columns, one key and one non-key (other than those required for 6NF)

Note: Although 5NF envisions only two column tables, as a practical matter closely related information could also exist in the table, as, for example, in the case of  Street, City, State, and Zip columns in a mail-list application. Phone numbers, however, would likely be in another linked table because an individual could have zero, one, or more assigned phone numbers.

[6NF] Sixth Normal Form - Bitemporal

  • Must be in fifth normal form
  • Must include columns for valid date-time (e.g. valid-from, valid-to)
  • Must include column for transaction date-time

Note: 6NF introduces the concept of bitemporality--the idea that a record exists in two time contexts, the transaction time (when the event occurred) and its validity interval.  For example, a person could marry on a certain date but they are married for the interval between that point and the earlier of either divorce or death.

[DKNF] Domain Key Normal Form - Rules Based

  • Must be in fifth, or possibly, sixth normal form
  • All business rules are expressed as database rules
  • Frequently unattainable