Database Normalization

Normalization is the process of organizing data in a database efficiently. It has two goals: eliminating redundant data (for example, storing the same data in more than one table) and ensuring that data dependencies make sense (storing only related data in a table and linking tables through relationships). Both goals matter because they reduce the amount of space the database consumes and ensure that data is stored logically. As a result, normalization helps with goals such as performance, availability, and consistency.

The Normal Forms

Database experts have adopted a series of guidelines to ensure that databases are “normalized”. These are referred to as normal forms and are numbered from one to five. In practical applications and in books, you will mostly see 1NF, 2NF, and 3NF, along with the occasional 4NF. Fifth normal form (5NF) is very rarely used.

Let’s explore the normal forms.

First Normal Form (1NF)

First normal form (1NF) sets the basic rules that the other normalization levels build on. After 1NF normalization, the tables should meet the following criteria:

  • A row of data cannot contain repeating groups of similar data (atomicity)
  • Each row of data must have a unique identifier (or Primary Key)
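To make the two rules concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (customer_flat, customer, customer_phone) and the sample rows are made up for illustration. The flat table keeps phone numbers in repeating columns and has no key; the 1NF version gives every customer a primary key and stores one phone number per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Violates 1NF: phone1/phone2 are a repeating group and there is no key.
    CREATE TABLE customer_flat (name TEXT, phone1 TEXT, phone2 TEXT);
    INSERT INTO customer_flat VALUES ('Alice', '555-0100', '555-0101');
    INSERT INTO customer_flat VALUES ('Bob', '555-0200', NULL);

    -- 1NF: each row has a unique identifier and each column holds one value.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE customer_phone (
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        phone       TEXT NOT NULL,
        PRIMARY KEY (customer_id, phone)
    );
""")

# Copy each flat row into the 1NF tables, one phone number per row.
flat_rows = conn.execute(
    "SELECT name, phone1, phone2 FROM customer_flat ORDER BY name"
).fetchall()
for name, phone1, phone2 in flat_rows:
    cur = conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
    for phone in (phone1, phone2):
        if phone is not None:
            conn.execute(
                "INSERT INTO customer_phone VALUES (?, ?)",
                (cur.lastrowid, phone),
            )

phones_1nf = conn.execute("""
    SELECT c.name, p.phone
    FROM customer c JOIN customer_phone p ON p.customer_id = c.customer_id
    ORDER BY c.name, p.phone
""").fetchall()
print(phones_1nf)
```

With this layout a customer can have any number of phone numbers without adding phone3, phone4 and so on to the table.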

Second Normal Form (2NF)

Second normal form (2NF) further addresses the removal of duplicate data. After 2NF normalization, the tables should meet the following criteria:

  • Meet all the requirements of the first normal form.
  • Remove subsets of data that apply to multiple rows of a table, and place them in separate tables.
  • Create relationships between those new tables and their predecessors with the use of foreign keys.
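The 2NF criteria can be sketched as follows, again with Python's sqlite3 module and invented names (order_items_flat, products, order_items). In formal terms, 2NF removes columns that depend on only part of a composite key. Here product_name depends on product_id alone, not on the full (order_id, product_id) key, so it is repeated on every order line for that product and is moved to its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 2NF: product_name depends only on product_id,
    -- which is just one part of the composite primary key.
    CREATE TABLE order_items_flat (
        order_id     INTEGER,
        product_id   INTEGER,
        product_name TEXT,
        quantity     INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO order_items_flat VALUES (1, 10, 'keyboard', 2);
    INSERT INTO order_items_flat VALUES (1, 11, 'mouse', 1);
    INSERT INTO order_items_flat VALUES (2, 10, 'keyboard', 5);

    -- 2NF: the repeated subset gets its own table, linked by a foreign key.
    CREATE TABLE products (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    CREATE TABLE order_items (
        order_id   INTEGER NOT NULL,
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );

    INSERT INTO products
        SELECT DISTINCT product_id, product_name FROM order_items_flat;
    INSERT INTO order_items
        SELECT order_id, product_id, quantity FROM order_items_flat;
""")

product_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]

# Renaming a product is now a single-row update instead of one per order line.
conn.execute(
    "UPDATE products SET product_name = 'mechanical keyboard' WHERE product_id = 10"
)
rows_2nf = conn.execute("""
    SELECT oi.order_id, p.product_name, oi.quantity
    FROM order_items oi JOIN products p ON p.product_id = oi.product_id
    ORDER BY oi.order_id, oi.product_id
""").fetchall()
print(product_count, rows_2nf)
```

The product name is stored once, so it cannot drift out of sync between order lines.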

Third Normal Form (3NF)

Third normal form (3NF) goes one step further. After 3NF normalization, the tables should:

  • Meet all the requirements of the second normal form.
  • Remove columns that are not directly dependent upon the primary key, such as columns that depend on another non-key column.
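As an illustration of the 3NF rule, the sketch below uses hypothetical employees_flat, departments and employees tables in SQLite. In the flat table, department_name is determined by department_id rather than by the employee's primary key, so it is moved into a departments table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 3NF: department_name depends on department_id,
    -- a non-key column, rather than on employee_id itself.
    CREATE TABLE employees_flat (
        employee_id     INTEGER PRIMARY KEY,
        name            TEXT,
        department_id   INTEGER,
        department_name TEXT
    );
    INSERT INTO employees_flat VALUES (1, 'Alice', 100, 'Engineering');
    INSERT INTO employees_flat VALUES (2, 'Bob',   100, 'Engineering');
    INSERT INTO employees_flat VALUES (3, 'Carol', 200, 'Sales');

    -- 3NF: every non-key column depends only on its own table's key.
    CREATE TABLE departments (
        department_id   INTEGER PRIMARY KEY,
        department_name TEXT NOT NULL
    );
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES departments(department_id)
    );

    INSERT INTO departments
        SELECT DISTINCT department_id, department_name FROM employees_flat;
    INSERT INTO employees
        SELECT employee_id, name, department_id FROM employees_flat;
""")

departments_3nf = conn.execute(
    "SELECT department_id, department_name FROM departments ORDER BY department_id"
).fetchall()
employees_3nf = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees e JOIN departments d ON d.department_id = e.department_id
    ORDER BY e.employee_id
""").fetchall()
print(departments_3nf)
print(employees_3nf)
```

Each department name now lives in exactly one row, and the join recovers the original view of the data when it is needed.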

Fourth Normal Form (4NF)

Fourth normal form (4NF) adds one more requirement, which is usually met by splitting a table into smaller tables. The criteria are:

  • Meet all the requirements of the third normal form.
  • Eliminate all multi-valued dependencies.
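A multi-valued dependency is easiest to see with an example. The sketch below assumes a made-up employee_skill_language table in which an employee's skills and spoken languages are independent facts. Storing them together forces one row for every skill and language combination; splitting them into two tables removes the dependency, and a join still rebuilds the original rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 4NF: skills and languages are independent of each other,
    -- so every (skill, language) pair must be stored for each employee.
    CREATE TABLE employee_skill_language (
        employee TEXT,
        skill    TEXT,
        language TEXT,
        PRIMARY KEY (employee, skill, language)
    );
    INSERT INTO employee_skill_language VALUES ('Alice', 'SQL',    'English');
    INSERT INTO employee_skill_language VALUES ('Alice', 'SQL',    'French');
    INSERT INTO employee_skill_language VALUES ('Alice', 'Python', 'English');
    INSERT INTO employee_skill_language VALUES ('Alice', 'Python', 'French');

    -- 4NF: one table per independent multi-valued fact.
    CREATE TABLE employee_skill (
        employee TEXT,
        skill    TEXT,
        PRIMARY KEY (employee, skill)
    );
    CREATE TABLE employee_language (
        employee TEXT,
        language TEXT,
        PRIMARY KEY (employee, language)
    );

    INSERT INTO employee_skill
        SELECT DISTINCT employee, skill FROM employee_skill_language;
    INSERT INTO employee_language
        SELECT DISTINCT employee, language FROM employee_skill_language;
""")

skill_rows = conn.execute("SELECT COUNT(*) FROM employee_skill").fetchone()[0]
language_rows = conn.execute("SELECT COUNT(*) FROM employee_language").fetchone()[0]

# Joining the two smaller tables reproduces the original combined rows.
original = sorted(conn.execute(
    "SELECT employee, skill, language FROM employee_skill_language"
).fetchall())
rejoined = sorted(conn.execute("""
    SELECT s.employee, s.skill, l.language
    FROM employee_skill s JOIN employee_language l ON l.employee = s.employee
""").fetchall())
print(skill_rows, language_rows, original == rejoined)
```

Adding a third language for Alice now takes one new row in employee_language instead of one row per skill in the combined table.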

These normalization guidelines are cumulative. For a database to reach a given level, it must first fulfill all the criteria of the levels below it. Database designers sometimes deviate from these guidelines to meet business requirements.

Detailed steps of normalization will be discussed in future posts, so check our website regularly for updates.
