Normal Forms
Introduction
Normalization is a process that is used for organizing the data in the database by removing data redundancy and improving data integrity. The primary goal of normalizing a given relation is to get rid of Insert, Delete, and Update abnormalities.
This multi-step process involves structuring the database's columns and tables to guarantee that the relationships between columns and tables are correctly enforced by database integrity constraints. In order to be more effective, the normalization procedure typically involves dividing huge tables into smaller ones. Edgar F. Codd established the First Normal Form in 1970, and later, other Normal Forms were defined.

Build your intuition. Is this statement true or false?
Normalization is done by dividing large tables into smaller ones.
Press true if you believe the statement is correct, or false otherwise.
First Normal Form (1NF)
The First Normal Form addresses the problem of atomicity. Atomicity here refers to a table's values not being able to be divided any further. To put it plainly, one cell cannot contain more than one value since, in such cases of composite or multi-valued attributes, the 1NF is violated. If each attribute in a relation is a single-valued attribute, the relation is in the first normal form.
In order to better grasp the concept of 1NF, consider the following example. Let’s say we have a table with the name, department, and office location of the employees of a given company, like so:


Second Normal Form (2NF)
The table must first be in First Normal Form in order for it to be in Second Normal Form. The second condition is for the table to not have any partial dependency. What this means is that the proper subset of the candidate key determines a non-prime attribute.
Let's use an example to grasp the concept of the Second Normal Form. For this and all following examples, imagine a scenario where one employee can work in multiple departments and every department can have more than one manager. Consider the following table:


Third Normal Form (3NF)
The same criterion applies as previously, namely that the table must be in 2NF before moving on to the Third Normal Form. Another requirement is that there be no transitive reliance on non-prime properties. That is, non-prime characteristics (those that do not constitute a candidate key) should not be reliant on other non-prime attributes in the same table. A transitive dependence is a functional dependency in which X → Z (X determines Z) indirectly, through X → Y and Y → Z (where Y → X is not the case).
Let us look at the following table and understand why it is not in 3NF:


Boyce Codd Normal Form (BCNF)
The Boyce Codd Normal Form is also known as 3.5NF since it is a higher version of 3NF which was developed to handle specific sorts of anomalies that 3NF did not solve. Once again, the table must fulfill the 3rd Normal Form before continuing to BCNF. Moreover, every Right-Hand Side (RHS) attribute of the functional dependencies should be dependent on the table's super key.
Now, let’s look at an example in order to better understand the principle of BCNF:

Moreover, note that one Department can be run by multiple Managers, but one Manager can only run one Department. This means that there is a dependency between the Department and the Manager attributes, i.e. the department depends on the manager’s ID (ManagerID -> Department).
You might wonder: why would this be a problem?
Well, the problem here is that the Department is a prime attribute, while the ManagerID is not a primary attribute, therefore, the table doesn’t satisfy the BCNF. In order to solve this issue and put the table in Boyce Codd Normal Form, we have to split the table into two tables, as shown below:

Try this exercise. Is this statement true or false?
For a form to be in Boyce Codd Normal Form, it as to be in the 3rd Normal Form, but doesn’t have to be in the 2nd and 1st Normal Forms.
Press true if you believe the statement is correct, or false otherwise.
One Pager Cheat Sheet
- Normalization is a
multi-step process
of structuring a database's columns and tables to conform to Edgar F. Codd's First Normal Form, thereby ensuring the integrity of the data and avoidingupdate, insert, and deletional abnormalities
. - By dividing a large table into smaller ones, normalization helps to achieve
First Normal Form
(1NF) and eliminates data redundancy and anomalies, as well as improving data integrity for smoother data entry and reducing inconsistency. - First Normal Form is a standard used to ensure that each attribute in a relation is a single-valued attribute.
- The Second Normal Form requires that the data is already in First Normal Form and that there is no partial dependency between the non-key attribute and the candidate key.
- The
Third Normal Form
requires that every non-prime attribute in a table must depend solely on the table's primary key and have notransitive functional dependence
upon other non-prime attributes. - To satisfy the Boyce Codd Normal Form, the table must first fulfill the third normal form and its super key must be the only determinant of the right hand side attributes, thus necessitating its split into more than one table if the existing structure does not adhere to its principles.
- To be truly in BCNF, a form must satisfy the 1st, 2nd, and 3rd Normal Form standards, and have
every Right-Hand Side (RHS) attribute of the functional dependencies
dependent on the super key.