What is Normalization in the Database?

What is Relational Database?

Relational Database is an extremely simple way of thinking about and managing the data used in business. It is nothing more than the collection of tables and columns.

Why it is called as Relational?

It’s a simple reason. Information (Data) stored in tables.  And tables are independent. Even though, we can easily relate the tables. Tables have columns, rows and names. Tables can be related to each other if they each have a column with common type of information.

What is Normalization?

Normalization is the process of organizing or putting things (information of a business) right (sensibly and logically), making them normal.  It’s the process of reorganization of the elements of the data step-by-step by assuring the normal relationships.

First Normal Form:

à Have the data into separate tables where the data in each table is of a similar type.

à Give each table a Primary Key, which uniquely identifies a row of information.

EMPLOYEE Table

———————

Name (PK)

Age

Department

Depart_Manager

Depart_Address

SKILLS table

—————-

Name (PK)

Skill (PK)

Description

For simplicity assume, Name is Primary Key in EMPLOYEE table.  Since each employee may have several Skills, both Name and Skill is the Primary Key in SKILLS table.

Here Employee’s basic information and Skill’s information separated.  Each table has given the Primary Key.  The Primary Key uniquely identifies an Employee’s information. First Normalization achieved.

Now, if we look at the SKILLS table, even though Name and Skill together form a Primary Key, the Description is only depends on Skill, regardless of whose name is there, that is, Description is only depends on a part of the Primary Key of SKILLS table.  This leads to Second Normal Form.

Second Normal Form:

Here, we take out the data that only depends on a part of the Key.  In our example, we take the Skill and Description out from the SKILLS table, to the 3rd table.  Like the following,

EMPLOYEE Table

———————

Name (PK)

Age

Department

Depart_Manager

Depart_Address

EMPLOYEE_SKILLS table

——————————-

Name (PK)

Skill

SKILLS table

—————-

Skill (PK)

Description

What we achieved here?  Two things we discarded here.

One, for example, if four employees have same Skill means, the Skill description repeated four times.

Two, if one employee has unique Skill that no other employees have and if that employee left the company, then the Skill description would vanished out if we left with First Normal Form.  So, with a Second Normal Form, the Skill and Description will remain in the database even if no employee has the skill.  Skills can be added separately even before locating anyone who has them.

 

Third Normal Form:

3rd Normal Form means, remove anything in the tables that does not depend solely on the Primary Key.

In our example, if we look at EMPLOYEE table.  Department information does not solely depend on an Employee.  The Department information is meaningful and has relation with an Employee, only until the Employee works in that Department.  If an Employee moves out from one department to another department, we need to update his row with new department’s manager name and address.  Department information is independent of whether an Employee works there or not.  Sp Department information moved out to separate table, like the following,

EMPLOYEE Table

———————

Name (PK)

Age

Department

EMPLOYEE_SKILLS table

——————————-

Name (PK)

Skill

SKILLS table

—————-

Skill (PK)

Description

DEPARTMENT table

————————-

Department (PK)

Depart_Manager

Depart_Address

Always the data in the 3rd Normal Form is already automatically in 2nd and 1st Normal Form.  The concept behind 3rd Normal Form is, simply arrange the data so that the columns in each table, other than Primary Key, are solely dependent only on the whole Primary Key.

3rd Normal Form otherwise called as “the Key, the Whole Key or Nothing but the Key”.

This whole process of organizing the information sensibly and logically is called as Normalization.  It’s not that much difficult. Analyzing the “normal” relationships among the various elements of data is called as Normalization.  We need to understand that Normalization is a part of the process of analysis, not design.  Design of a database application includes many other considerations.

 

Also refer the below link

http://www.databasedev.co.uk/database_normalization_process.html

Advertisements

One Trackback

  1. […] https://technicalrecyclebin.wordpress.com/2012/09/16/what-is-normalization-in-the-database/ Like this:LikeBe the first to like this. […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: