Database Normalization vs Denormalization

I am Amit Shekhar, Co-Founder @ Outcome School, I have taught and mentored many developers, and their efforts landed them high-paying tech jobs, helped many tech companies in solving their unique problems, and created many open-source libraries being used by top companies. I am passionate about sharing knowledge through open-source, blogs, and videos.

Join Outcome School and get high paying tech job: Outcome School

Before we begin, we’d like to mention that we’ve launched our YouTube channel. Subscribe to the Outcome School YouTube Channel.

In this blog, we will learn about database normalization and denormalization.

Database normalization and denormalization are two different ways for database design, each with pros and cons. Let's learn today in detail.

Learning by example is the best way to learn, let's start with an example.

Consider the following users table in a database:

id	name	username	city
1	Amit Shekhar	amitshekhar	Gurgaon
2	Pallavi	pallavi	Gurgaon
3	Sachin	sachin	Mumbai
4	Rahul	rahul	Varanasi
5	Pankaj	pankaj	Mumbai

Here, we can observe that the city Gurgaon and Mumbai is getting duplicated. Hence it is denormalized. This way of designing the database is called Database Denormalization.

Also, consider a scenario in which the city name may change due to some reason, for example, Gurgaon to Gurugram. We will have to change at multiple places, which may lead to data inconsistencies.

Now, we know the issues with the denormalized design. Let's fix this with the normalized design.

Consider the following users table in a database:

id	name	username	cityId
1	Amit Shekhar	amitshekhar	1
2	Pallavi	pallavi	1
3	Sachin	sachin	2
4	Rahul	rahul	3
5	Pankaj	pankaj	2

Consider the following cities table in a database:

id	name
1	Gurgaon
2	Mumbai
3	Varanasi

Here, city names are not getting duplicated, hence storage reduction.

Also, we can change the city name easily by changing it at a single place if got changed due to some reason, hence no data inconsistencies as it is easy to change.

Hence it is normalized. This way of designing the database is called Database Normalization.

Now that we have understood Database Normalization and Denormalization with the help of examples, let's see the pros, cons, and use cases of both ways.

Database Normalization

Database normalization is the way of designing the database focusing on minimizing data duplication and ensuring data integrity.

Pros:

Minimizes data duplication, hence minimizes storage.
Easy to update, ensures data integrity.

Cons:

May slow down query performance, especially for queries that require joining tables.

Use Case:

The system in which the data integrity is important, for example, Banking Systems.

Database Denormalization

Database denormalization is the way of designing the database focusing on query performance.

Pros:

Better performance as there is no need to join tables.

Cons:

Increases data redundancy that leads to more storage requirements.
Makes data updates more complex and prone to inconsistencies if not carefully managed.

Use Case:

The system in which the query performance is important, for example, Real-time Analytics Systems.

This was all about database normalization and denormalization.

That's it for now.

Thanks

Amit Shekhar
Co-Founder @ Outcome School

You can connect with me on:

Follow Outcome School on:

Read all of our high-quality blogs here.

Database Normalization

Database Denormalization

Tags