Database Normalization vs Denormalization
- Authors
- Name
- Amit Shekhar
- Published on
I am Amit Shekhar, Co-Founder @ Outcome School, I have taught and mentored many developers, and their efforts landed them high-paying tech jobs, helped many tech companies in solving their unique problems, and created many open-source libraries being used by top companies. I am passionate about sharing knowledge through open-source, blogs, and videos.
Join Outcome School and get high paying tech job: Outcome School
In this blog, we will learn about database normalization and denormalization.
Database normalization and denormalization are two different ways for database design, each with pros and cons. Let's learn today in detail.
Learning by example is the best way to learn, let's start with an example.
Consider the following users
table in a database:
id | name | username | city |
---|---|---|---|
1 | Amit Shekhar | amitshekhar | Gurgaon |
2 | Pallavi | pallavi | Gurgaon |
3 | Sachin | sachin | Mumbai |
4 | Rahul | rahul | Varanasi |
5 | Pankaj | pankaj | Mumbai |
Here, we can observe that the city Gurgaon
and Mumbai
is getting duplicated. Hence it is denormalized. This way of designing the database is called Database Denormalization.
Also, consider a scenario in which the city name may change due to some reason, for example, Gurgaon
to Gurugram
. We will have to change at multiple places, which may lead to data inconsistencies.
Now, we know the issues with the denormalized design. Let's fix this with the normalized design.
Consider the following users
table in a database:
id | name | username | cityId |
---|---|---|---|
1 | Amit Shekhar | amitshekhar | 1 |
2 | Pallavi | pallavi | 1 |
3 | Sachin | sachin | 2 |
4 | Rahul | rahul | 3 |
5 | Pankaj | pankaj | 2 |
Consider the following cities
table in a database:
id | name |
---|---|
1 | Gurgaon |
2 | Mumbai |
3 | Varanasi |
Here, city names are not getting duplicated, hence storage reduction.
Also, we can change the city name easily by changing it at a single place if got changed due to some reason, hence no data inconsistencies as it is easy to change.
Hence it is normalized. This way of designing the database is called Database Normalization.
Now that we have understood Database Normalization and Denormalization with the help of examples, let's see the pros, cons, and use cases of both ways.
Database Normalization
Database normalization is the way of designing the database focusing on minimizing data duplication and ensuring data integrity.
Pros:
- Minimizes data duplication, hence minimizes storage.
- Easy to update, ensures data integrity.
Cons:
- May slow down query performance, especially for queries that require joining tables.
Use Case:
The system in which the data integrity is important, for example, Banking Systems.
Database Denormalization
Database denormalization is the way of designing the database focusing on query performance.
Pros:
- Better performance as there is no need to join tables.
Cons:
- Increases data redundancy that leads to more storage requirements.
- Makes data updates more complex and prone to inconsistencies if not carefully managed.
Use Case:
The system in which the query performance is important, for example, Real-time Analytics Systems.
This was all about database normalization and denormalization.
That's it for now.
Thanks
Amit Shekhar
Co-Founder @ Outcome School
You can connect with me on:
Follow Outcome School on: