Normalization in DBMS with Examples

Normalization in DBMS is the systematic process of organizing relational tables to reduce redundancy and prevent update, insert, and delete anomalies. It matters because a poorly designed payments table can store the same customer, bank, and transaction details many times, making corrections risky. After reading, you can normalize real schemas step by step.

Normalization in DBMS with examples sits at the center of relational database design, especially when schemas power banking, healthcare, SaaS, ed-tech, and e-commerce systems. It relies on keys, functional dependencies, multivalued dependencies, and lossless decomposition;

You will be able to identify the correct normal form, decompose tables safely up to BCNF, 4NF, and 5NF, write SQL DDL for normalized schemas, and answer GATE or interview questions on 1NF, 2NF, 3NF, and dependency preservation.

Who This Guide Is For

This guide is specifically designed for:

Core Concepts

Before applying types of normalization in DBMS, you need five foundations: attributes, keys, functional dependencies, anomalies, and decomposition. The normal forms in DBMS with examples progress from removing repeating groups in 1NF to handling join dependencies in 5NF and advanced domain-key constraints in DKNF.

1.Dependencies and Keys

A functional dependency says that one set of attributes determines another set. If AadhaarNumber determines LegalName and DateOfBirth in a citizen identity table, the dependency is AadhaarNumber → LegalName, DateOfBirth. In an industry setting, a PaymentGatewayTxnId can determine amount, timestamp, merchant account, and settlement status for a UPI transaction. Keys use dependencies to identify rows: a superkey uniquely identifies rows, a candidate key is a minimal superkey, a primary key is the chosen candidate key, and a foreign key points to a key in another table. Normalization starts by asking which facts depend on which identifiers, not by blindly splitting tables.

In GATE-style questions, first compute attribute closure. If X plus can derive all attributes in relation R, then X is a superkey; if no proper subset of X is a superkey, X is a candidate key.

Code Example

2.Anomalies and Redundancy

Normalization exists because redundant tables create anomalies. An update anomaly occurs when the same fact appears in many rows and one copy is changed but another is missed. An insert anomaly occurs when you cannot add one fact without another unrelated fact. A delete anomaly occurs when deleting one row accidentally removes the only record of another fact. A familiar example is a grocery billing sheet that repeats supplier phone numbers for every item sold; changing a supplier number becomes error-prone. In healthcare, a patient visit table that repeats doctor specialization and clinic license details for every appointment can lose clinic information if the last appointment row is deleted.

A table is not automatically normalized just because it has a primary key. If non-key facts repeat across rows or depend on the wrong determinant, anomalies can still exist.

Code Example

3.Unnormalized Form

Unnormalized Form, often called UNF or 0NF, is not a real normal form target; it is the raw state before normalization. It contains repeating groups, nested records, multi-valued cells, or columns such as Phone1, Phone2, Phone3. A familiar Indian example is a housing society spreadsheet where one flat row stores all resident names and vehicle numbers in comma-separated cells. An industry example is a logistics shipment extract where one row contains multiple scan events, hub names, and timestamps as a single JSON-like text field. UNF is common in Excel imports and API dumps, but relational design requires separating each independent fact into rows and tables.

UNF is the starting mess, not the destination. If a column needs splitting with commas, pipes, JSON parsing, or repeated numbered columns, the table is not in 1NF.

Code Example

4.First Normal Form

First normal form in DBMS requires atomic values, no repeating groups, and identifiable rows. Atomic means each cell contains one value of the correct type, not a list. In an IRCTC booking scenario, passenger names should not be stored as one comma-separated column inside a ticket row; each passenger on a PNR should be a separate row linked to the booking. In a pathology lab system, one invoice should not store all test codes in a single text field; each ordered test needs its own row so prices, reports, and status can be tracked independently. 1NF improves filtering, validation, indexing, and joins, but it does not solve partial or transitive dependencies.

A relation with multi-valued attributes violates 1NF. The standard fix is to create a separate row per value or a child table with a foreign key to the parent entity.

Code Example

5.Second Normal Form

Second Normal Form requires the table to be in 1NF and every non-prime attribute to depend on the whole candidate key, not just part of a composite key. This is relevant only when the key has multiple attributes. In an e-commerce order line table with key OrderId + ProductId, ProductName depends only on ProductId, while Quantity depends on the full line. That partial dependency violates 2NF. In SaaS billing, if a subscription add-on table is keyed by AccountId + AddOnCode, AddOnName and MonthlyRate may depend only on AddOnCode, so they belong in an add-on catalog. For deeper practice, compare this with Second Normal Form in DBMS.

2NF is about partial dependency on a composite candidate key. If the primary key has only one attribute, the relation is automatically free from partial dependency, though it may still violate 3NF or BCNF.

Code Example

6.Third Normal Form

3NF in DBMS requires the relation to be in 2NF and removes transitive dependencies among non-key attributes. A transitive dependency appears when Key → NonKey1 and NonKey1 → NonKey2. A familiar example is CustomerId → Pincode and Pincode → City, State; storing city and state repeatedly in the customer table creates update issues when postal data changes. In an insurance platform, PolicyId may determine AgentCode, and AgentCode may determine AgentName and BranchCity; agent details should live in an agent master table. 3NF is widely used in production OLTP databases because it balances clean design with practical dependency preservation.

The standard 3NF test for every functional dependency X to A is: X is a superkey, or A is a prime attribute. This is why some 3NF decompositions preserve dependencies that BCNF may not.

Code Example

7.Boyce-Codd Normal Form

BCNF is a stricter version of 3NF. For every non-trivial functional dependency X → Y, X must be a superkey. A relation can be in 3NF but not BCNF when a determinant is not a candidate key, especially with overlapping candidate keys. A familiar example is a coaching institute timetable where each Room at a time has one Mentor, and each Mentor at that time has one Room, but the stored relation also includes Topic. In an airline maintenance system, if a certified inspector determines a hangar for a shift, and aircraft checks are recorded in the same table, inspector-to-hangar facts may cause BCNF violations. BCNF removes stronger anomaly risks, but decomposition can sometimes lose dependency preservation.

Do not assume 3NF means BCNF. BCNF rejects every non-superkey determinant, while 3NF allows a dependency if the dependent attribute is prime.

Code Example

8.Fourth Normal Form

4NF handles multivalued dependencies, written as X ↠ Y, where one entity independently has multiple values of one attribute and multiple values of another attribute. If these independent lists are stored together, the table creates unnecessary combinations. A familiar example is a restaurant that has multiple cuisines and multiple delivery zones; cuisines and zones are independent facts, so combining them in one row set creates a Cartesian explosion. In healthcare, a doctor may speak multiple languages and practise at multiple clinics; language ability and clinic availability should be stored separately unless a specific language is available only at a specific clinic. 4NF requires BCNF plus no non-trivial multivalued dependency unless the determinant is a superkey.

A common 4NF question gives independent multivalued attributes. If R(A, B, C) has A ↠ B and A ↠ C independently, decompose into R1(A, B) and R2(A, C).

Code Example

9.Fifth Normal Form

5NF, also called Project-Join Normal Form, deals with join dependencies that cannot be handled by functional or multivalued dependencies alone. It applies when a valid fact is truly determined by the lossless combination of three or more projections. A familiar example is an event vendor marketplace where a caterer can serve certain dishes, operate in certain venues, and specific dish-venue combinations are allowed; storing every vendor-dish-venue triple may introduce false assumptions unless the business rules are decomposed correctly. In manufacturing, a supplier may be approved for parts, plants, and part-plant combinations; a supply record is valid only when all required pairwise approvals exist. 5NF is less common in day-to-day CRUD design but appears in advanced database theory and high-integrity rule systems.

5NF is rare in ordinary applications, but it is useful when business validity comes from multiple independent approval relationships that must join without creating spurious tuples.

Code Example

10.Domain-Key Normal Form

Domain-Key Normal Form is a theoretical ideal where every constraint on the database follows from domain constraints and key constraints. A domain constraint limits valid values, such as account status being only ACTIVE, FROZEN, or CLOSED. A key constraint ensures uniqueness and identity. A familiar example is a PAN verification table where PAN format, uniqueness, and taxpayer category domains capture most rules. In banking, a loan account table may enforce account number uniqueness, allowed loan types, sanctioned amount ranges, and valid risk grades. DKNF is difficult because many real rules are cross-row, temporal, or procedural, but thinking in DKNF helps developers move business rules from application code into database constraints where possible.

DKNF is usually discussed as an ideal normal form, not a routine production target. Mention domains, keys, and constraints clearly if asked in an interview.

Code Example

11.Sixth Normal Form

6NF decomposes relations so that each relation represents an irreducible fact, often with time validity. It is mainly associated with temporal databases and data vault-style modeling, not standard undergraduate normalization problems. A familiar example is a prepaid mobile plan where plan price, data quota, and validity days can change independently over time; storing all attributes in one row forces unnecessary versioning. In IoT manufacturing, a machine sensor may change calibration value, firmware version, and location at different times; separate temporal tables record each changing fact accurately. 6NF can increase join complexity, so it is chosen when history precision is more important than simple querying.

6NF is advanced and uncommon in GATE basics. For interviews, describe it as useful for temporal data where attributes change independently over time.

Code Example

12.Lossless Decomposition

Decomposition means splitting a relation into smaller relations. A decomposition must be lossless, meaning the original relation can be reconstructed by joining the decomposed tables without losing rows or creating spurious rows. A familiar example is splitting a vehicle registration table into Vehicle and Owner tables using RegistrationNumber and OwnerId; joining through the key should recover valid ownership facts. In ed-tech, splitting enrollment payments into Learner, CourseBatch, and PaymentReceipt tables must preserve which learner paid for which batch and receipt. Dependency preservation is a separate property: after decomposition, you should still be able to enforce original dependencies without joining too many tables. In theory exams, lossless join is mandatory; dependency preservation is preferred but BCNF may sacrifice it.

For binary decomposition of R into R1 and R2, the decomposition is lossless if the common attributes functionally determine all attributes of R1 or all attributes of R2.

Code Example

13.Controlled Denormalization

Denormalization is the deliberate introduction of redundancy after understanding the normalized design. It is not a substitute for normalization; it is a performance or reporting decision with synchronization rules. A familiar example is a food delivery app storing order_total in the order header even though it can be calculated from item rows, because the checkout and support screens need fast access. In analytics, a banking dashboard may store daily branch-level transaction totals instead of scanning millions of transaction rows for every report. Denormalization should be measured, documented, and protected with triggers, materialized views, batch jobs, or application-level consistency checks.

Do not call a messy, redundant schema denormalized. True denormalization starts from a correct normalized model and adds controlled redundancy for a proven read-performance need.

Code Example

Normalize by dependencies, not by table count. A schema with many tables can still be wrong, and a schema with fewer tables can be valid if every dependency is correctly represented.

Learning Path

Use this sequence to move from recognizing bad spreadsheet-style tables to solving advanced dependency questions confidently.

Frequently Asked Questions

What is normalization in DBMS with examples?

Normalization in DBMS is the process of organizing tables to reduce redundancy and prevent anomalies. For example, instead of storing customer city and state repeatedly in every order row, store pincode details in a separate postal table and reference it.

What are the types of normalization in DBMS?

The standard types are 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF. Advanced discussions also include DKNF and 6NF, while denormalization is a separate controlled performance technique rather than a normal form.

What is the difference between 3NF and BCNF?

3NF allows a functional dependency X → A if X is a superkey or A is a prime attribute. BCNF is stricter: X must be a superkey for every non-trivial dependency, so every BCNF relation is in 3NF, but every 3NF relation is not necessarily in BCNF.

How do you implement normalization in SQL?

Start by listing dependencies, identifying candidate keys, and finding repeating, partial, transitive, or multivalued dependencies. Then create separate tables with primary keys, foreign keys, unique constraints, check constraints, and junction tables where needed.

When should you use 3NF instead of BCNF?

Use 3NF when dependency preservation is important and the remaining redundancy is acceptable. BCNF gives stronger anomaly removal, but some BCNF decompositions make it harder to enforce original dependencies without joins.

Is normalization always good for performance?

Normalization is excellent for data integrity in OLTP systems, but it can increase joins for read-heavy reports. Performance-critical analytics systems often use controlled denormalization, summary tables, or star schemas after the normalized source model is understood.

What is the most common mistake in normalization?

The most common mistake is splitting tables based on nouns rather than dependencies. Normalization decisions should be based on which attributes determine which other attributes, whether the join is lossless, and whether dependencies can still be enforced.

Can a table be in 2NF but not in 3NF?

Yes. A table can remove partial dependencies and still contain transitive dependencies, such as CustomerId → Pincode and Pincode → City. That table is in 2NF but violates 3NF until the pincode details are separated.

Interview Preparation

Normalization questions test whether you can reason from dependencies rather than memorize definitions. State the normal form, identify the violating dependency, and propose a lossless decomposition with keys.

Conceptual Questions

Why is normalization needed in DBMS? It reduces redundant storage of the same fact and prevents update, insert, and delete anomalies. The goal is data integrity first, with performance tuned separately when needed.
What is a partial dependency? A partial dependency occurs when a non-prime attribute depends on only part of a composite candidate key. It violates 2NF and is fixed by moving the partially dependent attributes into another relation.
What is a transitive dependency? A transitive dependency occurs when a key determines a non-key attribute, and that non-key attribute determines another non-key attribute. It violates 3NF unless the dependent attribute is prime under the formal 3NF condition.
Why is BCNF stricter than 3NF? BCNF requires every determinant in a non-trivial functional dependency to be a superkey. 3NF permits a dependency when the right-side attribute is prime, so BCNF removes more anomaly cases.

Applied / Problem-Solving Questions

Given R(A, B, C) with AB as key and B → C, which normal form is violated? B → C is a partial dependency because C depends on part of the composite key AB. The relation violates 2NF and should be decomposed into R1(B, C) and R2(A, B).
Given R(A, B, C) with A → B and B → C, what issue exists? A determines C through B, so there is a transitive dependency. Decompose into R1(A, B) and R2(B, C) if B is the determinant for C.
How do you check whether a decomposition is lossless? For binary decomposition, check whether the common attributes functionally determine one of the decomposed relations. If the common attribute is a key in one side, the join is lossless.
When would you denormalize a normalized schema? Denormalize only after measuring read bottlenecks, such as dashboard queries repeatedly aggregating large transaction tables. Use materialized views, summary tables, or derived columns with refresh rules.

The most tested normalization question is: identify the highest normal form of a relation from functional dependencies. The standard answer requires finding candidate keys, prime attributes, partial dependencies, transitive dependencies, and BCNF violations in that order.

Key Takeaways

Normalization in DBMS with examples becomes practical when you track dependencies carefully: 1NF removes repeating groups, 2NF removes partial dependency, 3NF removes transitive dependency, BCNF requires every determinant to be a superkey, and 4NF or 5NF handles advanced multivalued and join dependencies.

For GATE and interviews, the most tested points are candidate key calculation using closure, highest normal form identification, 3NF versus BCNF, lossless join decomposition, dependency preservation, and recognizing when a multivalued dependency violates 4NF.

The natural next step is Second Normal Form in DBMS, because 2NF is the first place where dependency reasoning becomes essential for composite-key schemas.

Menu

Normalization in DBMS: Normal Forms, Dependencies & SQL Examples

Course Lessons

Structure of DBMS Explained: Components, Diagram & Architecture

Applications of DBMS

Databases Types

Learn about Attributes in DBMS

A Quick Guide to Entities in DBMS

What are Keys in DBMS?

Primary Key and Unique Key in DBMS: Differences

Cardinality in DBMS: Types, Ratio, Constraints & SQL Examples

Domain Constraints in DBMS: Types, SQL Examples & Interview Guide

Integrity Constraints in DBMS: Types, Examples & SQL Code

A Quick Guide to Relational Model in DBMS

ER (Entity Relationship Model) in DBMS

DDL and DML Commands in SQL: Differences, Syntax & Examples

Relational Algebra in DBMS

Functional Dependency in DBMS with examples

Normalization in DBMS: Normal Forms, Dependencies & SQL Examples

Second Normal Form in DBMS

File Organization in DBMS

Indexing in DBMS

INDEX in SQL

SELF JOIN IN SQL

Natural Join in DBMS SQL

ACID Properties in DBMS with examples

Understanding the Difference between Rollback and Commit

NoSQL Databases: What are They?

OLAP vs OLTP: Differences

Star Schema and Snowflake Schema in Data Warehousing

Deep Understanding of Distributed Database in DBMS

Deadlock in DBMS

Who This Guide Is For

Core Concepts

1.Dependencies and Keys

Code Example

2.Anomalies and Redundancy

Code Example

3.Unnormalized Form

Code Example

4.First Normal Form

Code Example

5.Second Normal Form

Code Example

6.Third Normal Form

Code Example

7.Boyce-Codd Normal Form

Code Example

8.Fourth Normal Form

Code Example

9.Fifth Normal Form

Code Example

10.Domain-Key Normal Form

Code Example

11.Sixth Normal Form

Code Example

12.Lossless Decomposition

Code Example

13.Controlled Denormalization

Code Example

Learning Path

Frequently Asked Questions

What is normalization in DBMS with examples?

What are the types of normalization in DBMS?

What is the difference between 3NF and BCNF?

How do you implement normalization in SQL?

When should you use 3NF instead of BCNF?

Is normalization always good for performance?

What is the most common mistake in normalization?

Can a table be in 2NF but not in 3NF?

Interview Preparation

Conceptual Questions

Applied / Problem-Solving Questions

Key Takeaways

Mark Lesson Complete (Normalization in DBMS: Normal Forms, Dependencies & SQL Examples)