What Normalization Achieves
Normalization is a technique for producing a set of relations with desirable properties given the data requirements of an enterprise. A well-normalized set of relations uses the minimal number of attributes necessary to support those requirements, groups attributes with close logical relationships together, and minimizes redundancy so each attribute appears only once — except for foreign keys, which must be repeated to link relations together.
Functional Dependencies
Functional dependencies describe the relationship between attributes. When a functional dependency is present, it acts as a constraint: if you know the value of one attribute or set of attributes, you can determine the value of another. A determinant is the attribute or group of attributes on the left-hand side of the dependency arrow — the one doing the determining.
Full Functional Dependency
If A and B are attributes of a relation, B is fully functionally dependent on A if B depends on A but not on any proper subset of A. In other words, you need all of A to determine B — removing any part of A breaks the dependency. This concept becomes critical when dealing with composite keys.
First Normal Form
A relation is in first normal form when the intersection of each row and column contains one and only one atomic value. No repeating groups, no nested tables, no arrays — just single values in every cell. This is the foundation for all higher normal forms.
Second Normal Form
A relation is in second normal form if it is in first normal form and every non-primary-key attribute is fully functionally dependent on the primary key. This eliminates partial dependencies, where an attribute depends on only part of a composite key rather than the whole thing. The general definition extends this: every non-candidate-key attribute must be fully functionally dependent on any candidate key, where a candidate-key attribute is part of any candidate key.
Third Normal Form
A relation is in third normal form if it is in first and second normal form and no non-primary-key attribute is transitively dependent on the primary key. A transitive dependency exists when A determines B, B determines C, and therefore C is indirectly determined by A through B — provided A is not functionally dependent on B or C. Third normal form eliminates these indirect dependencies by breaking them into separate relations.
The general definition refines this further: no non-candidate-key attribute can be transitively dependent on any candidate key, where a candidate-key attribute is part of any candidate key. This ensures that dependencies flow directly from keys to non-key attributes, with no intermediate steps that could introduce update anomalies.