Relational Database Cheatsheet

Shared 1/16/2026•555 views

/ 1

Cheatsheet Content

### Functional Dependencies (FDs) A functional dependency $X \to Y$ between two sets of attributes $X$ and $Y$ in a relation $R$ means that for any two tuples in $R$, if they agree on the values of all attributes in $X$, then they must also agree on the values of all attributes in $Y$. - $X$ is the **determinant**, $Y$ is the **dependent**. - If $X \to Y$ and $Y \to X$, then $X \leftrightarrow Y$. #### Normal Forms Normal forms are rules for database design that prevent anomalies (insertion, deletion, update) and data redundancy. ##### First Normal Form (1NF) A relation is in 1NF if and only if all attribute values are atomic (indivisible) and single-valued. - No multi-valued attributes. - No composite attributes that are not broken down. - Each column contains values of the same type. ##### Second Normal Form (2NF) A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key. - No partial dependencies: A non-key attribute cannot be dependent on only a part of a composite primary key. - **Example:** `(StudentID, CourseID) -> Grade`, `StudentID -> StudentName`. Here, `StudentName` is partially dependent on `StudentID` (part of the composite key). To achieve 2NF, decompose into `(StudentID, CourseID, Grade)` and `(StudentID, StudentName)`. ##### Third Normal Form (3NF) A relation is in 3NF if it is in 2NF and there are no transitive dependencies. - No transitive dependencies: A non-key attribute cannot be dependent on another non-key attribute. If $A \to B$ and $B \to C$, then $A \to C$ is a transitive dependency. 3NF states that if $A \to B$ and $B \to C$, then either $B$ must be a superkey or $C$ must be a prime attribute (part of *some* candidate key). - **Example:** `StudentID -> DepartmentID`, `DepartmentID -> DepartmentName`. Here, `DepartmentName` is transitively dependent on `StudentID` via `DepartmentID`. To achieve 3NF, decompose into `(StudentID, DepartmentID)` and `(DepartmentID, DepartmentName)`. ### Armstrong’s Axioms & Canonical Cover #### Armstrong’s Axioms A set of inference rules used to infer all FDs that logically follow from a given set of FDs (called a cover). 1. **Reflexivity:** If $Y \subseteq X$, then $X \to Y$. (If $Y$ is a subset of $X$, then $X$ determines $Y$.) 2. **Augmentation:** If $X \to Y$, then $XZ \to YZ$ (for any $Z$). (Adding attributes to the determinant does not invalidate the dependency.) 3. **Transitivity:** If $X \to Y$ and $Y \to Z$, then $X \to Z$. (If $X$ determines $Y$ and $Y$ determines $Z$, then $X$ determines $Z$.) **Derived Rules:** - **Decomposition:** If $X \to YZ$, then $X \to Y$ and $X \to Z$. - **Union:** If $X \to Y$ and $X \to Z$, then $X \to YZ$. - **Pseudotransitivity:** If $X \to Y$ and $YW \to Z$, then $XW \to Z$. #### Closure of a Set of Attributes ($X^+$) The set of all attributes that are functionally determined by $X$, given a set of FDs $F$. - Used to find candidate keys and check dependencies. #### Canonical Cover ($F_c$) A minimal set of functional dependencies that is equivalent to a given set of FDs $F$. - No redundant FDs: If an FD $X \to A$ is in $F_c$, it cannot be inferred from $F_c - \{X \to A\}$. - No redundant attributes in the left-hand side (determinant) of any FD: If $XY \to A$ is in $F_c$, $X \to A$ cannot be inferred from $F_c$. - All FDs in $F_c$ have a single attribute on the right-hand side. **Algorithm to find Canonical Cover:** 1. **Split right-hand sides:** Replace $X \to YZ$ with $X \to Y$ and $X \to Z$. 2. **Remove redundant FDs:** For each $FD_i: X \to A$ in $F$, check if $A \in (X)^+_{F - \{FD_i\}}$. If yes, remove $FD_i$. 3. **Remove redundant attributes from left-hand sides:** For each $FD_i: X \to A$ in $F$ and for each attribute $B \in X$, check if $A \in (X - \{B\})^+_F$. If yes, replace $X \to A$ with $(X - \{B\}) \to A$. ### Decomposition and its Properties Decomposition is the process of breaking down a relation into smaller relations to eliminate redundancy and anomalies. #### Desirable Properties of Decomposition: 1. **Lossless-join Decomposition:** - Ensures that no spurious (extra) tuples are generated when the decomposed relations are joined back together. - If a relation $R$ is decomposed into $R_1$ and $R_2$, the decomposition is lossless if $R = R_1 \bowtie R_2$. - **Condition:** The decomposition of $R(A, B, C)$ into $R_1(A, B)$ and $R_2(A, C)$ is lossless-join if $A \to B$ or $A \to C$ holds (where $A$ is the common attribute). - More generally, for a decomposition of $R$ into $R_1, ..., R_n$, it's lossless if applying the natural join operation on all $R_i$ reconstructs the original relation $R$. 2. **Dependency-preserving Decomposition:** - Ensures that all functional dependencies of the original relation can be enforced by simply enforcing the functional dependencies in the decomposed relations. - No need to perform joins to check dependencies. - If $F$ is the set of FDs for $R$, and $F_i$ is the set of FDs for $R_i$, then the union of $F_i$ should be equivalent to $F$. More precisely, $(F_1 \cup F_2 \cup ... \cup F_n)^+ = F^+$. - Achieving both lossless-join and dependency-preserving decomposition simultaneously is often a goal for 3NF. ### Boyce–Codd Normal Form (BCNF) A relation is in BCNF if and only if for every non-trivial functional dependency $X \to Y$, $X$ is a superkey of the relation. - A superkey is any set of attributes that uniquely identifies a tuple. - BCNF is a stricter form of 3NF. If a relation is in BCNF, it is also in 3NF. - **Difference from 3NF:** 3NF allows $X \to Y$ if $Y$ is a prime attribute (part of *some* candidate key), even if $X$ is not a superkey. BCNF does not allow this exception. - **Problem with BCNF:** Sometimes, achieving BCNF might lead to a lossy decomposition or a decomposition that is not dependency-preserving. If dependency preservation is critical, 3NF might be preferred. ### Fourth and Fifth Normal Form ##### Fourth Normal Form (4NF) A relation is in 4NF if and only if it is in BCNF and contains no multi-valued dependencies (MVDs). - **Multi-valued Dependency (MVD):** $A \twoheadrightarrow B$ means that for a given value of $A$, there is a set of values for $B$, and this set is independent of the values of other attributes. - MVDs arise when two or more independent multi-valued facts about an entity are stored in the same relation. - **Decomposition for 4NF:** If $A \twoheadrightarrow B$ is a non-trivial MVD, decompose the relation into two relations: one containing $A$ and $B$, and another containing $A$ and the remaining attributes. ##### Fifth Normal Form (5NF) / Project-Join Normal Form (PJNF) A relation is in 5NF if and only if it is in 4NF and contains no join dependencies that are not implied by the candidate keys. - **Join Dependency (JD):** A relation $R$ satisfies a join dependency $JD(R_1, R_2, ..., R_n)$ if $R = R_1 \bowtie R_2 \bowtie ... \bowtie R_n$. - 5NF deals with cases where a relation can be decomposed into smaller relations and then rejoined without loss of information, even though no MVDs or FDs are violated. This usually happens with complex many-to-many relationships. - Most database designs do not go beyond 3NF or BCNF, as 4NF and 5NF are rarely necessary and can introduce too many tables, increasing query complexity.

Related Cheatsheets

Create Your Own AI Cheatsheet

Generate comprehensive study cheatsheets from your notes, textbooks, or lecture materials using AI.

Relational Database Cheatsheet

Related Cheatsheets

PHP Cheatsheet

Linear Algebra Cheatsheet

BJT & PN Junctions Cheatsheet (continued)

Integration Cheatsheet

General Cheatsheet

DBMS & SQL Cheatsheet

Create Your Own AI Cheatsheet