Lesson 05: Functional Dependencies

Lesson 05: Functional Dependencies

Previous: 04_ER_Modeling.md | Next: 06_Normalization.md


Topic: Database Theory Lesson: 5 of 16 Prerequisites: Relational model concepts (relations, tuples, attributes, keys), basic set theory Objective: Understand functional dependencies as the formal foundation for database normalization, master Armstrong's axioms, compute attribute closures, derive candidate keys, and compute minimal covers

1. Introduction

Functional dependencies (FDs) are the most important concept in relational database design theory. They formalize the notion of "one attribute uniquely determines another" and provide the mathematical foundation for normalization β€” the process of organizing a database to reduce redundancy and prevent anomalies.

Before we had functional dependencies, database designers relied on intuition and experience. FDs gave us a rigorous framework to reason about what constitutes a "good" schema, and algorithms to systematically achieve it.

1.1 Why Functional Dependencies Matter

Consider a university database with a single relation:

StudentCourse(student_id, student_name, dept_name, course_id, course_title, grade)

Intuitively, we know: - A student_id uniquely identifies a student_name and dept_name - A course_id uniquely identifies a course_title - The combination (student_id, course_id) uniquely identifies the grade

These intuitive observations are precisely functional dependencies. They allow us to:

  1. Identify redundancy: If student_name appears in every row for the same student_id, we have redundancy
  2. Detect anomalies: Updating a student's department in one row but not others creates inconsistency
  3. Guide decomposition: FDs tell us exactly how to split a relation into smaller, well-structured pieces
  4. Verify key constraints: FDs formally define what it means for an attribute set to be a key

2. Formal Definition

2.1 Functional Dependency

Let R be a relation schema with attribute set U, and let X, Y βŠ† U.

Definition: A functional dependency X β†’ Y holds on R if and only if for every legal instance r of R, whenever two tuples t₁ and tβ‚‚ agree on the attributes in X, they must also agree on the attributes in Y. Formally:

X β†’ Y ⟺ βˆ€ t₁, tβ‚‚ ∈ r : t₁[X] = tβ‚‚[X] ⟹ t₁[Y] = tβ‚‚[Y]

Here: - X is called the determinant (or left-hand side, LHS) - Y is called the dependent (or right-hand side, RHS) - We read X β†’ Y as "X functionally determines Y" or "Y is functionally dependent on X"

2.2 Trivial and Non-Trivial FDs

Definition: A functional dependency X β†’ Y is trivial if Y βŠ† X.

Examples: - {student_id, course_id} β†’ {student_id} is trivial - {student_id} β†’ {student_id} is trivial - {student_id} β†’ {student_name} is non-trivial (student_name βˆ‰ {student_id})

Definition: A functional dependency X β†’ Y is completely non-trivial if X ∩ Y = βˆ….

2.3 Examples

Consider the relation schema:

Employee(emp_id, emp_name, dept_id, dept_name, salary, manager_id)

Typical functional dependencies: - emp_id β†’ emp_name, dept_id, salary, manager_id (employee ID determines all employee attributes) - dept_id β†’ dept_name (department ID determines department name) - emp_id β†’ dept_name (transitively, through dept_id)

Important: FDs are semantic constraints. They are determined by the meaning of the data in the real world, not by examining a particular instance. Looking at a snapshot of data can only disprove an FD (by finding a counterexample), never prove one holds in general.

2.4 FDs vs. Keys

There is a deep connection between functional dependencies and keys:

Definition: A set of attributes K is a superkey of relation R if K β†’ U, where U is the set of all attributes of R.

Definition: A set of attributes K is a candidate key of R if: 1. K β†’ U (K is a superkey), and 2. No proper subset of K functionally determines U (K is minimal)

Thus, keys are simply functional dependencies whose right-hand side is the entire attribute set, with a minimality condition.


3. Armstrong's Axioms

In 1974, William W. Armstrong proposed a set of inference rules for functional dependencies. These axioms are sound (they only derive correct FDs) and complete (they can derive all FDs that logically follow from a given set).

3.1 The Three Axioms

Let F be a set of functional dependencies on relation schema R, and let X, Y, Z βŠ† attributes(R).

Axiom 1: Reflexivity

If Y βŠ† X, then X β†’ Y.

This axiom generates all trivial FDs. For example: - {A, B, C} β†’ {A, B} - {A, B, C} β†’ {A} - {A} β†’ {A}

Proof of soundness: If Y βŠ† X and t₁[X] = tβ‚‚[X], then since Y βŠ† X, we have t₁[Y] = tβ‚‚[Y]. ∎

Axiom 2: Augmentation

If X β†’ Y, then XZ β†’ YZ for any Z.

We can "augment" both sides of an FD with any set of attributes. For example: - If A β†’ B, then AC β†’ BC - If AB β†’ C, then ABD β†’ CD

Proof of soundness: Suppose X β†’ Y and t₁[XZ] = tβ‚‚[XZ]. Then t₁[X] = tβ‚‚[X] and t₁[Z] = tβ‚‚[Z]. Since X β†’ Y, we have t₁[Y] = tβ‚‚[Y]. Combining: t₁[YZ] = tβ‚‚[YZ]. ∎

Axiom 3: Transitivity

If X β†’ Y and Y β†’ Z, then X β†’ Z.

This allows chaining of functional dependencies. For example: - If student_id β†’ dept_id and dept_id β†’ dept_name, then student_id β†’ dept_name

Proof of soundness: Suppose X β†’ Y and Y β†’ Z. Let t₁[X] = tβ‚‚[X]. By X β†’ Y, t₁[Y] = tβ‚‚[Y]. By Y β†’ Z, t₁[Z] = tβ‚‚[Z]. Therefore X β†’ Z. ∎

3.2 Soundness and Completeness

Theorem (Armstrong, 1974): Armstrong's axioms are sound and complete. - Soundness: Every FD derivable from F using these axioms is logically implied by F. - Completeness: Every FD logically implied by F can be derived using these axioms.

The completeness proof is non-trivial. The key idea is to show that if X β†’ Y cannot be derived from F using Armstrong's axioms, then there exists a two-tuple relation that satisfies F but violates X β†’ Y.


4. Derived Inference Rules

From Armstrong's three axioms, we can derive several useful additional rules.

4.1 Union Rule

If X β†’ Y and X β†’ Z, then X β†’ YZ.

Proof: 1. X β†’ Y (given) 2. X β†’ Z (given) 3. X β†’ XY (augment step 1 with X; since XX = X, we get X β†’ XY) 4. XY β†’ YZ (augment step 2 with Y) 5. X β†’ YZ (transitivity on steps 3 and 4) ∎

4.2 Decomposition Rule

If X β†’ YZ, then X β†’ Y and X β†’ Z.

Proof: 1. X β†’ YZ (given) 2. YZ β†’ Y (reflexivity, since Y βŠ† YZ) 3. X β†’ Y (transitivity on steps 1 and 2) 4. YZ β†’ Z (reflexivity, since Z βŠ† YZ) 5. X β†’ Z (transitivity on steps 1 and 4) ∎

4.3 Pseudotransitivity

If X β†’ Y and WY β†’ Z, then WX β†’ Z.

Proof: 1. X β†’ Y (given) 2. WX β†’ WY (augment step 1 with W) 3. WY β†’ Z (given) 4. WX β†’ Z (transitivity on steps 2 and 3) ∎

4.4 Self-Determination

X β†’ X for any X.

This follows directly from reflexivity (X βŠ† X).

4.5 Accumulation

If X β†’ YZ and Z β†’ BW, then X β†’ YBW.

Proof: 1. X β†’ YZ (given) 2. X β†’ Z (decomposition of step 1) 3. Z β†’ BW (given) 4. X β†’ BW (transitivity on steps 2 and 3) 5. X β†’ Y (decomposition of step 1) 6. X β†’ YBW (union of steps 4 and 5) ∎

4.6 Summary of Rules

Rule Statement Derived From
Reflexivity Y βŠ† X ⟹ X β†’ Y Axiom
Augmentation X β†’ Y ⟹ XZ β†’ YZ Axiom
Transitivity X β†’ Y, Y β†’ Z ⟹ X β†’ Z Axiom
Union X β†’ Y, X β†’ Z ⟹ X β†’ YZ Aug + Trans
Decomposition X β†’ YZ ⟹ X β†’ Y, X β†’ Z Refl + Trans
Pseudotransitivity X β†’ Y, WY β†’ Z ⟹ WX β†’ Z Aug + Trans

5. Closure of an Attribute Set

Computing the closure of an attribute set is the most fundamental algorithm in FD theory. It answers the question: "Given a set of FDs F, what attributes are functionally determined by a given set X?"

5.1 Definition

Definition: The closure of an attribute set X under a set of functional dependencies F, denoted X⁺ (or X⁺_F), is the set of all attributes A such that X β†’ A can be inferred from F using Armstrong's axioms.

X⁺ = { A ∈ U | F ⊨ X β†’ A }

5.2 Algorithm

The following algorithm computes X⁺ efficiently:

ALGORITHM: ComputeClosure(X, F)
INPUT:  X = a set of attributes
        F = a set of functional dependencies
OUTPUT: X⁺ = the closure of X under F

1.  result ← X
2.  REPEAT
3.      old_result ← result
4.      FOR EACH dependency (V β†’ W) IN F DO
5.          IF V βŠ† result THEN
6.              result ← result βˆͺ W
7.          END IF
8.      END FOR
9.  UNTIL result = old_result
10. RETURN result

Time complexity: O(|F| Γ— |U|) in the worst case, where |F| is the number of FDs and |U| is the number of attributes.

5.3 Worked Example 1

Let R(A, B, C, D, E, F) with the following FDs:

F = { A β†’ BC,  CD β†’ E,  B β†’ D,  E β†’ A }

Compute {A}⁺:

Iteration result Applied FD New Attributes
Init {A} β€” β€”
1 {A, B, C} A β†’ BC B, C
2 {A, B, C, D} B β†’ D D
3 {A, B, C, D, E} CD β†’ E E
4 {A, B, C, D, E} E β†’ A (A already in result) β€”

Since no new attributes were added in iteration 4, we stop.

{A}⁺ = {A, B, C, D, E}

Note: F is not in the closure, so A is not a superkey of R (it doesn't determine F).

Compute {A, F}⁺:

Starting with {A, F}, after the same iterations plus F already present:

{A, F}⁺ = {A, B, C, D, E, F} = all attributes of R.

Therefore, {A, F} is a superkey.

5.4 Worked Example 2

Let R(A, B, C, D, E) with:

F = { AB β†’ C,  C β†’ D,  D β†’ E,  E β†’ A }

Compute {B, C}⁺:

Iteration result Applied FD New Attributes
Init {B, C} β€” β€”
1 {B, C, D} C β†’ D D
2 {B, C, D, E} D β†’ E E
3 {A, B, C, D, E} E β†’ A A
4 {A, B, C, D, E} AB β†’ C (C already present) β€”

{B, C}⁺ = {A, B, C, D, E} β€” so {B, C} is a superkey.

5.5 Uses of Attribute Closure

The closure algorithm has three main applications:

  1. Testing if X β†’ Y holds: Compute X⁺. If Y βŠ† X⁺, then X β†’ Y holds.

  2. Testing if X is a superkey: Compute X⁺. If X⁺ = U (all attributes), then X is a superkey.

  3. Computing F⁺ (the closure of F): For each subset X βŠ† U, compute X⁺ and output X β†’ Y for each Y βŠ† X⁺. (This is exponential and rarely practical, but it's theoretically important.)


6. Closure of a Set of FDs

6.1 Definition

Definition: The closure of a set of functional dependencies F, denoted F⁺, is the set of all functional dependencies that can be logically inferred from F.

F⁺ = { X β†’ Y | F ⊨ X β†’ Y }

F⁺ can be extremely large. For a relation with n attributes, there are 2ⁿ possible subsets for X and 2ⁿ for Y, giving up to 2²ⁿ possible FDs. In practice, we almost never compute F⁺ directly; we use the attribute closure algorithm instead.

6.2 Equivalence of FD Sets

Definition: Two sets of FDs F and G are equivalent (denoted F ≑ G) if F⁺ = G⁺.

To test whether F ≑ G: 1. Check if every FD in G can be derived from F: for each X β†’ Y in G, verify Y βŠ† X⁺_F 2. Check if every FD in F can be derived from G: for each X β†’ Y in F, verify Y βŠ† X⁺_G

If both checks pass, F ≑ G.

6.3 Example

F = { A β†’ B, B β†’ C }
G = { A β†’ B, B β†’ C, A β†’ C }

F ≑ G because A β†’ C is derivable from F by transitivity, and all FDs in F are trivially in G.


7. Minimal Cover (Canonical Cover)

A minimal cover is a "simplified" version of a set of FDs β€” it removes redundancy while preserving the same logical content. This is essential for normalization algorithms.

7.1 Definition

Definition: A set of FDs F_min is a minimal cover (or canonical cover) of F if: 1. Equivalence: F_min ≑ F (same closure) 2. Single attribute on RHS: Every FD in F_min has the form X β†’ A where A is a single attribute 3. No redundant FDs: Removing any FD from F_min changes the closure 4. No extraneous attributes on LHS: For each FD X β†’ A in F_min, no proper subset of X functionally determines A under F_min

7.2 Algorithm

ALGORITHM: MinimalCover(F)
INPUT:  F = a set of functional dependencies
OUTPUT: F_min = a minimal cover of F

Step 1: DECOMPOSE right-hand sides
    Replace each FD X β†’ {A₁, Aβ‚‚, ..., Aβ‚™} with
    X β†’ A₁, X β†’ Aβ‚‚, ..., X β†’ Aβ‚™

Step 2: REMOVE extraneous attributes from left-hand sides
    FOR EACH FD (X β†’ A) IN F where |X| > 1 DO
        FOR EACH attribute B IN X DO
            IF A ∈ closure(X - {B}, F) THEN
                Replace (X β†’ A) with ((X - {B}) β†’ A)
            END IF
        END FOR
    END FOR

Step 3: REMOVE redundant FDs
    FOR EACH FD (X β†’ A) IN F DO
        IF A ∈ closure(X, F - {X β†’ A}) THEN
            Remove (X β†’ A) from F
        END IF
    END FOR

RETURN F

Important: Step 2 must come before Step 3. If we remove redundant FDs first, some attributes that were extraneous might no longer appear to be extraneous.

7.3 Worked Example

Let R(A, B, C, D) with:

F = { A β†’ BC,  B β†’ C,  AB β†’ D,  D β†’ A }

Step 1: Decompose RHS

F = { A β†’ B,  A β†’ C,  B β†’ C,  AB β†’ D,  D β†’ A }

Step 2: Remove extraneous LHS attributes

Check AB β†’ D: Can we remove A or B? - Check if B alone determines D: Compute {B}⁺ under current F: - {B} β†’ {B, C} (via B β†’ C) β€” no more. D βˆ‰ {B}⁺. So B alone doesn't work; keep A. - Check if A alone determines D: Compute {A}⁺ under current F: - {A} β†’ {A, B, C} (via A β†’ B, A β†’ C) β†’ {A, B, C, D} (via AB β†’ D, since A and B are both present) - D ∈ {A}⁺. So A is extraneous in AB β†’ D!

Replace AB β†’ D with A β†’ D:

F = { A β†’ B,  A β†’ C,  B β†’ C,  A β†’ D,  D β†’ A }

Step 3: Remove redundant FDs

Check A β†’ B: Remove it. Compute {A}⁺ under F - {A β†’ B}: - F - {A β†’ B} = { A β†’ C, B β†’ C, A β†’ D, D β†’ A } - {A}⁺ = {A, C, D} (via A β†’ C, A β†’ D, D β†’ A). B βˆ‰ {A}⁺. So A β†’ B is NOT redundant. Keep it.

Check A β†’ C: Remove it. Compute {A}⁺ under F - {A β†’ C}: - {A}⁺: A β†’ B gives B, B β†’ C gives C. C ∈ {A}⁺. So A β†’ C IS redundant. Remove it.

F = { A β†’ B,  B β†’ C,  A β†’ D,  D β†’ A }

Check B β†’ C: Remove it. Compute {B}⁺ under F - {B β†’ C}: - {B}⁺ = {B}. C βˆ‰ {B}⁺. Not redundant. Keep it.

Check A β†’ D: Remove it. Compute {A}⁺ under F - {A β†’ D}: - {A}⁺ = {A, B, C}. D βˆ‰ {A}⁺. Not redundant. Keep it.

Check D β†’ A: Remove it. Compute {D}⁺ under F - {D β†’ A}: - {D}⁺ = {D}. A βˆ‰ {D}⁺. Not redundant. Keep it.

Minimal cover:

F_min = { A β†’ B,  B β†’ C,  A β†’ D,  D β†’ A }

7.4 Non-Uniqueness of Minimal Covers

A minimal cover is not unique. Different orderings in Step 2 or Step 3 can produce different (but equivalent) minimal covers. For example, in Step 2, checking attributes in different orders can lead to different simplifications.


8. Finding Candidate Keys Using FDs

8.1 Attribute Classification

To find candidate keys efficiently, we first classify attributes:

Category Definition Role in Keys
L-only Appears only on the LHS of FDs (never on RHS) Must be in every key
R-only Appears only on the RHS (never on LHS) Never in any key
Both Appears on both LHS and RHS May or may not be in a key
Neither Appears in no FD at all Must be in every key

8.2 Algorithm for Finding Candidate Keys

ALGORITHM: FindCandidateKeys(R, F)
INPUT:  R = relation schema, F = set of FDs
OUTPUT: Set of all candidate keys

Step 1: Classify attributes into L-only, R-only, Both, Neither.
Step 2: Let CORE = L-only βˆͺ Neither.
        (CORE must be part of every candidate key.)
Step 3: Compute CORE⁺.
        If CORE⁺ = all attributes, then CORE is the only candidate key. DONE.
Step 4: Otherwise, try adding subsets of "Both" attributes to CORE.
        Start with single attributes, then pairs, etc.
        For each subset S of "Both":
            Compute (CORE βˆͺ S)⁺
            If it equals all attributes and no proper subset of (CORE βˆͺ S)
            containing CORE is also a superkey, then CORE βˆͺ S is a candidate key.

8.3 Worked Example

Let R(A, B, C, D, E, F) with:

F = { AB β†’ C,  C β†’ D,  D β†’ E,  CF β†’ B }

Step 1: Classify attributes

Attribute LHS? RHS? Category
A Yes (AB→C) No L-only
B Yes (AB→C) Yes (CF→B) Both
C Yes (C→D, CF→B) Yes (AB→C) Both
D Yes (D→E) Yes (C→D) Both
E No Yes (D→E) R-only
F Yes (CF→B) No L-only

Step 2: CORE = {A, F} (L-only attributes; no "Neither" attributes)

Step 3: Compute {A, F}⁺

  • Start: {A, F}
  • No FD has LHS βŠ† {A, F} (AB needs B, Cβ†’D needs C, etc.)
  • {A, F}⁺ = {A, F} β‰  all attributes

Step 4: Try adding "Both" attributes (B, C, D)

Try adding B: {A, B, F}⁺ - AB β†’ C: {A, B, F, C} - C β†’ D: {A, B, C, D, F} - D β†’ E: {A, B, C, D, E, F} = all attributes! - {A, B, F} is a superkey. Check minimality: CORE = {A, F} alone doesn't work. So {A, B, F} is a candidate key.

Try adding C: {A, C, F}⁺ - CF β†’ B: {A, B, C, F} - AB β†’ C: already have C - C β†’ D: {A, B, C, D, F} - D β†’ E: {A, B, C, D, E, F} = all attributes! - {A, C, F} is a superkey. Check minimality: {A, F} alone doesn't work. So {A, C, F} is a candidate key.

Try adding D: {A, D, F}⁺ - D β†’ E: {A, D, E, F} - No more applicable FDs. {A, D, E, F} β‰  all attributes. Not a superkey.

Candidate keys: {A, B, F} and {A, C, F}

We don't need to check pairs (like {B, C}) since we already found candidate keys with single additions and they are minimal.


9. Entailment and Implication

9.1 Logical Implication

Definition: A set of FDs F logically implies an FD X β†’ Y (written F ⊨ X β†’ Y) if every relation instance that satisfies all FDs in F also satisfies X β†’ Y.

9.2 Testing Implication

To test if F ⊨ X β†’ Y: 1. Compute X⁺ under F 2. If Y βŠ† X⁺, then F ⊨ X β†’ Y

This is the practical workhorse: instead of reasoning through chains of axiom applications, just run the closure algorithm.

9.3 Example

Given F = { A β†’ B, B β†’ C, CD β†’ E }:

Does F ⊨ AD β†’ E?

Compute {A, D}⁺: - A β†’ B: {A, B, D} - B β†’ C: {A, B, C, D} - CD β†’ E: {A, B, C, D, E}

Since E ∈ {A, D}⁺, yes, F ⊨ AD β†’ E. βœ“

Does F ⊨ A β†’ E?

Compute {A}⁺: - A β†’ B: {A, B} - B β†’ C: {A, B, C} - No more applicable FDs.

E βˆ‰ {A}⁺ = {A, B, C}. So F ⊭ A β†’ E. βœ—


10. FDs in Practice

10.1 Identifying FDs from Requirements

Real-world FDs come from business rules and domain knowledge:

Business Rule Functional Dependency
"Each employee has exactly one department" emp_id β†’ dept_id
"Each department has exactly one name" dept_id β†’ dept_name
"Each student gets one grade per course" {student_id, course_id} β†’ grade
"Each ISBN identifies one book title" isbn β†’ title
"A flight on a given date has one pilot" {flight_num, date} β†’ pilot_id

10.2 FDs and NULL Values

Standard FD theory assumes no NULL values. In practice: - NULLs complicate FD reasoning (NULL β‰  NULL in SQL) - SQL's UNIQUE constraint allows multiple NULLs (except for PRIMARY KEY) - Some database systems offer "NULLS NOT DISTINCT" to treat NULLs as equal for uniqueness checks

10.3 Discovering FDs from Data

While FDs are semantic constraints (determined by domain knowledge, not data), there are algorithms for discovering approximate FDs from data:

  • TANE algorithm: Discovers all FDs holding in a dataset
  • FUN algorithm: Uses lattice-based search
  • FDTool: Practical tool for FD discovery

These are useful for reverse-engineering poorly documented databases, but the discovered FDs should always be validated against domain knowledge.

10.4 FDs in SQL

SQL doesn't have a direct FUNCTIONAL DEPENDENCY constraint, but FDs are enforced through:

-- Primary key enforces: emp_id β†’ all other attributes
CREATE TABLE Employee (
    emp_id    INT PRIMARY KEY,
    emp_name  VARCHAR(100),
    dept_id   INT,
    salary    DECIMAL(10,2)
);

-- UNIQUE constraint enforces: email β†’ (implicitly all attributes, if it's a key)
ALTER TABLE Employee ADD CONSTRAINT uq_email UNIQUE(email);

-- The FD dept_id β†’ dept_name is enforced by having a separate Departments table
-- with dept_id as its primary key
CREATE TABLE Department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(100)
);

11. Common Pitfalls

11.1 Confusing FDs with Data Patterns

A common mistake is looking at data and concluding an FD exists:

| city        | state |
|-------------|-------|
| Springfield | IL    |
| Portland    | OR    |
| Austin      | TX    |

This data is consistent with city β†’ state, but in reality, many cities share names across states (Springfield exists in over 30 states). The FD city β†’ state does not hold.

11.2 Direction Matters

X β†’ Y does not imply Y β†’ X.

  • dept_id β†’ dept_name (a department has one name) βœ“
  • dept_name β†’ dept_id (a name identifies one department) β€” depends on whether names are unique!

11.3 FD on a Single Attribute

An FD like {} β†’ A (the empty set determines A) means A has the same value in every tuple. This is a rare but valid FD (e.g., a table where all employees are in the same company: {} β†’ company_name).

11.4 Order of Operations in Minimal Cover

Step 2 (remove extraneous LHS attributes) must precede Step 3 (remove redundant FDs). Reversing the order can produce incorrect results.


12. Proofs and Theory

12.1 Proving Completeness of Armstrong's Axioms (Sketch)

Claim: If F ⊭ X β†’ Y using Armstrong's axioms, then there exists a relation instance satisfying F but violating X β†’ Y.

Proof sketch: Construct a two-tuple relation r = {t₁, tβ‚‚} where: - t₁[A] = tβ‚‚[A] = 1 for all A ∈ X⁺ - t₁[A] = 1, tβ‚‚[A] = 0 for all A βˆ‰ X⁺

We need to verify: 1. r satisfies every FD in F: For any V β†’ W in F, if t₁[V] = tβ‚‚[V], then V βŠ† X⁺, so W βŠ† X⁺ (by the closure algorithm), so t₁[W] = tβ‚‚[W]. βœ“ 2. r violates X β†’ Y (assuming Y βŠ„ X⁺): Since t₁[X] = tβ‚‚[X] (all 1's) but there exists some A ∈ Y with A βˆ‰ X⁺, so t₁[A] β‰  tβ‚‚[A]. βœ“

This contradicts the assumption that X β†’ Y is logically implied by F, proving completeness. ∎

12.2 Complexity Results

Problem Complexity
Computing X⁺ O(|F| Γ— |U|) β€” polynomial
Testing if F ⊨ X β†’ Y O(|F| Γ— |U|) β€” polynomial
Computing F⁺ Exponential (can be 2^(2n))
Finding all candidate keys NP-complete in general
Computing minimal cover Polynomial
Testing if X is a superkey O(|F| Γ— |U|) β€” polynomial

13. Exercises

Exercise 1: Attribute Closure

Let R(A, B, C, D, E) with F = { AB β†’ C, C β†’ D, BD β†’ E, A β†’ B }.

Compute the following closures: 1. {A}⁺ 2. {B, C}⁺ 3. {A, D}⁺ 4. {C, D}⁺

Solution 1. **{A}⁺**: A β†’ B gives {A, B}; AB β†’ C gives {A, B, C}; C β†’ D gives {A, B, C, D}; BD β†’ E gives {A, B, C, D, E}. **{A}⁺ = {A, B, C, D, E}** 2. **{B, C}⁺**: C β†’ D gives {B, C, D}; BD β†’ E gives {B, C, D, E}. No more. **{B, C}⁺ = {B, C, D, E}** 3. **{A, D}⁺**: A β†’ B gives {A, B, D}; AB β†’ C gives {A, B, C, D}; BD β†’ E gives {A, B, C, D, E}. **{A, D}⁺ = {A, B, C, D, E}** 4. **{C, D}⁺**: C β†’ D (already have D). No other FD has LHS βŠ† {C, D}. **{C, D}⁺ = {C, D}**

Exercise 2: Finding Candidate Keys

For the relation and FDs in Exercise 1, find all candidate keys.

Solution Classify attributes: - A: LHS only (in ABβ†’C, Aβ†’B) β†’ L-only - B: Both (LHS in ABβ†’C, BDβ†’E; RHS in Aβ†’B) - C: Both (LHS in Cβ†’D; RHS in ABβ†’C) - D: Both (LHS in BDβ†’E; RHS in Cβ†’D) - E: RHS only (in BDβ†’E) β†’ R-only CORE = {A} (L-only; no "Neither" attributes). {A}⁺ = {A, B, C, D, E} = all attributes. **{A} is the only candidate key.**

Exercise 3: Minimal Cover

Find a minimal cover for:

F = { A β†’ BC,  B β†’ C,  AB β†’ D,  D β†’ BC }
Solution **Step 1: Decompose RHS**
F = { A β†’ B, A β†’ C, B β†’ C, AB β†’ D, D β†’ B, D β†’ C }
**Step 2: Remove extraneous LHS attributes** Check AB β†’ D: - Try removing A: {B}⁺ = {B, C}. D βˆ‰ {B}⁺. Keep A. - Try removing B: {A}⁺ = {A, B, C, D} (via Aβ†’B, Aβ†’C, then ABβ†’D since B now included, then Dβ†’B, Dβ†’C). D ∈ {A}⁺. Remove B! Replace AB β†’ D with A β†’ D.
F = { A β†’ B, A β†’ C, B β†’ C, A β†’ D, D β†’ B, D β†’ C }
**Step 3: Remove redundant FDs** - A β†’ B: Remove. {A}⁺ under F - {Aβ†’B} = {A, C, D, B, C} (Aβ†’C, Aβ†’D, Dβ†’B, Dβ†’C). B ∈ {A}⁺. **Redundant! Remove.** - A β†’ C: Remove. {A}⁺ under F - {Aβ†’C} = {A, D, B, C} (Aβ†’D, Dβ†’B, Dβ†’C). C ∈ {A}⁺. **Redundant! Remove.** - B β†’ C: Remove. {B}⁺ under F - {Bβ†’C} = {B}. C βˆ‰ {B}⁺. **Not redundant. Keep.** - A β†’ D: Remove. {A}⁺ under F - {Aβ†’D} = {A}. D βˆ‰ {A}⁺. **Not redundant. Keep.** - D β†’ B: Remove. {D}⁺ under F - {Dβ†’B} = {D, C}. B βˆ‰ {D}⁺. **Not redundant. Keep.** - D β†’ C: Remove. {D}⁺ under F - {Dβ†’C} = {D, B, C}. C ∈ {D}⁺ (via Dβ†’B, Bβ†’C). **Redundant! Remove.** **Minimal cover: F_min = { B β†’ C, A β†’ D, D β†’ B }**

Exercise 4: Proving an FD

Given F = { A β†’ B, B β†’ C, C β†’ D }, prove that A β†’ D using Armstrong's axioms. Write out each step explicitly.

Solution 1. A β†’ B (given) 2. B β†’ C (given) 3. A β†’ C (transitivity on steps 1 and 2) 4. C β†’ D (given) 5. A β†’ D (transitivity on steps 3 and 4) ∎

Exercise 5: FD Implication

Given F = { A β†’ B, BC β†’ D, E β†’ C }:

Determine whether the following FDs are implied by F: 1. AE β†’ D 2. BE β†’ D 3. A β†’ D

Solution 1. **AE β†’ D**: Compute {A, E}⁺ = {A, E} β†’ {A, B, E} (Aβ†’B) β†’ {A, B, C, E} (Eβ†’C) β†’ {A, B, C, D, E} (BCβ†’D). D ∈ {A,E}⁺. **Yes, F ⊨ AE β†’ D.** βœ“ 2. **BE β†’ D**: Compute {B, E}⁺ = {B, E} β†’ {B, C, E} (Eβ†’C) β†’ {B, C, D, E} (BCβ†’D). D ∈ {B,E}⁺. **Yes, F ⊨ BE β†’ D.** βœ“ 3. **A β†’ D**: Compute {A}⁺ = {A} β†’ {A, B} (Aβ†’B). No more applicable. D βˆ‰ {A}⁺. **No, F ⊭ A β†’ D.** βœ—

Exercise 6: Equivalence of FD Sets

Are the following two sets of FDs equivalent?

F = { A β†’ B, B β†’ C, A β†’ C }
G = { A β†’ B, B β†’ C }
Solution Check if every FD in F is implied by G: - A β†’ B: In G directly. βœ“ - B β†’ C: In G directly. βœ“ - A β†’ C: {A}⁺_G = {A, B, C}. C ∈ {A}⁺. βœ“ Check if every FD in G is implied by F: - A β†’ B: In F directly. βœ“ - B β†’ C: In F directly. βœ“ **F ≑ G.** The FD A β†’ C in F is redundant; it follows from A β†’ B and B β†’ C by transitivity.

Exercise 7: Multiple Candidate Keys

Let R(A, B, C, D, E) with F = { AB β†’ CDE, C β†’ A, D β†’ B }.

Find all candidate keys.

Solution Classify attributes: - A: Both (LHS: ABβ†’CDE; RHS: Cβ†’A) - B: Both (LHS: ABβ†’CDE; RHS: Dβ†’B) - C: Both (LHS: Cβ†’A; RHS: ABβ†’CDE) - D: Both (LHS: Dβ†’B; RHS: ABβ†’CDE) - E: RHS only β†’ R-only No L-only or Neither attributes, so CORE = {}. Try single attributes: - {A}⁺ = {A}. Not superkey. - {B}⁺ = {B}. Not superkey. - {C}⁺ = {C, A} = {A, C}. Not superkey. - {D}⁺ = {D, B} = {B, D}. Not superkey. Try pairs: - {A, B}⁺ = {A, B, C, D, E}. **Superkey!** Minimal (neither {A} nor {B} alone works). **Candidate key: {A, B}** - {A, D}⁺ = {A, B, D} β†’ {A, B, C, D, E}. **Superkey!** Check: {A}⁺={A}, {D}⁺={B,D}. Minimal. **Candidate key: {A, D}** - {B, C}⁺ = {A, B, C} β†’ {A, B, C, D, E}. **Superkey!** Check: {B}⁺={B}, {C}⁺={A,C}. Minimal. **Candidate key: {B, C}** - {C, D}⁺ = {A, B, C, D} β†’ {A, B, C, D, E}. **Superkey!** Check: {C}⁺={A,C}, {D}⁺={B,D}. Minimal. **Candidate key: {C, D}** **All candidate keys: {A,B}, {A,D}, {B,C}, {C,D}**

Exercise 8: Closure Proof

Prove that the Union Rule (X β†’ Y, X β†’ Z ⟹ X β†’ YZ) follows from Armstrong's axioms.

Solution 1. X β†’ Y (given) 2. X β†’ XY (augment step 1 with X: XX β†’ XY, and XX = X) 3. X β†’ Z (given) 4. XY β†’ YZ (augment step 3 with Y: XY β†’ ZY) 5. X β†’ YZ (transitivity on steps 2 and 4) ∎

14. Summary

Concept Key Point
Functional Dependency X β†’ Y means X uniquely determines Y
Armstrong's Axioms Reflexivity, Augmentation, Transitivity β€” sound and complete
Derived Rules Union, Decomposition, Pseudotransitivity
Attribute Closure X⁺ All attributes determined by X β€” the key algorithm
FD Set Closure F⁺ All FDs implied by F β€” usually computed via X⁺
Minimal Cover Simplified equivalent FD set β€” needed for normalization
Candidate Keys Found by classifying attributes and computing closures

Functional dependencies are the theoretical bedrock of relational database design. The algorithms in this lesson β€” attribute closure and minimal cover β€” will be used extensively in the next lesson on normalization, where we apply FDs to decompose relations into well-structured schemas.


Previous: 04_ER_Modeling.md | Next: 06_Normalization.md

to navigate between lessons