ER Modeling
ER Modeling¶
Previous: Relational Algebra | Next: Functional Dependencies
The Entity-Relationship (ER) model, introduced by Peter Chen in 1976, is the most widely used approach for conceptual database design. It provides a graphical notation for representing the structure of data at a high level of abstraction, independent of any particular DBMS. This lesson covers the ER model, its Enhanced version (EER), and the systematic algorithm for converting an ER diagram into a relational schema.
Table of Contents¶
- Conceptual Design Overview
- Entity Types and Entity Sets
- Attributes
- Relationship Types
- Cardinality Constraints
- Participation Constraints
- Weak Entities
- Enhanced ER (EER) Model
- ER-to-Relational Mapping Algorithm
- Design Case Study: University Database
- Common Pitfalls and Best Practices
- Exercises
1. Conceptual Design Overview¶
Database design follows a structured process from requirements to implementation:
ββββββββββββββββββββ
β Requirements β "What data do we need? What queries?"
β Analysis β
ββββββββββ¬ββββββββββ
β
βΌ
ββββββββββββββββββββ
β Conceptual β ER Diagram (DBMS-independent)
β Design β β THIS LESSON
ββββββββββ¬ββββββββββ
β
βΌ
ββββββββββββββββββββ
β Logical β Relational schema (tables, keys, constraints)
β Design β β ER-to-Relational Mapping
ββββββββββ¬ββββββββββ
β
βΌ
ββββββββββββββββββββ
β Physical β Indexes, storage, partitioning, SQL DDL
β Design β
ββββββββββ¬ββββββββββ
β
βΌ
ββββββββββββββββββββ
β Implementation β CREATE TABLE, INSERT, stored procedures
β & Tuning β
ββββββββββββββββββββ
Why Conceptual Design?¶
- Communication: ER diagrams are understandable by non-technical stakeholders
- Abstraction: Focus on data structure without worrying about implementation
- Correctness: Catch design errors early before writing SQL
- Documentation: Serves as a living blueprint of the data model
ER Diagram Notation¶
This lesson uses the original Chen notation (most common in textbooks):
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Symbol Legend β
β β
β βββββββββββ Entity type (strong) β
β β NAME β β
β βββββββββββ β
β β
β βββββββββββ Entity type (weak) β
β β NAME β β
β βββββββββββ β
β β
β β or ββββ Relationship type β
β <WORKS_FOR> β
β β
β (attribute) Attribute (oval) β
β ((derived)) Derived attribute (dashed oval) β
β {multivalued} Multivalued attribute (double oval) β
β β
β βββ single line Partial participation β
β βββ double line Total participation β
β β
β 1, N, M Cardinality markers β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
2. Entity Types and Entity Sets¶
Entity¶
An entity is a "thing" or object in the real world that is distinguishable from other objects. It can be physical (a person, a book) or conceptual (a course, a bank account).
Entity Type¶
An entity type defines a collection of entities that have the same attributes. It is like a class or template.
Entity Set¶
An entity set (or entity instance set) is the collection of all entities of a particular type at a given point in time. It is like the set of all objects of a class.
Entity Type: STUDENT
(defines the structure: sid, name, year, dept)
Entity Set: The current collection of student entities:
{(S001, Alice, 3, CS), (S002, Bob, 2, CS), ...}
Entity (Instance): A single student, e.g., (S001, Alice, 3, CS)
Notation in ER Diagrams¶
βββββββββββββ
β STUDENT β
βββββββββββββ
/ | | \
(sid) (name)(year)(dept)
[PK]
3. Attributes¶
Attributes describe properties of an entity type. There are several kinds:
Simple (Atomic) Attribute¶
A simple attribute cannot be divided into smaller components.
Examples:
- student_id: "S001"
- year: 3
- gpa: 3.85
Composite Attribute¶
A composite attribute can be divided into smaller sub-attributes.
(name)
/ \
(first_name) (last_name)
(address)
/ | \
(street) (city) (zip_code)
|
(state) (country)
Multivalued Attribute¶
A multivalued attribute can have multiple values for a single entity.
{phone_numbers} A student may have 0, 1, or many phone numbers.
{skills} An employee may have multiple skills.
{email_addresses} A person may have multiple email addresses.
Notation: Double oval or curly braces {attribute}
Derived Attribute¶
A derived attribute's value can be computed from other attributes.
((age)) Derived from date_of_birth and current date
((total_credits)) Derived from summing credits of enrolled courses
((employee_count)) Derived from counting employees in a department
Notation: Dashed oval or double parentheses ((attribute))
Key Attribute¶
A key attribute uniquely identifies each entity in the entity set.
For STUDENT: student_id (underlined in diagrams)
For COURSE: course_id
For EMPLOYEE: employee_id or ssn
Notation: Underlined attribute name
Composite Key¶
When no single attribute uniquely identifies an entity, a combination of attributes forms the key.
Example: ENROLLMENT might be identified by (student_id, course_id, semester)
NULL Values¶
Attributes may have NULL values when: - The value is not applicable (apartment number for a house) - The value is unknown (phone number not provided)
Attribute Summary¶
Attribute Types:
ββββββββββββββββ
ββββββββββββββββββΊβ Simple β
β β (atomic) β
ββββββββββββ β ββββββββββββββββ
β Structure ββββββββ€
ββββββββββββ β ββββββββββββββββ
ββββββββββββββββββΊβ Composite β
β (divisible) β
ββββββββββββββββ
ββββββββββββββββ
ββββββββββββββββββΊβ Single-valuedβ
β ββββββββββββββββ
ββββββββββββ β
β Cardinalityββββββ€
ββββββββββββ β ββββββββββββββββ
ββββββββββββββββββΊβ Multivalued β
β {attr} β
ββββββββββββββββ
ββββββββββββββββ
ββββββββββββββββββΊβ Stored β
β ββββββββββββββββ
ββββββββββββ β
β Source ββββββββ€
ββββββββββββ β ββββββββββββββββ
ββββββββββββββββββΊβ Derived β
β ((attr)) β
ββββββββββββββββ
ER Diagram with All Attribute Types¶
βββββββββββββ
β EMPLOYEE β
βββββββββββββ
/ | | | \ \
/ | | | \ \
/ | | | \ \
(emp_id) (name) | (hire_date) {phone} ((age))
[PK] / \ | Multi- Derived
/ \ (salary) valued
(first) (last)
Composite
emp_id: Simple, Key
name: Composite (first + last)
salary: Simple, Single-valued
hire_date: Simple, Stored
phone: Simple, Multivalued
age: Simple, Derived (from birth_date)
4. Relationship Types¶
A relationship type defines an association between entity types. A relationship instance is an association between specific entity instances.
Binary Relationships¶
A binary relationship involves two entity types (the most common case).
ββββββββββββ ββββββββββββ
β STUDENT βββ<ENROLLS>βββ COURSE β
ββββββββββββ ββββββββββββ
Relationship instances:
(Alice, CS101), (Alice, CS301), (Bob, CS101), ...
Ternary Relationships¶
A ternary relationship involves three entity types.
ββββββββββββ
β SUPPLIER β
ββββββββββββ
β
β
β SUPPLIES β
/ \
β β
ββββββββββββ ββββββββββββ
β PART β β PROJECT β
ββββββββββββ ββββββββββββ
Relationship instance: (Supplier1, PartA, ProjectX)
Meaning: Supplier1 supplies PartA to ProjectX
NOTE: A ternary relationship CANNOT always be decomposed into
three binary relationships without information loss!
Recursive (Unary) Relationships¶
A recursive relationship relates an entity type to itself.
ββββββββββββ
β EMPLOYEE β
ββββββ¬ββββββ
β β
β β
(supervisor)
β β
ββββββ
<SUPERVISES>
Relationship instance: (Manager_Alice, Employee_Bob)
Meaning: Alice supervises Bob
Role names are important:
EMPLOYEE (as supervisor) ββ<SUPERVISES>ββ EMPLOYEE (as supervisee)
Relationship Attributes¶
Relationships can have their own attributes:
ββββββββββββ ββββββββββββ
β STUDENT βββββ<ENROLLS_IN>βββββ COURSE β
ββββββββββββ β ββββββββββββ
(grade)
(semester)
The grade and semester belong to the RELATIONSHIP, not to either entity.
A student has a grade for a specific course, not in general.
Degree of a Relationship¶
The degree of a relationship type is the number of participating entity types.
Degree 1: Unary (recursive) EMPLOYEE supervises EMPLOYEE
Degree 2: Binary STUDENT enrolls in COURSE
Degree 3: Ternary SUPPLIER supplies PART to PROJECT
Degree n: n-ary (rare) Typically decomposed into binaries
5. Cardinality Constraints¶
Cardinality constraints specify the number of relationship instances an entity can participate in. For binary relationships, the three fundamental ratios are 1:1, 1:N, and M:N.
One-to-One (1:1)¶
Each entity in A is associated with at most one entity in B, and vice versa.
ββββββββββββ 1 1 ββββββββββββ
β EMPLOYEE βββββ<MANAGES>βββββ DEPARTMENTβ
ββββββββββββ ββββββββββββ
Each employee manages at most one department.
Each department is managed by at most one employee.
Instance:
Alice ββββ CS Department
Bob ββββ EE Department
Carol ββββ (no department managed)
Dave ββββ ME Department
Mapping:
A: Alice ββββΊ CS
B: Bob ββββΊ EE
D: Dave ββββΊ ME
One-to-Many (1:N)¶
Each entity in A can be associated with many entities in B, but each entity in B is associated with at most one entity in A.
ββββββββββββ 1 N ββββββββββββ
βDEPARTMENTβββββ<HAS>βββββ EMPLOYEE β
ββββββββββββ ββββββββββββ
A department has many employees.
An employee belongs to at most one department.
Instance:
CS βββββ¬ββββ Alice
βββββ Bob
βββββ Eve
EE βββββββ Carol
ME βββββββ Dave
Many-to-Many (M:N)¶
Each entity in A can be associated with many entities in B, and each entity in B can be associated with many entities in A.
ββββββββββββ M N ββββββββββββ
β STUDENT βββββ<ENROLLS>βββββ COURSE β
ββββββββββββ ββββββββββββ
A student can enroll in many courses.
A course can have many students.
Instance:
Alice βββ¬ββ CS101
βββ CS301
βββ MA101
Bob βββ¬ββ CS101
βββ CS301
Carol βββ¬ββ EE201
βββ CS101
Cardinality in ER Diagrams¶
There are two main conventions:
Convention 1: Chen's notation (labels on lines)
ββββββββββββ 1 ββββββββββββ N ββββββββββββ
βDEPARTMENTβββββββββ<WORKS_IN>βββββββββ EMPLOYEE β
ββββββββββββ ββββββββββββ
Convention 2: (min,max) notation (more precise)
ββββββββββββ (1,1) ββββββββββββββ (1,N) ββββββββββββ
β EMPLOYEE βββββββββββ<WORKS_IN>βββββββββββDEPARTMENTβ
ββββββββββββ ββββββββββββ
Reading:
An employee works in (1,1) department = exactly one department
A department has (1,N) employees = one or more employees
Cardinality Constraint Summary¶
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Cardinality Ratios β
β β
β 1:1 ββ Each A maps to at most 1 B; each B to at most 1 A β
β Example: Employee MANAGES Department β
β β
β 1:N ββ Each A maps to many B; each B to at most 1 A β
β Example: Department HAS Employees β
β β
β M:N ββ Each A maps to many B; each B maps to many A β
β Example: Student ENROLLS IN Course β
β β
β (min,max) notation: β
β (0,1) ββ optional, at most one β
β (1,1) ββ mandatory, exactly one β
β (0,N) ββ optional, unbounded many β
β (1,N) ββ mandatory, at least one β
β (3,5) ββ minimum 3, maximum 5 (specific bounds) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
6. Participation Constraints¶
Participation constraints specify whether every entity must participate in a relationship or if participation is optional.
Total Participation (Mandatory)¶
Every entity in the entity set must participate in at least one relationship instance. Shown with a double line (===).
ββββββββββββ ββββββββββββββββ ββββββββββββ
β EMPLOYEE ββββββββββββ<WORKS_IN>βββββββββββDEPARTMENTβ
ββββββββββββ ββββββββββββ
Every employee MUST work in some department.
(No employee can exist without a department.)
Partial Participation (Optional)¶
An entity may or may not participate in the relationship. Shown with a single line (---).
ββββββββββββ ββββββββββββββββ ββββββββββββ
β EMPLOYEE ββββββββββββ<MANAGES>ββββββββββββDEPARTMENTβ
ββββββββββββ ββββββββββββ
Not every employee manages a department (partial on EMPLOYEE side).
Every department MUST be managed by someone (total on DEPARTMENT side).
Combining Cardinality and Participation¶
Example: University ER Diagram (fragment)
ββββββββββββ (1,1) ββββββββββββββββ (1,N) ββββββββββββ
β EMPLOYEE ββββββββββ<WORKS_IN>βββββββββββββDEPARTMENTβ
ββββββββββββ ββββββββββββ
Reading the (min,max):
Employee side: (1,1) β total participation, exactly one department
Department side: (1,N) β total participation, at least one employee
ββββββββββββ (0,N) ββββββββββββββββ (0,N) ββββββββββββ
β STUDENT ββββββββββ<ENROLLS_IN>βββββββββββ COURSE β
ββββββββββββ ββββββββββββ
Reading the (min,max):
Student side: (0,N) β partial (student may not be enrolled), many courses
Course side: (0,N) β partial (course may have no students), many students
Existence Dependency¶
When an entity's existence depends on its relationship with another entity, it has total participation in that relationship.
Example:
A DEPENDENT (family member) cannot exist without an EMPLOYEE.
Therefore, DEPENDENT has total participation in the
HAS_DEPENDENT relationship.
ββββββββββββ βββββββββββββββββββ βββββββββββββ
β EMPLOYEE ββββββββββββ<HAS_DEPENDENT>βββββββββββββ DEPENDENT β
ββββββββββββ βββββββββββββ
DEPENDENT is also a weak entity (discussed next).
7. Weak Entities¶
A weak entity type is an entity type that cannot be uniquely identified by its own attributes alone. It depends on a related owner (or identifying) entity type.
Characteristics of Weak Entities¶
1. No primary key of its own
2. Has a PARTIAL KEY (discriminator) that distinguishes
weak entities related to the same owner entity
3. Always has TOTAL PARTICIPATION in the identifying relationship
4. Existence-dependent on the owner entity
Notation¶
ββββββββββββ βββββββββββββββββ
β OWNER βββ<IDENTIFIES>βββ WEAK ENTITY β
β (strong) β β β
ββββββββββββ βββββββββββββββββ
|
(partial_key)
[dashed underline]
Double rectangle: weak entity type
Double diamond: identifying relationship type
Dashed underline: partial key (discriminator)
Example: Employee and Dependent¶
ββββββββββββ βββββββββββββββ
β EMPLOYEE βββ<HAS_DEPENDENT>βββ DEPENDENT β
ββββββββββββ 1:N βββββββββββββββ
| | | |
(emp_id) (dep_name) (birth) (relationship)
[PK] [partial key]
EMPLOYEE has a primary key: emp_id
DEPENDENT has a partial key: dep_name
Full identification of a DEPENDENT:
(owner's emp_id, dep_name)
Example:
Employee E001 (Alice) has dependents:
(E001, "Tom") β Alice's son Tom
(E001, "Sue") β Alice's daughter Sue
Employee E002 (Bob) has dependents:
(E002, "Tom") β Bob's son Tom (different person from E001's Tom!)
Without the owner's key, "Tom" alone is ambiguous.
Weak Entity vs. Strong Entity¶
ββββββββββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββ
β Strong Entity β Weak Entity β
ββββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββ€
β Has its own primary key β Has only a partial key β
β Can exist independently β Existence depends on owner β
β Single rectangle β Double rectangle β
β Partial participation OK β Total participation required β
β Example: EMPLOYEE, COURSE β Example: DEPENDENT, ROOM β
ββββββββββββββββββββββββββββββββ΄βββββββββββββββββββββββββββββββββββ
More examples of weak entities:
BUILDING (strong) β ROOM (weak): room_number is partial key
INVOICE (strong) β LINE_ITEM (weak): line_number is partial key
COURSE (strong) β SECTION (weak): section_number is partial key
8. Enhanced ER (EER) Model¶
The Enhanced ER (EER) model extends the basic ER model with additional concepts borrowed from object-oriented modeling: specialization, generalization, and inheritance.
Specialization¶
Specialization is a top-down process of defining subclasses of an entity type based on distinguishing characteristics.
ββββββββββββ
β PERSON β
ββββββ¬ββββββ
β
/ \
/ d \ d = disjoint
/ \ o = overlapping
/ \
ββββββββββββ ββββββββββββ
β STUDENT β β EMPLOYEE β
ββββββββββββ ββββββββββββ
PERSON is the SUPERCLASS
STUDENT and EMPLOYEE are SUBCLASSES
The circle with d/o specifies the constraint
Generalization¶
Generalization is a bottom-up process of abstracting common features from multiple entity types into a higher-level (general) entity type.
Generalization example:
We observe that CAR and TRUCK both have:
- vehicle_id, make, model, year, color
So we generalize:
ββββββββββββ
β VEHICLE β β generalized superclass
ββββββ¬ββββββ
β
/ \
/ d \
/ \
ββββββββββββ ββββββββββββ
β CAR β β TRUCK β
ββββββββββββ ββββββββββββ
(num_doors) (payload_capacity)
(trunk_size) (num_axles)
Specialization/Generalization Constraints¶
Two orthogonal constraints govern specialization:
Constraint 1: Disjointness
Disjoint (d): An entity can belong to AT MOST ONE subclass
Example: A vehicle is either a CAR or TRUCK, not both
Overlapping (o): An entity can belong to MULTIPLE subclasses
Example: A person can be both a STUDENT and an EMPLOYEE
Constraint 2: Completeness
Total: Every superclass entity MUST belong to at least one subclass
Double line from superclass to specialization circle
Example: Every VEHICLE must be either a CAR or a TRUCK
Partial: A superclass entity MAY not belong to any subclass
Single line from superclass to specialization circle
Example: A PERSON may be neither a STUDENT nor an EMPLOYEE
Four Combinations¶
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Specialization Constraint Combinations β
β β
β {disjoint, total}: Every entity in exactly one subclass β
β Example: VEHICLE β CAR xor TRUCK β
β β
β {disjoint, partial}: Entity in at most one subclass β
β Example: ACCOUNT β SAVINGS xor CHECKING β
β (some accounts may be neither) β
β β
β {overlapping, total}: Entity in one or more subclasses β
β Example: PERSON β STUDENT and/or β
β EMPLOYEE (but must be at least one) β
β β
β {overlapping, partial}: Entity in zero or more subclasses β
β Example: PERSON β STUDENT and/or β
β EMPLOYEE (can be neither) β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Attribute Inheritance¶
Subclasses inherit all attributes of their superclass and can have additional attributes specific to the subclass.
ββββββββββββββββββββ
β PERSON β
ββββββββββββββββββββ
β person_id (PK) β
β name β
β date_of_birth β
β email β
ββββββββββ¬ββββββββββ
β
/ \
/ o \ (overlapping, partial)
/ \
ββββββββββββββββββ ββββββββββββββββββ
β STUDENT β β EMPLOYEE β
ββββββββββββββββββ ββββββββββββββββββ
β + student_id β β + emp_id β
β + year β β + salary β
β + gpa β β + hire_date β
β + major β β + department β
ββββββββββββββββββ ββββββββββββββββββ
A STUDENT inherits: person_id, name, date_of_birth, email
and adds: student_id, year, gpa, major
A PERSON who is both STUDENT and EMPLOYEE has ALL attributes.
Multiple Inheritance and Category (Union Type)¶
A category (or union type) is a subclass with multiple possible superclasses:
ββββββββββββ ββββββββββββ ββββββββββββ
β PERSON β β COMPANY β β BANK β
ββββββ¬ββββββ ββββββ¬ββββββ ββββββ¬ββββββ
β β β
βββββββββββββββββββΌβββββββββββββββββββ
β
(U) β Union / Category
β
ββββββββββββββββ
β OWNER β (of a vehicle)
ββββββββββββββββ
An OWNER can be EITHER a PERSON, a COMPANY, OR a BANK.
(As opposed to specialization, where subclasses share ONE superclass.)
9. ER-to-Relational Mapping Algorithm¶
This section presents the systematic 7-step algorithm for converting an ER/EER diagram into a relational schema.
Step 1: Map Strong Entity Types¶
For each strong (regular) entity type E, create a relation R that includes all simple attributes of E. Choose a primary key for R.
ER:
ββββββββββββ
β EMPLOYEE β
ββββββββββββ
(emp_id), (name), (salary), (hire_date)
Relational:
EMPLOYEE(emp_id, first_name, last_name, salary, hire_date)
PK: emp_id
Rules:
- Composite attributes: include only leaf components
(name β first_name, last_name)
- Derived attributes: omit (computed at query time)
- Multivalued attributes: handled in Step 6
Step 2: Map Weak Entity Types¶
For each weak entity type W with owner entity E, create a relation R that includes: - All simple attributes of W - The primary key of E as a foreign key - Primary key of R = PK of E + partial key of W
ER:
ββββββββββββ 1:N βββββββββββββ
β EMPLOYEE ββββββββββββββββ DEPENDENT β
ββββββββββββ βββββββββββββ
(emp_id) (dep_name), (birth_date), (relationship)
Relational:
DEPENDENT(emp_id, dep_name, birth_date, relationship)
PK: (emp_id, dep_name)
FK: emp_id β EMPLOYEE(emp_id) ON DELETE CASCADE
Step 3: Map Binary 1:1 Relationship Types¶
Three approaches (choose based on participation constraints):
Approach A: Foreign Key Approach (preferred)
Add the PK of one entity as a FK in the other.
Prefer to add FK on the side with TOTAL participation.
ER: EMPLOYEE (0,1) ββ<MANAGES>ββ (1,1) DEPARTMENT
Relational:
EMPLOYEE(emp_id, name, salary)
DEPARTMENT(dept_id, dept_name, mgr_emp_id, mgr_start_date)
^^^^^^^^^
FK β EMPLOYEE(emp_id)
(FK on DEPARTMENT because it has total participation:
every department must have a manager)
Approach B: Merged Relation
Merge both entity types into one relation.
Only feasible when BOTH sides have total participation.
Approach C: Cross-Reference (Relationship Relation)
Create a separate relation for the relationship.
Useful when the relationship has many attributes.
Step 4: Map Binary 1:N Relationship Types¶
Add the PK of the "1-side" entity as a FK in the "N-side" entity.
ER: DEPARTMENT (1) ββ<HAS>ββ (N) EMPLOYEE
Relational:
DEPARTMENT(dept_id, dept_name, budget)
EMPLOYEE(emp_id, name, salary, dept_id)
^^^^^^^
FK β DEPARTMENT(dept_id)
Any relationship attributes go with the N-side entity:
If HAS has (start_date), add it to EMPLOYEE.
Step 5: Map Binary M:N Relationship Types¶
Create a new relationship relation R. Include: - PKs of both participating entity types as FKs - Any attributes of the relationship - PK of R = combination of both FKs
ER: STUDENT (M) ββ<ENROLLS>ββ (N) COURSE
With attributes: grade, semester
Relational:
STUDENT(student_id, name, year)
COURSE(course_id, title, credits)
ENROLLMENT(student_id, course_id, semester, grade)
PK: (student_id, course_id, semester)
FK: student_id β STUDENT(student_id)
course_id β COURSE(course_id)
Step 6: Map Multivalued Attributes¶
Create a new relation for each multivalued attribute. Include: - The multivalued attribute - The PK of the entity as a FK - PK = FK + multivalued attribute
ER: EMPLOYEE has multivalued attribute {phone_numbers}
Relational:
EMPLOYEE(emp_id, name, salary)
EMPLOYEE_PHONE(emp_id, phone_number)
PK: (emp_id, phone_number)
FK: emp_id β EMPLOYEE(emp_id) ON DELETE CASCADE
Step 7: Map Specialization/Generalization¶
Four options exist. The best choice depends on the constraints:
Option A: Single Table with Type Discriminator
PERSON(person_id, name, dob, email, person_type,
-- STUDENT attributes (NULL if not a student)
student_id, year, gpa, major,
-- EMPLOYEE attributes (NULL if not an employee)
emp_id, salary, hire_date, department)
person_type IN ('S', 'E', 'SE', 'N') -- Student, Employee, Both, Neither
Pros: Simple, no joins needed
Cons: Many NULLs, harder to enforce subclass constraints
Best for: Few subclasses, many queries across all types
Option B: Separate Tables for Each Subclass (Superclass PKs Inherited)
PERSON(person_id, name, dob, email)
STUDENT(person_id, student_id, year, gpa, major)
FK: person_id β PERSON(person_id)
EMPLOYEE(person_id, emp_id, salary, hire_date, department)
FK: person_id β PERSON(person_id)
Pros: No NULLs, clean separation
Cons: Requires joins to get full data
Best for: Many subclass-specific attributes, overlapping allowed
Option C: Separate Tables (Full Attributes in Each)
STUDENT(person_id, name, dob, email, student_id, year, gpa, major)
EMPLOYEE(person_id, name, dob, email, emp_id, salary, hire_date, dept)
Pros: No joins needed for subclass queries
Cons: Redundancy (shared attributes duplicated), hard to query
across all persons, overlapping requires data duplication
Best for: Disjoint, total specialization
Option D: Hybrid (superclass + specialized tables)
Choose based on usage patterns:
- Frequently queried together β Option A
- Mostly queried separately β Option C
- Need flexibility β Option B
Mapping Decision Table¶
ββββββββββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββββ
β ER Construct β Relational Mapping β
ββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββ€
β Strong entity β New relation, own PK β
β Weak entity β New relation, composite PK β
β 1:1 relationship β FK in one entity (total side) β
β 1:N relationship β FK in N-side entity β
β M:N relationship β New relation (bridge table) β
β Multivalued attribute β New relation β
β Composite attribute β Flatten to components β
β Derived attribute β Omit (compute at query time) β
β Specialization/general. β Options A, B, C, or D β
β Ternary relationship β New relation with 3 FKs β
β Recursive relationship β FK to own table (or bridge table) β
ββββββββββββββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββ
10. Design Case Study: University Database¶
Let us design a complete ER diagram for a university database and map it to a relational schema.
Requirements¶
1. The university has DEPARTMENTS, each with a name, building, and budget.
2. Each department has one CHAIRPERSON (a faculty member).
3. FACULTY members have an ID, name, rank, and salary. Each belongs to
one department.
4. STUDENTS have an ID, name, year, and GPA. Each has a major department.
5. COURSES have an ID, title, credits, and belong to a department.
6. Faculty members TEACH courses. Each offering is in a specific semester.
A course can have multiple sections taught by different faculty.
7. Students ENROLL in course sections and receive grades.
8. Students may have multiple PHONE NUMBERS and EMAIL ADDRESSES.
9. Faculty can ADVISE students (a student has one advisor).
ER Diagram (ASCII)¶
{phone} {email}
\ /
\ /
ββββββββββββ (0,1) (1,1) ββββββββββββ
β STUDENT ββββββ<ADVISES>ββββββ FACULTY β
ββββββββββββ ββββββββββββ
(sid)(name) (fid)(name)
(year)(gpa) (rank)(salary)
| |
(1,N)| (1,N)|
| |
<ENROLLED_IN> <TEACHES>
| |
(1,N)| (1,N)|
| |
ββββββββββββββββ ββββββββββββββββ
β SECTION β β SECTION β
β (sec_number) β β β
ββββββββββββββββ ββββββββββββββββ
| |
(1,1)| |
| |
ββββββββββββ (1,N) (1,1) ββββββββββββ (1,1) ββββββββββββ
β COURSE ββββββ<OFFERED_BY>βββββββDEPARTMENTβββββ<CHAIRS>ββββββ FACULTY β
ββββββββββββ ββββββββββββ (already
(cid)(title) (did)(name) shown)
(credits) (building)
(budget)
Note: The above is simplified. A more accurate representation would model SECTION as a weak entity of COURSE, with TEACHES linking FACULTY to SECTION and ENROLLED_IN linking STUDENT to SECTION.
Refined ER Design¶
Entities:
DEPARTMENT(dept_id, dept_name, building, budget) Strong
FACULTY(fac_id, name, rank, salary) Strong
STUDENT(stu_id, name, year, gpa, {phone}, {email}) Strong + MV
COURSE(course_id, title, credits) Strong
SECTION(sec_number, semester, year) Weak (owner: COURSE)
Relationships:
WORKS_IN: FACULTY (N,1) --- DEPARTMENT (1:N)
MAJOR_IN: STUDENT (N,1) --- DEPARTMENT (1:N)
CHAIRS: FACULTY (1,0..1) --- DEPARTMENT (1:1)
OFFERS: DEPARTMENT (1,N) --- COURSE (1:N)
HAS_SECTION: COURSE (1,N) === SECTION (identifying, 1:N)
TEACHES: FACULTY (1,N) --- SECTION (1:1 per section)
ENROLLED_IN: STUDENT (M) --- SECTION (N) (M:N, attrs: grade)
ADVISES: FACULTY (1) --- STUDENT (N) (1:N)
Relational Schema (Mapped)¶
-- Step 1: Strong entities
CREATE TABLE department (
dept_id CHAR(4) PRIMARY KEY,
dept_name VARCHAR(50) NOT NULL UNIQUE,
building VARCHAR(30),
budget NUMERIC(12,2) CHECK (budget >= 0)
);
CREATE TABLE faculty (
fac_id CHAR(5) PRIMARY KEY,
name VARCHAR(50) NOT NULL,
rank VARCHAR(20) CHECK (rank IN
('Lecturer','Assistant','Associate','Full')),
salary NUMERIC(10,2) CHECK (salary > 0),
dept_id CHAR(4) NOT NULL, -- Step 4: 1:N WORKS_IN
FOREIGN KEY (dept_id) REFERENCES department(dept_id)
);
CREATE TABLE student (
stu_id CHAR(5) PRIMARY KEY,
name VARCHAR(50) NOT NULL,
year SMALLINT CHECK (year BETWEEN 1 AND 4),
gpa NUMERIC(3,2) CHECK (gpa >= 0.0 AND gpa <= 4.0),
major_id CHAR(4), -- Step 4: 1:N MAJOR_IN
advisor_id CHAR(5), -- Step 4: 1:N ADVISES
FOREIGN KEY (major_id) REFERENCES department(dept_id),
FOREIGN KEY (advisor_id) REFERENCES faculty(fac_id)
);
CREATE TABLE course (
course_id CHAR(6) PRIMARY KEY,
title VARCHAR(100) NOT NULL,
credits SMALLINT NOT NULL CHECK (credits BETWEEN 1 AND 5),
dept_id CHAR(4) NOT NULL, -- Step 4: 1:N OFFERS
FOREIGN KEY (dept_id) REFERENCES department(dept_id)
);
-- Step 2: Weak entity (SECTION identified by COURSE)
CREATE TABLE section (
course_id CHAR(6) NOT NULL,
sec_number SMALLINT NOT NULL,
semester VARCHAR(10) NOT NULL,
sec_year SMALLINT NOT NULL,
fac_id CHAR(5), -- Step 4: 1:N TEACHES
PRIMARY KEY (course_id, sec_number, semester, sec_year),
FOREIGN KEY (course_id) REFERENCES course(course_id)
ON DELETE CASCADE,
FOREIGN KEY (fac_id) REFERENCES faculty(fac_id)
);
-- Step 3: 1:1 CHAIRS (FK on department side, total participation)
ALTER TABLE department
ADD COLUMN chair_fac_id CHAR(5),
ADD CONSTRAINT fk_chair
FOREIGN KEY (chair_fac_id) REFERENCES faculty(fac_id);
-- Step 5: M:N ENROLLED_IN
CREATE TABLE enrollment (
stu_id CHAR(5) NOT NULL,
course_id CHAR(6) NOT NULL,
sec_number SMALLINT NOT NULL,
semester VARCHAR(10) NOT NULL,
sec_year SMALLINT NOT NULL,
grade VARCHAR(2),
PRIMARY KEY (stu_id, course_id, sec_number, semester, sec_year),
FOREIGN KEY (stu_id) REFERENCES student(stu_id),
FOREIGN KEY (course_id, sec_number, semester, sec_year)
REFERENCES section(course_id, sec_number, semester, sec_year)
);
-- Step 6: Multivalued attributes
CREATE TABLE student_phone (
stu_id CHAR(5) NOT NULL,
phone VARCHAR(20) NOT NULL,
PRIMARY KEY (stu_id, phone),
FOREIGN KEY (stu_id) REFERENCES student(stu_id) ON DELETE CASCADE
);
CREATE TABLE student_email (
stu_id CHAR(5) NOT NULL,
email VARCHAR(100) NOT NULL,
PRIMARY KEY (stu_id, email),
FOREIGN KEY (stu_id) REFERENCES student(stu_id) ON DELETE CASCADE
);
Schema Diagram Summary¶
ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ
β DEPARTMENT β β FACULTY β β STUDENT β
ββββββββββββββββ€ ββββββββββββββββ€ ββββββββββββββββ€
βPK dept_id βββββββPK fac_id βββββββPK stu_id β
β dept_name β FK β name β FK β name β
β building βdept β rank βadv β year β
β budget β β salary β β gpa β
βFK chair_fac βββββββFK dept_id ββββ βFK major_idβββΊβ DEPARTMENT
ββββββββββββββββ ββββββββββββββββ βFK advisor_idβββΊFACULTY
β β ββββββββββββββββ
β β β
β βββββββ β
β β β
ββββββββββββ βββββββ΄βββββββ βββββββββββββββββ΄βββ
β COURSE β β SECTION β β ENROLLMENT β
ββββββββββββ€ ββββββββββββββ€ ββββββββββββββββββββ€
βPK crs_id ββββPK crs_id ββββPK stu_id β
β title β βPK sec_num β βPK crs_id β
β creditsβ βPK semester β βPK sec_num β
βFK dept_idβ βPK sec_year β βPK semester β
ββββββββββββ βFK fac_id β βPK sec_year β
ββββββββββββββ β grade β
ββββββββββββββββββββ
βββββββββββββββββββ βββββββββββββββββββ
β STUDENT_PHONE β β STUDENT_EMAIL β
βββββββββββββββββββ€ βββββββββββββββββββ€
βPK,FK stu_id β βPK,FK stu_id β
βPK phone β βPK email β
βββββββββββββββββββ βββββββββββββββββββ
11. Common Pitfalls and Best Practices¶
Pitfall 1: Fan Trap¶
A fan trap occurs when a path through multiple 1:N relationships fans out, causing ambiguous associations.
Problem:
DEPARTMENT β1:Nβ EMPLOYEE
DEPARTMENT β1:Nβ PROJECT
Query: "Which employees work on which projects?"
The path goes DEPARTMENT β EMPLOYEE and DEPARTMENT β PROJECT
but there is NO direct link between EMPLOYEE and PROJECT!
Solution: Add a direct WORKS_ON relationship between EMPLOYEE and PROJECT.
Pitfall 2: Chasm Trap¶
A chasm trap occurs when a path does not exist between entity types that should be related.
Problem:
DEPARTMENT β1:Nβ EMPLOYEE (partial)
EMPLOYEE β1:Nβ PROJECT
If some departments have no employees, there is a "chasm"
(gap) in the path from DEPARTMENT to PROJECT.
Solution: Add a direct HAS_PROJECT relationship between DEPARTMENT and PROJECT.
Pitfall 3: Overuse of Multivalued Attributes¶
Bad:
PERSON with {address} β if addresses need their own attributes
(street, city, state, zip), this is wrong
Better:
PERSON β1:Nβ ADDRESS
ADDRESS(address_id, street, city, state, zip_code)
Pitfall 4: Missing Relationship Attributes¶
Bad:
STUDENT has grade attribute β grade for which course?
Better:
STUDENT βM:Nβ COURSE with relationship attribute: grade
Best Practices¶
1. Start with entities, then relationships, then attributes
2. Every entity must have a key attribute
3. Avoid redundant relationships (derivable from others)
4. Use weak entities only when truly necessary
5. Prefer binary relationships over ternary when possible
6. Document all assumptions and constraints
7. Validate with stakeholders using concrete examples
8. Name entities as singular nouns (STUDENT, not STUDENTS)
9. Name relationships as verbs (ENROLLS_IN, TEACHES)
10. Use (min,max) notation for precise constraints
12. Exercises¶
Basic Concepts¶
Exercise 4.1: For each of the following, identify whether it should be modeled as an entity type, a relationship type, or an attribute. Justify your answer.
- (a) Employee name
- (b) Department
- (c) Marriage (between two persons)
- (d) Student GPA
- (e) Book ISBN
- (f) Course enrollment
- (g) Employee skill (assuming an employee can have many skills)
- (h) Project deadline
Exercise 4.2: Classify each of the following attributes:
| Attribute | Simple/Composite | Single/Multi | Stored/Derived | Key? |
|---|---|---|---|---|
| SSN | ||||
| Full name (first + middle + last) | ||||
| Phone numbers (multiple) | ||||
| Age (given date of birth) | ||||
| Email address | ||||
| Address (street, city, state, zip) |
ER Design¶
Exercise 4.3: Draw an ER diagram for a hospital system with the following requirements: - Patients have an ID, name, date of birth, and blood type - Doctors have an ID, name, specialty, and phone - Each patient is assigned to a primary doctor - Doctors can prescribe medications to patients (record date and dosage) - Medications have a code, name, and manufacturer - Patients can have multiple allergies
Specify cardinality and participation constraints.
Exercise 4.4: Draw an ER diagram for an online learning platform: - Instructors create courses. A course has a title, description, and price. - Courses contain multiple lessons in a specific order. - Students can enroll in courses and track their progress per lesson (completion status, time spent). - Students can rate and review courses (1-5 stars, text). - There are quizzes at the end of each lesson. Students attempt quizzes and receive scores.
Identify all entity types, relationship types, attributes, keys, and constraints.
Cardinality and Participation¶
Exercise 4.5: For each of the following scenarios, determine the cardinality ratio (1:1, 1:N, M:N) and participation constraints (total/partial):
- (a) A country has one capital city; a capital city belongs to one country
- (b) A student lives in one dormitory room; a room can house multiple students
- (c) An author can write many books; a book can have many authors
- (d) An employee works on multiple projects; a project has multiple employees
- (e) A person has one passport; a passport belongs to one person (not everyone has a passport)
Weak Entities¶
Exercise 4.6: For each pair, determine which (if any) should be a weak entity:
- (a) Building and Room
- (b) Invoice and LineItem
- (c) Student and Course
- (d) Bank and Branch
- (e) Order and OrderItem
For each weak entity, identify the partial key and the identifying relationship.
EER¶
Exercise 4.7: Design an EER diagram for the following:
A company has employees. Employees are specialized into managers, engineers, and secretaries. An engineer can be further specialized into software engineers and hardware engineers.
- Determine if the specializations should be disjoint or overlapping
- Determine if they should be total or partial
- Add at least two specific attributes for each subclass
- Draw the complete EER diagram
ER-to-Relational Mapping¶
Exercise 4.8: Given the following ER diagram, apply the 7-step mapping algorithm to produce a complete relational schema with SQL DDL:
{skill}
|
ββββββββββββ 1:N ββββββββββββββββ M:N ββββββββββββ
βDEPARTMENTββββββββ<WORKS_IN>ββββββββ EMPLOYEE βββββ<WORKS_ON>βββββ PROJECT β
ββββββββββββ ββββββββββββββββ |hours| ββββββββββββ
(dept_id) (emp_id)(name) (proj_id)(name)
(name) (salary) (budget)
(budget) (birth_date) (location)
|
1:N |
|
βββββββββββββββββ
β DEPENDENT β
β(dep_name) β
β(birth_date) β
β(relationship)β
βββββββββββββββββ
Include: all tables, PKs, FKs, domain constraints, and ON DELETE actions.
Exercise 4.9: Map the following specialization hierarchy to a relational schema using each of the three approaches (Option A: single table, Option B: superclass + subclass tables, Option C: separate tables). Discuss the trade-offs.
VEHICLE
(vin, make, model, year, color)
|
/ \
d,t
/ \
CAR TRUCK
(num_doors, (payload_cap,
trunk_vol) num_axles,
cab_type)
Design Challenge¶
Exercise 4.10: A local library needs a database system. Design the complete conceptual schema (ER/EER diagram) and map it to a relational schema. The requirements are:
- The library has multiple branches, each with a name, address, and phone
- Books are identified by ISBN and have a title, publication year, and edition
- Each book has one or more authors
- Books belong to one or more categories (Fiction, Science, History, etc.)
- Each branch maintains multiple copies of each book. Each copy has a copy number and condition status
- Members have a card number, name, address, and phone. Members register at a specific branch
- Members can borrow copies. Each borrowing records the borrow date, due date, and return date
- Members can reserve books at a branch (not a specific copy)
- Employees work at branches. There are librarians and assistants (disjoint specialization)
- Each branch has one librarian who serves as the branch manager
Your deliverables: - Complete ER/EER diagram with all constraints - Relational schema (apply all 7 steps) - SQL DDL for at least 5 key tables - Three sample queries that demonstrate the design supports the library's needs
Previous: Relational Algebra | Next: Functional Dependencies