Virtual Memory
Virtual Memory¶
Overview¶
Virtual memory is a core technology that provides programs with a contiguous and independent address space, and enables running programs larger than physical memory. In this lesson, we will learn about virtual memory concepts, page tables, TLB, and page fault handling.
Difficulty: ββββ
Prerequisites: Memory Hierarchy, Cache Memory
Table of Contents¶
- The Need for Virtual Memory
- Address Spaces
- Pages and Page Frames
- Page Tables
- TLB (Translation Lookaside Buffer)
- Page Faults and Page Replacement
- Page Replacement Algorithms
- Advanced Topics
- Practice Problems
1. The Need for Virtual Memory¶
1.1 Problems Before Virtual Memory¶
Problem 1: Memory Capacity Limitation
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Program A: Needs 2GB β
β Physical Memory: 1GB β
β β
β β Cannot run Program A! β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Problem 2: Memory Fragmentation
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Physical memory state: β
β ββββββββ¬βββββββ¬βββββββ¬βββββββ¬βββββββ¬βββββββ¬βββββββ β
β β Used β Gap β Used β Gap β Used β Gap β Used β β
β β 100M β 50M β 200M β 30M β 100M β 70M β 150M β β
β ββββββββ΄βββββββ΄βββββββ΄βββββββ΄βββββββ΄βββββββ΄βββββββ β
β β
β Total free space: 150MB, but no 100MB contiguous space β
β β Cannot load 100MB program (external fragmentation) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Problem 3: Lack of Memory Protection
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Program A can access Program B's memory region β
β β Security vulnerability, system instability β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Problem 4: Relocation Problem
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Program compiled to run at specific address β
β β Cannot run if that address is in use β
β β Each program needs recompilation for different address β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
1.2 Virtual Memory Solutions¶
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Virtual Memory System β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β Program A Program B Program C β
β βββββββββββ βββββββββββ βββββββββββ β
β β Virtual β β Virtual β β Virtual β β
β β Address β β Address β β Address β β
β β Space β β Space β β Space β β
β β 0~4GB β β 0~4GB β β 0~4GB β β
β ββββββ¬βββββ ββββββ¬βββββ ββββββ¬βββββ β
β β β β β
β ββββββββββββββββββΌβββββββββββββββββ β
β β β
β βββββββββββ΄ββββββββββ β
β β MMU (Memory β β
β β Management Unit) β β
β β VirtualβPhysical β β
β β Translation β β
β βββββββββββ¬ββββββββββ β
β β β
β βββββββββββ΄ββββββββββ β
β β Physical Memory β β
β β (1GB, etc.) β β
β βββββββββββ¬ββββββββββ β
β β β
β βββββββββββ΄ββββββββββ β
β β Disk (Swap) β β
β β (hundreds GB) β β
β βββββββββββββββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Advantages of Virtual Memory:
1. Capacity Extension: Use disk as memory extension
2. Memory Protection: Each process has independent address space
3. Memory Sharing: Multiple processes can share same physical pages
4. Fragmentation Solved: Non-contiguous physical memory appears contiguous
5. Relocation Transparency: All programs can start at same virtual address
2. Address Spaces¶
2.1 Virtual Address vs Physical Address¶
Virtual Address:
- Address used by programs
- Address generated by CPU
- Independent space for each process
Physical Address:
- Actual memory hardware address
- Address accessing DRAM chips
- Unique across entire system
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β Process A Process B Physical Mem β
β Virtual Addr Space Virtual Addr Space β
β β
β 0x00000000 0x00000000 0x00000000 β
β βββββββββββββ βββββββββββββ βββββββββββββ β
β β Code βββββββΆβ Code ββββ β Frame 0 β β
β βββββββββββββ€ βββββββββββββ€ β βββββββββββββ€ β
β β Data βββββ β Data ββββΌβββΆβ Frame 1 β β
β βββββββββββββ€ β βββββββββββββ€ β βββββββββββββ€ β
β β Heap β βββΆβ Heap ββββΌβββΆβ Frame 2 β β
β βββββββββββββ€ βββββββββββββ€ β βββββββββββββ€ β
β β β β β β β ββββΆβ Frame 3 β β
β β β β β βββββββββββββ€ β
β β β β β β β β Frame 4 β β
β βββββββββββββ€ βββββββββββββ€ βββββββββββββ€ β
β β Stack ββββββββ Stack βββββββΆβ Frame 5 β β
β βββββββββββββ βββββββββββββ βββββββββββββ β
β 0xFFFFFFFF 0xFFFFFFFF (actual size) β
β β
β Same virtual address 0x1000 can map to different physicalβ
β addresses β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
2.2 Address Space Size¶
32-bit System:
- Virtual address space: 2^32 = 4GB
- Physical memory: Actual installed capacity (e.g., 4GB)
64-bit System:
- Theoretical virtual space: 2^64 = 16 exabytes
- Actual implementation (x86-64):
- 48 bits used: 256TB virtual space
- Physical address: Up to 52 bits supported (4PB)
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β x86-64 Virtual Address Space Layout (48-bit) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β 0xFFFF_FFFF_FFFF_FFFF βββββββββββββββββββββ β
β β Kernel Space β β
β β (upper half) β β
β 0xFFFF_8000_0000_0000 βββββββββββββββββββββ€ β
β β β β
β β Non-Canonical β β Not usable β
β β (hole) β β
β β β β
β 0x0000_7FFF_FFFF_FFFF βββββββββββββββββββββ€ β
β β User Space β β
β β (lower half) β β
β 0x0000_0000_0000_0000 βββββββββββββββββββββ β
β β
β 128TB available for each half β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
3. Pages and Page Frames¶
3.1 Paging Concept¶
Page:
- Fixed-size unit dividing virtual address space
- Typically 4KB (4096 bytes)
- Management unit of virtual memory
Page Frame (Frame):
- Same-sized unit dividing physical memory
- Same size as page
- Management unit of physical memory
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β Virtual Address Space Physical Memory β
β β
β βββββββββββββββ βββββββββββββββ β
β β Page 0 ββββββββββββΆβ Frame 3 β β
β βββββββββββββββ€ βββββββββββββββ€ β
β β Page 1 ββββββββ β Frame 0 β β
β βββββββββββββββ€ β βββββββββββββββ€ β
β β Page 2 β βββββΆβ Frame 1 β β
β βββββββββββββββ€ βββββββββββββββ€ β
β β Page 3 ββββββββββββΆβ Frame 4 β β
β βββββββββββββββ€ βββββββββββββββ€ β
β β Page 4 ββ (Disk) β Frame 2 βββOther processβ
β βββββββββββββββ€ βββββββββββββββ€ β
β β Page 5 ββββββββββββΆβ Frame 5 β β
β βββββββββββββββ βββββββββββββββ β
β β
β - Page and Frame same size (4KB) β
β - Mapping is arbitrary (non-contiguous possible) β
β - Some pages may be on disk β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
3.2 Page Size¶
Common page sizes:
- 4KB (2^12): Most common
- 2MB (2^21): Large Page (Huge Page)
- 1GB (2^30): Gigantic Page
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Page Size Trade-offs β
ββββββββββββββββββ¬ββββββββββββββββββββ¬βββββββββββββββββββββββββ€
β β Small Page (4KB) β Large Page (2MB) β
ββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββββββββββββββ€
β Internal Frag. β Low β High β
ββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββββββββββββββ€
β Page Table β Large β Small β
ββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββββββββββββββ€
β TLB Coverage β Small β Large β
ββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββββββββββββββ€
β Disk Transfer β Inefficient β Efficient β
ββββββββββββββββββΌββββββββββββββββββββΌβββββββββββββββββββββββββ€
β Suitable For β General programs β Large memory apps β
ββββββββββββββββββ΄ββββββββββββββββββββ΄βββββββββββββββββββββββββ
Internal fragmentation example:
- Process needs 4097 bytes
- 4KB page: 2 pages needed (8KB), ~4KB wasted
- 2MB page: 1 page needed (2MB), ~2MB wasted
3.3 Virtual Address Decomposition¶
4KB page, 32-bit virtual address:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β 31 12 11 0 β
β ββββββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββ€β
β β Virtual Page Number (VPN) β Page Offset ββ
β β (20 bits) β (12 bits) ββ
β ββββββββββββββββββββββββββββββββββ΄ββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Page Offset: 12 bits = Position within 4KB page (0-4095)
VPN: 20 bits = 2^20 = ~1 million virtual pages
Example: Virtual address 0x12345678
- VPN = 0x12345 (upper 20 bits)
- Offset = 0x678 (lower 12 bits)
Physical address after translation:
- Page table lookup: VPN 0x12345 β PFN 0x00ABC
- Physical address = 0x00ABC000 + 0x678 = 0x00ABC678
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β Virtual addr: 0x12345 β 0x678 β
β (VPN) β (Offset) β
β β β
β [Page Table] β
β β β
β Physical addr: 0x00ABC β 0x678 β
β (PFN) β (Offset) β Copied directly β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
4. Page Tables¶
4.1 Single-Level Page Table¶
Structure:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Page Table β
ββββββββ¬ββββββββ¬ββββββββ¬ββββββββ¬βββββββββββββββββββββββββββββββ€
βIndex β Valid β PFN β Prot β Other Flags β
ββββββββΌββββββββΌββββββββΌββββββββΌβββββββββββββββββββββββββββββββ€
β 0 β 1 β 0x003 β RW β Present, Dirty β
β 1 β 1 β 0x001 β R β Present β
β 2 β 0 β - β - β Not Present (Disk) β
β 3 β 1 β 0x004 β RW β Present β
β 4 β 1 β 0x002 β RWX β Present β
β ... β ... β ... β ... β ... β
ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄βββββββββββββββββββββββββββββββ
Page Table Entry (PTE) Structure (x86, 32-bit):
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β 31 12 11 9 8 7 6 5 4 3 2 1 0β
β ββββββββββββββββββΌβββββββββΌβββββΌββββΌββββΌββββΌββββΌββββΌβββΌβββΌβββ€β
β β Page Frame βReservedβ G βPATβ D β A βPCDβPWTβU βW βP ββ
β β Number β β β β β β β β/Sβ/Rβ ββ
β ββββββββββββββββββ΄βββββββββ΄βββββ΄ββββ΄ββββ΄ββββ΄ββββ΄ββββ΄βββ΄βββ΄ββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
P (Present): Page is in physical memory
W/R: Writable (1) / Read-only (0)
U/S: User accessible (1) / Kernel only (0)
A (Accessed): Has been accessed (read or write)
D (Dirty): Has been modified (written)
Problem: Memory waste
- 32-bit address, 4KB page: 2^20 = 1M entries
- 4 bytes per entry: 4MB page table per process
- Full table needed even if most space unused
4.2 Multi-Level Page Table¶
2-Level Page Table (32-bit, 4KB page):
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β 31 22 21 12 11 0 β
β ββββββββββββββββββββΌβββββββββββββββββββΌβββββββββββββββββββ€ β
β β Page Dir Index β Page Table Index β Page Offset β β
β β (10 bits) β (10 bits) β (12 bits) β β
β ββββββββββββββββββββ΄βββββββββββββββββββ΄βββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Translation Process:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β Virtual Address: [Dir Index | Table Index | Offset] β
β β β β β
β βΌ β β β
β βββββββββββββββββββββββββββββββ β β β
β β Page Directory (4KB) β β β β
β β 1024 entries Γ 4B β β β β
β β βββββββββββββββββββββββ β β β β
β β β Entry 0 β PT0 β β β β β
β β β Entry 1 β PT1 β β β β β
β β β Entry 2 β null ββββ Unused regions have no tableβ
β β β ... β β β β β
β β β Entry N β PTN ββββββΌββββ β β
β β βββββββββββββββββββββββ β β β
β βββββββββββββββββββββββββββββββ β β
β β β β
β βΌ β β
β βββββββββββββββββββββββββββββββ β β
β β Page Table N (4KB) β β β
β β 1024 entries Γ 4B β β β
β β βββββββββββββββββββββββ β β β
β β β Entry 0 β Frame X β β β β
β β β Entry M β Frame Y ββββββΌβββββββββββββββ β
β β β ... β β β
β β βββββββββββββββββββββββ β β β
β βββββββββββββββββββββββββββββββ β β
β β β β
β βΌ β β
β Physical Frame Y + Offset = Physical Address β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Advantage:
- No page table allocation for unused regions
- Large savings if most virtual space is empty
4.3 4-Level Page Table (x86-64)¶
4-Level Page Table for 64-bit systems:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β 63 48 47 39 38 30 29 21 20 12 11 0 β
β ββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββββββ€ β
β β Sign β PML4 β PDPT β PD β PT β Offset β β
β β Ext β(9bits) β(9bits) β(9bits) β(9bits) β (12bits) β β
β ββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
PML4: Page Map Level 4
PDPT: Page Directory Pointer Table
PD: Page Directory
PT: Page Table
Translation Process:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β CR3 Register β
β β β
β βΌ β
β βββββββββββ βββββββββββ βββββββββββ β
β β PML4 βββββββΆβ PDPT βββββββΆβ PD β β
β β (512E) β β (512E) β β (512E) β β
β βββββββββββ βββββββββββ βββββββββββ β
β β β
β βΌ β
β βββββββββββ β
β β PT β β
β β (512E) β β
β ββββββ¬βββββ β
β β β
β βΌ β
β βββββββββββββββββββ β
β β Physical Frame β β
β β + Offset β β
β βββββββββββββββββββ β
β β
β Each level: 512 entries Γ 8 bytes = 4KB β
β Maximum 4 memory accesses (on TLB miss) β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
4.4 Inverted Page Table¶
Traditional Page Table:
- Entries for all virtual pages
- Separate table per process
Inverted Page Table:
- Entries for physical frames only
- One table for entire system
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Inverted Page Table β
βββββββββ¬βββββββββ¬βββββββββββββββ¬ββββββββββββββββββββββββββββββ€
β Frame β PID β VPN β Chain β
βββββββββΌβββββββββΌβββββββββββββββΌββββββββββββββββββββββββββββββ€
β 0 β 42 β 0x12345 β null β
β 1 β 17 β 0x00ABC β β 5 β
β 2 β 42 β 0x67890 β null β
β 3 β 25 β 0x11111 β null β
β 4 β 17 β 0x00DEF β null β
β 5 β 33 β 0x00ABC β null (hash chain) β
β ... β ... β ... β ... β
βββββββββ΄βββββββββ΄βββββββββββββββ΄ββββββββββββββββββββββββββββββ
Search: Hash (PID, VPN) and search table
- Follow chain on hash collision
- Advantage: Memory savings (proportional to physical memory)
- Disadvantage: Search time (hash + chain)
5. TLB (Translation Lookaside Buffer)¶
5.1 Need for TLB¶
Page table access overhead:
- 4-level page table: 4 additional memory accesses
- 4 table lookups before actual data access
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Memory access without TLB: β
β β
β 1. Access PML4 (memory) β
β 2. Access PDPT (memory) β
β 3. Access PD (memory) β
β 4. Access PT (memory) β
β 5. Access actual data (memory) β
β β
β Total 5 memory accesses! (400+ cycles) β
β β
β With TLB hit: β
β 1. TLB lookup (~1 cycle) β
β 2. Access actual data (memory) β
β β
β Total ~100 cycles (TLB lookup + memory) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
5.2 TLB Structure¶
TLB (Translation Lookaside Buffer):
- Caches recent translation results
- Fully associative or set associative structure
- Very small (32-1024 entries)
- Very fast (~1 cycle)
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β TLB β
ββββββββ¬ββββββββ¬βββββββββββ¬βββββββββββ¬βββββββββββββββββββββββββ€
β ASID β Valid β VPN β PFN β Permissions β
ββββββββΌββββββββΌβββββββββββΌβββββββββββΌβββββββββββββββββββββββββ€
β 42 β 1 β 0x12345 β 0x00ABC β R/W, User β
β 42 β 1 β 0x12346 β 0x00ABD β R/W, User β
β 17 β 1 β 0x00100 β 0x00300 β R, User β
β 42 β 1 β 0x7FFFF β 0x00123 β R/W, User β
β 17 β 1 β 0x00200 β 0x00400 β R/W/X, User β
β ... β ... β ... β ... β ... β
ββββββββ΄ββββββββ΄βββββββββββ΄βββββββββββ΄βββββββββββββββββββββββββ
ASID (Address Space ID):
- Distinguishes processes
- Avoids TLB flush on context switch
5.3 TLB Operation¶
Address Translation Process:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β CPU generates virtual address β
β β β
β βΌ β
β βββββββββββββββββββ β
β β TLB Lookup β β Done in parallel (VPN extract & TLB) β
β ββββββββββ¬βββββββββ β
β β β
β βββββββ΄ββββββ β
β β β β
β Hit? Miss β
β β β β
β βΌ βΌ β
β ββββββββ βββββββββββββββββββββββββββββββββββββββ β
β β PFN β β Page Table Walk β β
β βobtainβ β (done by HW or SW) β β
β ββββ¬ββββ β 1. Access PML4 β β
β β β 2. Access PDPT β β
β β β 3. Access PD β β
β β β 4. Access PT β β
β β β 5. Store result in TLB β β
β β βββββββββββββββββ¬ββββββββββββββββββββββ β
β β β β
β βββββββββββββ¬ββββββββββββ β
β β β
β βΌ β
β Physical address = PFN + Offset β
β β β
β βΌ β
β Cache/Memory access β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
5.4 TLB Performance¶
TLB Hit Rate and AMAT:
Typical TLB hit rate: 99% or higher
AMAT Calculation:
AMAT = TLB_Hit_Rate Γ (TLB_Time + Mem_Time) +
TLB_Miss_Rate Γ (TLB_Time + Walk_Time + Mem_Time)
Example:
- TLB access: 1 cycle
- Page walk: 100 cycles (4 memory accesses)
- Memory access: 100 cycles
- TLB hit rate: 99%
TLB hit: 1 + 100 = 101 cycles
TLB miss: 1 + 100 + 100 = 201 cycles
AMAT = 0.99 Γ 101 + 0.01 Γ 201
= 99.99 + 2.01
= 102 cycles
Without TLB: 500 cycles (5 memory accesses)
β TLB provides ~5x performance improvement
TLB Coverage:
- 64-entry TLB, 4KB pages: 256KB
- 64-entry TLB, 2MB pages: 128MB
- Large pages increase TLB efficiency
6. Page Faults and Page Replacement¶
6.1 Page Fault¶
Definition: Situation where accessed page is not in physical memory
Page Fault Causes:
1. Page is on disk (swapped out)
2. Accessing unallocated address
3. Permission violation (write to read-only page, etc.)
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Page Fault Handling Process β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β 1. CPU accesses virtual address β
β β β
β βΌ β
β 2. MMU checks PTE: Present = 0 β
β β β
β βΌ β
β 3. Page Fault Exception raised β
β β β
β βΌ β
β 4. OS Page Fault Handler executes β
β β β
β βββββββ΄ββββββββββββββββββββββββββββββββββ β
β β β β
β βΌ βΌ β
β Valid access? Invalid access β
β β β β
β βΌ βΌ β
β 5. Find free frame Raise SIGSEGV β
β (replace page if none) (terminate proc) β
β β β
β βΌ β
β 6. Read page from disk (~10ms) β
β β β
β βΌ β
β 7. Update page table (Present = 1) β
β β β
β βΌ β
β 8. Re-execute instruction β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
6.2 Page Fault Cost¶
Page Fault Time:
- Disk read: ~10ms (HDD) or ~0.1ms (SSD)
- CPU clock: 3GHz = 0.33ns/cycle
Cycle count:
- HDD: 10ms / 0.33ns = 30,000,000 cycles
- SSD: 0.1ms / 0.33ns = 300,000 cycles
Normal memory access: ~100 cycles
Performance Impact Calculation:
For page fault rate p:
EAT = (1-p) Γ 100ns + p Γ 10ms
= 100ns + p Γ 10,000,000ns
To limit performance degradation to 10%:
110ns β₯ 100ns + p Γ 10,000,000ns
p β€ 0.000001 (one in a million)
β Page fault rate must be very low!
6.3 Demand Paging vs Prepaging¶
Demand Paging:
- Load pages only when needed
- Low memory usage initially
- Page fault on first access
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β At process start: β
β - Code pages not loaded β
β - Page fault on first instruction β
β - Only needed pages loaded β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Prepaging:
- Preload related pages
- Reduced initial page faults
- May load unused pages
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β At process start: β
β - Predict and preload Working Set β
β - Preload all or part of code region β
β - Fewer initial page faults β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
7. Page Replacement Algorithms¶
7.1 FIFO (First-In, First-Out)¶
Principle: Replace the page that entered first
Example: 3 frames, access order 1,2,3,4,1,2,5,1,2,3,4,5
ββββββ¬ββββββββββ¬ββββββββββ¬ββββββββββ¬ββββββββ
βRef β Frame 0 β Frame 1 β Frame 2 βResult β
ββββββΌββββββββββΌββββββββββΌββββββββββΌββββββββ€
β 1 β 1 β - β - β Miss β
β 2 β 1 β 2 β - β Miss β
β 3 β 1 β 2 β 3 β Miss β
β 4 β 4 β 2 β 3 β Miss β β 1 replaced
β 1 β 4 β 1 β 3 β Miss β β 2 replaced
β 2 β 4 β 1 β 2 β Miss β β 3 replaced
β 5 β 5 β 1 β 2 β Miss β β 4 replaced
β 1 β 5 β 1 β 2 β Hit β
β 2 β 5 β 1 β 2 β Hit β
β 3 β 3 β 1 β 2 β Miss β β 5 replaced
β 4 β 3 β 4 β 2 β Miss β β 1 replaced
β 5 β 3 β 4 β 5 β Miss β β 2 replaced
ββββββ΄ββββββββββ΄ββββββββββ΄ββββββββββ΄ββββββββ
Total misses: 10, Hits: 2
Belady's Anomaly:
- In FIFO, miss rate can increase with more frames
- Counter-intuitive phenomenon
7.2 Optimal (OPT)¶
Principle: Replace the page that won't be used for longest time
Example: 3 frames, access order 1,2,3,4,1,2,5,1,2,3,4,5
ββββββ¬ββββββββββ¬ββββββββββ¬ββββββββββ¬ββββββββ¬βββββββββββββββββ
βRef β Frame 0 β Frame 1 β Frame 2 βResult β Future refs β
ββββββΌββββββββββΌββββββββββΌββββββββββΌββββββββΌβββββββββββββββββ€
β 1 β 1 β - β - β Miss β β
β 2 β 1 β 2 β - β Miss β β
β 3 β 1 β 2 β 3 β Miss β β
β 4 β 1 β 2 β 4 β Miss β 3 used latest β
β 1 β 1 β 2 β 4 β Hit β β
β 2 β 1 β 2 β 4 β Hit β β
β 5 β 1 β 2 β 5 β Miss β 4 not used againβ
β 1 β 1 β 2 β 5 β Hit β β
β 2 β 1 β 2 β 5 β Hit β β
β 3 β 1 β 2 β 3 β Miss β 5 used latest β
β 4 β 1 β 4 β 3 β Miss β 2 used latest β
β 5 β 1 β 4 β 5 β Miss β 3 not used againβ
ββββββ΄ββββββββββ΄ββββββββββ΄ββββββββββ΄ββββββββ΄βββββββββββββββββ
Total misses: 8, Hits: 4
Optimal but not implementable (requires future prediction)
β Used only as benchmark
7.3 LRU (Least Recently Used)¶
Principle: Replace the page used longest ago
Example: 3 frames, access order 1,2,3,4,1,2,5,1,2,3,4,5
ββββββ¬ββββββββββ¬ββββββββββ¬ββββββββββ¬ββββββββ¬βββββββββββββββββ
βRef β Frame 0 β Frame 1 β Frame 2 βResult β LRU order β
ββββββΌββββββββββΌββββββββββΌββββββββββΌββββββββΌβββββββββββββββββ€
β 1 β 1 β - β - β Miss β 1 β
β 2 β 1 β 2 β - β Miss β 1,2 β
β 3 β 1 β 2 β 3 β Miss β 1,2,3 β
β 4 β 4 β 2 β 3 β Miss β 2,3,4 (1 repl) β
β 1 β 4 β 1 β 3 β Miss β 3,4,1 (2 repl) β
β 2 β 4 β 1 β 2 β Miss β 4,1,2 (3 repl) β
β 5 β 5 β 1 β 2 β Miss β 1,2,5 (4 repl) β
β 1 β 5 β 1 β 2 β Hit β 2,5,1 β
β 2 β 5 β 1 β 2 β Hit β 5,1,2 β
β 3 β 3 β 1 β 2 β Miss β 1,2,3 (5 repl) β
β 4 β 3 β 4 β 2 β Miss β 2,3,4 (1 repl) β
β 5 β 3 β 4 β 5 β Miss β 3,4,5 (2 repl) β
ββββββ΄ββββββββββ΄ββββββββββ΄ββββββββββ΄ββββββββ΄βββββββββββββββββ
Total misses: 10, Hits: 2
Implementation methods:
1. Timestamp: Record last access time for each page
2. Stack: Move page to stack top on access
3. Approximate LRU: Clock algorithm, etc.
7.4 Clock Algorithm (Second Chance)¶
Principle: FIFO + Reference Bit to approximate LRU
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β Clock Hand β
β β β
β ββββββββββ β
β βPage 0 β β
β β R=0 β β If R=0, replace β
β ββββ΄βββββββββ΄βββ β
β β β β
β βββββ΄ββββ βββββ΄ββββ β
β βPage 3 β βPage 1 β β
β β R=1 β β R=1 β β If R=1, set R=0 and skip β
β βββββ¬ββββ βββββ¬ββββ β
β β β β
β ββββ¬βββββββββ¬βββ β
β βPage 2 β β
β β R=1 β β
β ββββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Algorithm:
1. Check page at clock hand
2. If R=0, replace that page
3. If R=1, set R=0 and move to next page
4. If all pages have R=1, full circle resets all and replaces R=0 page
Enhanced Clock (NRU):
- Add Modified(M) bit
- Priority: (R=0,M=0) > (R=0,M=1) > (R=1,M=0) > (R=1,M=1)
- Modified pages require disk write, so lower priority
7.5 Algorithm Comparison¶
ββββββββββββββββββ¬βββββββββββββββββββ¬ββββββββββββββββββββββββββ
β Algorithm β Advantages β Disadvantages β
ββββββββββββββββββΌβββββββββββββββββββΌββββββββββββββββββββββββββ€
β FIFO β Simple impl. β Belady's Anomaly β
β β β Unstable performance β
ββββββββββββββββββΌβββββββββββββββββββΌββββββββββββββββββββββββββ€
β Optimal β Optimal perf. β Not implementable β
β β Benchmark β (requires prediction) β
ββββββββββββββββββΌβββββββββββββββββββΌββββββββββββββββββββββββββ€
β LRU β Good performance β High impl. cost β
β β Stack property β Hardware support needed β
ββββββββββββββββββΌβββββββββββββββββββΌββββββββββββββββββββββββββ€
β Clock β Efficient approx.β Slightly lower than LRU β
β β Easy impl. β β
ββββββββββββββββββΌβββββββββββββββββββΌββββββββββββββββββββββββββ€
β Working Set β Prevents thrash β Hard to set parameters β
β β Adaptive β β
ββββββββββββββββββ΄βββββββββββββββββββ΄ββββββββββββββββββββββββββ
8. Advanced Topics¶
8.1 Copy-on-Write (COW)¶
Principle: Share pages on fork() instead of copying
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Copy-on-Write β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β Right after fork(): β
β Parent Child β
β βββββββββββ βββββββββββ β
β β Page 0 ββββββββ¬βββββ Page 0 β β
β β (R/O) β β β (R/O) β β
β βββββββββββ€ β βββββββββββ€ β
β β Page 1 ββββ β ββββ Page 1 β β
β β (R/O) β β β β β (R/O) β β
β βββββββββββ β β β βββββββββββ β
β β β β β
β β βΌ β β
β β βββββββ β
β β βFrameβ Physical Memory β
β β β A β (shared) β
β β βββββββ β
β β β² β
β βββββ β
β β
β When Child writes to Page 1: β
β βββββββββββ βββββββββββ β
β β Page 1 β β Page 1 β β
β β (R/W) β β (R/W) β β
β ββββββ¬βββββ ββββββ¬βββββ β
β β β β
β βΌ βΌ β
β βββββββββββ βββββββββββ β
β β Frame A β β Frame B β β Copy created β
β β(original)β β(modified)β β
β βββββββββββ βββββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Advantages:
- fork() is fast (copy deferred)
- Memory savings (stay shared if read-only)
- Efficient when exec() discards all pages
8.2 Memory Mapped Files (mmap)¶
Principle: Map file directly to virtual address space
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β Process Virtual Address Space Disk File β
β β
β 0x40000000 ββββββββββββββ ββββββββββββββ β
β β β β β β
β β Mapped ββββββΆβ File β β
β β Region β β Content β β
β β β β β β
β 0x40100000 ββββββββββββββ ββββββββββββββ β
β β
β Access method: β
β - Access file content like memory (using pointers) β
β - Auto-load from file on page fault β
β - Auto-write to file on modification (or explicit msync) β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Uses:
- Large file processing
- Shared memory between processes
- Dynamic library loading
8.3 Huge Pages¶
Normal Pages vs Huge Pages:
Normal Page (4KB):
- 2MB memory = 512 pages
- 512 TLB entries needed (TLB pressure)
- Large page table overhead
Huge Page (2MB):
- 2MB memory = 1 page
- Only 1 TLB entry needed
- Increased TLB coverage
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β TLB Coverage Comparison (64 TLB entries): β
β β
β 4KB pages: 64 Γ 4KB = 256KB β
β 2MB pages: 64 Γ 2MB = 128MB β
β 1GB pages: 64 Γ 1GB = 64GB β
β β
β Big benefit for databases, virtualization, large memory β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Using in Linux:
# System configuration
echo 1024 > /proc/sys/vm/nr_hugepages
# In program
mmap(..., MAP_HUGETLB, ...);
9. Practice Problems¶
Basic Problems¶
-
What are 3 main advantages of virtual memory?
-
For a 32-bit system with 4KB pages, what is the VPN and Offset of virtual address 0x12345ABC?
-
Explain the role and necessity of TLB.
Intermediate Problems¶
-
Calculate the miss count for FIFO and LRU algorithms with the following access order: (3 frames, access: 7,0,1,2,0,3,0,4,2,3)
-
How many memory accesses occur on TLB miss with a 2-level page table?
-
Explain the operation principle and advantages of Copy-on-Write.
Advanced Problems¶
-
Why do 64-bit systems use 4-level page tables?
-
Calculate the Effective Access Time for the following scenario:
- TLB hit time: 10ns
- TLB hit rate: 98%
- Page table walk: 200ns (4 memory accesses)
- Memory access: 100ns
- Page fault rate: 0.001%
-
Page fault handling: 10ms
-
Explain how the Working Set model prevents thrashing.
Answers
1. Virtual Memory Advantages: - Run programs larger than physical memory - Memory protection (process isolation) - Memory sharing (libraries, etc.) - Fragmentation solved - Relocation transparency 2. Address decomposition (0x12345ABC): - VPN = 0x12345 (upper 20 bits) - Offset = 0xABC (lower 12 bits) 3. TLB Role: - Caches virtual-to-physical address translation results - Reduces page table access overhead - Necessity: 4-level table = 4 additional memory accesses reduced to 1 cycle 4. Miss count: FIFO: 7,0,1,2,0,3,0,4,2,3 Actual calculation: [7,-,-] β [7,0,-] β [7,0,1] β [2,0,1] β Hit β [2,3,1] β [2,3,0] β [4,3,0] β [4,2,0] β [4,2,3] FIFO: 9 Miss, 1 Hit LRU: [7,-,-] β [7,0,-] β [7,0,1] β [2,0,1] β Hit(0) β [2,0,3] β Hit(0) β [4,0,3] β [4,0,2] β [4,3,2] LRU: 8 Miss, 2 Hit 5. 2-level page table TLB miss: - Page Directory access: 1 - Page Table access: 1 - Actual data access: 1 - Total 3 memory accesses 6. Copy-on-Write: - Principle: On fork(), only copy page tables, share actual pages; copy on write - Advantages: fork() is fast, memory savings, efficient for exec() 7. 4-level page table reason: - 64-bit address space is too large - Single level: 2^52 / 4KB Γ 8B = several PB table - Multi-level: Only allocate tables for used regions, saving memory 8. EAT Calculation: TLB hit: 10ns + 100ns = 110ns TLB miss: 10ns + 200ns + 100ns = 310ns Page fault: 10ms = 10,000,000ns EAT = 0.98 Γ 0.99999 Γ 110 + 0.02 Γ 0.99999 Γ 310 + 0.00001 Γ 10,000,000 = 107.78 + 6.20 + 100 β 214ns 9. Working Set Model: - Tracks Working Set size of each process - If physical memory < sum of all process Working Sets - Swap out some processes to give others enough memory - Each process guaranteed frames for its Working Set, preventing thrashingNext Steps¶
- 17_IO_Systems.md - I/O systems, interrupts, DMA
References¶
- Operating System Concepts (Silberschatz et al.)
- Computer Architecture: A Quantitative Approach (Hennessy & Patterson)
- Intel Software Developer's Manual - Paging
- Linux Kernel Documentation - Memory Management