I/O Systems¶
Overview¶
Input/Output (I/O) systems handle data transfer between the CPU and external devices (keyboard, disk, network, etc.). The design of the I/O system significantly impacts overall system performance, and transfers are carried out using one of several methods: polling, interrupts, or DMA. This lesson covers the structure and operating principles of I/O systems.
Difficulty: ⭐⭐⭐
Prerequisites: CPU architecture, memory systems
Table of Contents¶
- I/O System Overview
- Programmed I/O (Polling)
- Interrupt-Driven I/O
- DMA (Direct Memory Access)
- Bus Architecture
- I/O Interfaces
- Modern I/O Systems
- Practice Problems
1. I/O System Overview¶
1.1 Diversity of I/O Devices¶
I/O Device Classification:

  Input Devices     Output Devices    Storage Devices
  ┌────────────┐    ┌────────────┐    ┌────────────┐
  │ Keyboard   │    │ Monitor    │    │ HDD/SSD    │
  │ Mouse      │    │ Printer    │    │ USB        │
  │ Scanner    │    │ Speaker    │    │ SD Card    │
  │ Microphone │    │ LED        │    │ Optical    │
  │ Touch      │    │            │    │ Drive      │
  └────────────┘    └────────────┘    └────────────┘

  Communication Devices
  ┌────────────────────────────────────────────────┐
  │ Network Card (NIC), WiFi, Bluetooth, USB Hub   │
  └────────────────────────────────────────────────┘
I/O Device Characteristics:
┌──────────────────┬────────────┬────────────────────────────┐
│ Device           │ Data Rate  │ Characteristics            │
├──────────────────┼────────────┼────────────────────────────┤
│ Keyboard         │ ~100 B/s   │ Slow, async, character     │
│ Mouse            │ ~1 KB/s    │ Slow, async, event-based   │
│ Gigabit Ethernet │ 125 MB/s   │ High-speed, packet-based   │
│ SATA SSD         │ ~600 MB/s  │ High-speed, block-based    │
│ NVMe SSD         │ ~7 GB/s    │ Ultra-fast, parallel       │
│ 4K Display 60 Hz │ ~1.5 GB/s  │ Uncompressed, streaming    │
└──────────────────┴────────────┴────────────────────────────┘
The display figure assumes 3840 × 2160 pixels, 3 bytes per pixel, and 60 frames per second.
1.2 I/O System Structure¶
I/O System Layers (requests flow from top to bottom):

  Application
      read(), write(), printf()
          │
          ▼
  Operating System I/O Subsystem
      Buffering, Caching, Spooling, Scheduling
          │
          ▼
  Device Driver
      Device-specific control code, interrupt handling
          │
          ▼
  Hardware Controller
      I/O ports, registers, bus interface
          │
          ▼
  I/O Device
      Physical device (disk, keyboard, etc.)
1.3 I/O Control Method Comparison¶
┌──────────────────┬───────────────────┬───────────────┬───────────────┐
│ Characteristic   │ Programmed I/O    │ Interrupt I/O │ DMA           │
│                  │ (Polling)         │               │               │
├──────────────────┼───────────────────┼───────────────┼───────────────┤
│ CPU Involvement  │ High              │ Medium        │ Low           │
│ CPU Efficiency   │ Low               │ High          │ Very High     │
│ Implementation   │ Low               │ Medium        │ High          │
│ Complexity       │                   │               │               │
│ Suitable Devices │ Fast, predictable │ Slow, async   │ Bulk transfer │
│ Data Transfer    │ Via CPU           │ Via CPU       │ Direct memory │
└──────────────────┴───────────────────┴───────────────┴───────────────┘
2. Programmed I/O (Polling)¶
2.1 Polling Concept¶
Definition: The CPU repeatedly checks the I/O device's status until the device is ready, then performs the data transfer itself.

Polling Operation:

  CPU ──▶ Device   1. Check status (ready?)
  CPU ◀── Device   2. "Not ready"
  CPU ──▶ Device   3. Check status (again)
  CPU ◀── Device   4. "Not ready"
                   ...repeat...
  CPU ──▶ Device   N. Check status
  CPU ◀── Device   N+1. "Ready"
  CPU ◀── Device   N+2. Read data

Problem: the CPU does no useful work while it waits (busy waiting).
2.2 I/O Ports and Registers¶
I/O Device Controller Registers:

Status Register (read-only; reports device status)
┌──────┬───────┬───────┬─────┬─────┬─────┬─────┬─────┐
│ Busy │ Ready │ Error │ IRQ │ ... │ ... │ ... │ ... │
└──────┴───────┴───────┴─────┴─────┴─────┴─────┴─────┘

Control Register (written by the CPU to issue device control commands; IE = Interrupt Enable)
┌───────┬────┬──────┬─────┬─────┬─────┬─────┬─────┐
│ Start │ IE │ Mode │ Dir │ ... │ ... │ ... │ ... │
└───────┴────┴──────┴─────┴─────┴─────┴─────┴─────┘

Data Register (holds the data actually being transferred)
┌─────────────────────┐
│ Data (8/16/32 bits) │
└─────────────────────┘
2.3 Polling Programming Example¶
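// Simple polling-based I/O code (16550-style UART on COM1, x86, GCC/Clang)
#include <stdint.h>

#define COM1_BASE     0x3F8
#define DATA_REG      (COM1_BASE + 0)  // Receive/transmit data register
#define STATUS_REG    (COM1_BASE + 5)  // Line status register
#define RX_READY_BIT  0x01             // Received data is available
#define TX_READY_BIT  0x20             // Transmit holding register is empty

// Port I/O helpers; the argument order is (port, value) throughout this lesson
static inline uint8_t inb(uint16_t port) {
    uint8_t value;
    __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
    return value;
}
static inline void outb(uint16_t port, uint8_t value) {
    __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}

// Character output (polling)
void putchar_polling(char c) {
    // Busy-wait until the transmitter can accept a byte;
    // the CPU keeps looping here and cannot do anything else
    while ((inb(STATUS_REG) & TX_READY_BIT) == 0) {
    }
    // Transfer data
    outb(DATA_REG, (uint8_t)c);
}

// String output
void print_string_polling(const char* str) {
    while (*str) {
        putchar_polling(*str++);
    }
}

// Character input (polling)
char getchar_polling(void) {
    // Busy-wait until input data is available
    while ((inb(STATUS_REG) & RX_READY_BIT) == 0) {
    }
    return (char)inb(DATA_REG);
}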
2.4 Advantages and Disadvantages of Polling¶
Advantages:
- Simple implementation
- Minimal hardware requirements
- Predictable timing
- Efficient for fast devices (when data is ready almost immediately)
- Minimizes jitter in real-time systems
Disadvantages:
- Wastes CPU time (busy waiting)
- Very inefficient for slow devices
- Difficult to handle multiple devices
- Increased power consumption
CPU Time Waste Calculation:
- Serial port: 115,200 bps = 11,520 characters/sec (10 bits per character: start bit + 8 data bits + stop bit)
- Per character: ~87 µs
- 3 GHz CPU: 87 µs ≈ 261,000 cycles
- The CPU busy-waits for roughly 260,000 cycles to transfer one character!
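The arithmetic above can be reproduced with a small helper. This is only a sketch: the function names are made up for illustration, and it assumes 10 bits on the wire per character (8 data bits plus start and stop bits).

```c
#include <stdint.h>

/* Characters per second on a serial line.
 * bits_per_char includes framing bits (10 for 8 data bits + start + stop). */
uint64_t chars_per_second(uint64_t baud, uint64_t bits_per_char) {
    return baud / bits_per_char;
}

/* CPU cycles that elapse while one character is on the wire, i.e. the
 * cycles a polling loop spends busy-waiting for each character. */
uint64_t polling_cycles_per_char(uint64_t cpu_hz, uint64_t baud,
                                 uint64_t bits_per_char) {
    /* time per character = bits_per_char / baud seconds */
    return cpu_hz * bits_per_char / baud;
}
```

With a 3 GHz clock and 115,200 bps this gives 260,416 cycles per character; the 261,000 figure in the text comes from rounding the character time up to 87 µs first.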
3. Interrupt-Driven I/O¶
3.1 Interrupt Concept¶
Definition: The I/O device asynchronously signals the CPU when it needs service, so the CPU does not have to poll.

Interrupt Operation:

  CPU ──▶ Device   1. Issue I/O command
  CPU              2. Perform other tasks (don't wait for I/O completion)
          Device   3. Process I/O (...time passes...)
  CPU ◀── Device   4. Interrupt signal (IRQ)
  CPU              5. Save current state
  CPU              6. Execute interrupt handler
  CPU              7. Transfer data
  CPU              8. Resume original task

Advantage: the CPU can perform other tasks while waiting.
3.2 Interrupt Processing Steps¶
Detailed Interrupt Processing Steps:

1. Device activates IRQ line
   - Signal sent to interrupt controller (PIC/APIC)
2. Interrupt controller sends interrupt request to CPU
   - Check interrupt priority
   - Forward to CPU if not masked
3. CPU checks for interrupts after completing the current instruction
   - Check interrupt flag (IF)
   - Start processing if interrupts are enabled
4. Save CPU state
   - Flags register → stack
   - CS:IP (or RIP) → stack
   - Disable interrupts (prevent nesting)
5. Reference interrupt vector table
   - Look up handler address by interrupt number
6. Execute Interrupt Service Routine (ISR)
   - Execute device-specific handling code
   - Acknowledge the device so that it stops asserting its interrupt request
7. Send EOI (End of Interrupt)
   - Notify interrupt controller that processing is complete
8. Return with IRET instruction
   - Restore saved state
   - Return to original code
3.3 Interrupt Vector Table¶
x86 Interrupt Vector Table (IDT):
┌──────┬────────────────────────────────────────┐
│ Vec  │ Description                            │
├──────┼────────────────────────────────────────┤
│ 0    │ Divide Error (#DE)                     │
│ 1    │ Debug Exception (#DB)                  │
│ 2    │ NMI (Non-Maskable Interrupt)           │
│ 3    │ Breakpoint (#BP)                       │
│ 6    │ Invalid Opcode (#UD)                   │
│ 8    │ Double Fault (#DF)                     │
│ 13   │ General Protection Fault (#GP)         │
│ 14   │ Page Fault (#PF)                       │
│ ...  │ ...                                    │
│ 32   │ IRQ 0: Timer (PIT)                     │
│ 33   │ IRQ 1: Keyboard                        │
│ 34   │ IRQ 2: Cascade (PIC2 connection)       │
│ 35   │ IRQ 3: COM2/COM4                       │
│ 36   │ IRQ 4: COM1/COM3                       │
│ ...  │ ...                                    │
│ 46   │ IRQ 14: Primary IDE                    │
│ 47   │ IRQ 15: Secondary IDE                  │
│ ...  │ ...                                    │
│ 128  │ System Call (Linux: int 0x80)          │
│ ...  │ ...                                    │
│ 255  │ Last vector (often APIC spurious)      │
└──────┴────────────────────────────────────────┘
Vectors 0-31 are reserved for CPU exceptions. Vectors 32-255 are assigned by the operating system; the IRQ 0-15 entries above assume the common convention of remapping the legacy PIC to start at vector 32.
IDT Entry Structure (64-bit mode, 16 bytes per gate descriptor):
  Bits 127-96   Reserved
  Bits 95-64    Offset [63:32]
  Bits 63-48    Offset [31:16]
  Bit  47       P (Present)
  Bits 46-45    DPL (Descriptor Privilege Level)
  Bit  44       0
  Bits 43-40    Type (0xE = interrupt gate, 0xF = trap gate)
  Bits 39-35    Reserved (0)
  Bits 34-32    IST (Interrupt Stack Table index)
  Bits 31-16    Segment Selector
  Bits 15-0     Offset [15:0]
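In 64-bit mode a gate descriptor occupies 16 bytes, with the 64-bit handler address split into three pieces (bits 15:0, 31:16, and 63:32). The following sketch shows one way to express that in C; the struct and function names are illustrative choices, not taken from any particular kernel.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit IDT gate descriptor. The field order gives a 16-byte layout
 * with no padding, which the static_assert below verifies. */
struct idt_gate {
    uint16_t offset_low;   /* handler address bits 15:0             */
    uint16_t selector;     /* code segment selector                 */
    uint8_t  ist;          /* bits 2:0 = IST index, upper bits zero */
    uint8_t  type_attr;    /* P (bit 7), DPL (bits 6:5), type (3:0) */
    uint16_t offset_mid;   /* handler address bits 31:16            */
    uint32_t offset_high;  /* handler address bits 63:32            */
    uint32_t reserved;     /* must be zero                          */
};

static_assert(sizeof(struct idt_gate) == 16, "gate must be 16 bytes");

#define IDT_PRESENT             0x80
#define IDT_TYPE_INTERRUPT_GATE 0x0E

/* Build an interrupt gate for the given handler address. */
struct idt_gate make_interrupt_gate(uint64_t handler, uint16_t selector,
                                    uint8_t dpl, uint8_t ist) {
    struct idt_gate g;
    g.offset_low  = (uint16_t)(handler & 0xFFFF);
    g.selector    = selector;
    g.ist         = (uint8_t)(ist & 0x07);
    g.type_attr   = (uint8_t)(IDT_PRESENT | ((dpl & 0x03) << 5) |
                              IDT_TYPE_INTERRUPT_GATE);
    g.offset_mid  = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_high = (uint32_t)(handler >> 32);
    g.reserved    = 0;
    return g;
}

/* Reassemble the handler address from its three pieces. */
uint64_t gate_handler_address(const struct idt_gate *g) {
    return (uint64_t)g->offset_low |
           ((uint64_t)g->offset_mid << 16) |
           ((uint64_t)g->offset_high << 32);
}
```

A present, ring-0 interrupt gate ends up with a `type_attr` of 0x8E, a value that often appears in hobby-kernel code.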
3.4 Interrupt-Based I/O Programming¶
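// Interrupt-based keyboard driver example (x86, legacy PIC, GCC/Clang)
// inb()/outb() are the port I/O helpers used in section 2.3;
// set_interrupt_handler() and enable_irq() are assumed to be provided by the kernel.
#include <stdint.h>

#define KEYBOARD_IRQ   1
#define KEYBOARD_PORT  0x60
#define PIC1_COMMAND   0x20
#define PIC_EOI        0x20
#define BUFFER_SIZE    256

// Keyboard ring buffer: the ISR writes at head, the reader consumes at tail
volatile uint8_t keyboard_buffer[BUFFER_SIZE];
volatile unsigned int buffer_head = 0;
volatile unsigned int buffer_tail = 0;

// Interrupt handler (ISR)
void keyboard_handler(void) {
    // 1. Read scancode
    uint8_t scancode = inb(KEYBOARD_PORT);
    // 2. Store in buffer (drop the scancode if the buffer is full)
    unsigned int next = (buffer_head + 1) % BUFFER_SIZE;
    if (next != buffer_tail) {
        keyboard_buffer[buffer_head] = scancode;
        buffer_head = next;
    }
    // 3. Send EOI (notify the PIC that handling is complete)
    outb(PIC1_COMMAND, PIC_EOI);
}

// Read one scancode (blocking)
uint8_t getchar_interrupt(void) {
    // Wait until data is in the buffer
    // (in practice, use the scheduler's sleep/wakeup instead)
    for (;;) {
        __asm__ volatile ("cli");       // Check the buffer with interrupts off
        if (buffer_tail != buffer_head) {
            break;
        }
        // "sti; hlt" enables interrupts only after hlt has started, so an
        // interrupt cannot slip in between the check and the halt
        __asm__ volatile ("sti; hlt");
    }
    uint8_t c = keyboard_buffer[buffer_tail];
    buffer_tail = (buffer_tail + 1) % BUFFER_SIZE;
    __asm__ volatile ("sti");
    return c;
}

// Register interrupt handler
void init_keyboard(void) {
    // Register handler in IDT (IRQs are remapped to vectors 32-47)
    set_interrupt_handler(32 + KEYBOARD_IRQ, keyboard_handler);
    // Enable interrupt
    enable_irq(KEYBOARD_IRQ);
}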
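The head/tail ring buffer that connects the interrupt handler to the reader can be exercised on an ordinary host, without any hardware. The sketch below is a stand-alone model with hypothetical names (`struct ring`, `ring_put`, `ring_get`); it assumes a single producer and a single consumer and deliberately uses a tiny buffer so that wrap-around is easy to observe.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8  /* small on purpose; one slot always stays unused */

struct ring {
    uint8_t  data[RING_SIZE];
    unsigned head;  /* next slot the producer (the ISR) writes */
    unsigned tail;  /* next slot the consumer reads            */
};

/* Producer side: returns false and drops the byte when the ring is full. */
bool ring_put(struct ring *r, uint8_t byte) {
    unsigned next = (r->head + 1) % RING_SIZE;
    if (next == r->tail) {
        return false;
    }
    r->data[r->head] = byte;
    r->head = next;
    return true;
}

/* Consumer side: returns false when the ring is empty. */
bool ring_get(struct ring *r, uint8_t *out) {
    if (r->tail == r->head) {
        return false;
    }
    *out = r->data[r->tail];
    r->tail = (r->tail + 1) % RING_SIZE;
    return true;
}
```

Keeping one slot empty lets "full" and "empty" be told apart using only the two indices, at the cost of one byte of capacity.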
3.5 Advantages and Disadvantages of Interrupts¶
Advantages:
- Improved CPU efficiency (the CPU performs other tasks while waiting)
- Suitable for asynchronous event processing
- Easy to handle multiple devices
- Power efficient (the CPU can enter a sleep state)
Disadvantages:
- Increased implementation complexity
- Interrupt overhead (context save/restore)
- Interrupt latency exists
- High overhead with frequent interrupts (interrupt storm)
Interrupt Overhead (rough figures):
- State save: ~100 cycles
- Handler entry: ~50 cycles
- Cache/TLB effects: ~100+ cycles
- Total, including the handler body and return: ~500-1000 cycles/interrupt
100,000 interrupts/sec @ 3 GHz:
Overhead = 100,000 × 500 / 3,000,000,000 ≈ 1.7% CPU
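The same estimate is easy to parameterize. The helpers below are a sketch with made-up names; they treat the per-interrupt cost as a fixed cycle count, which is a simplification of real hardware behavior.

```c
/* Fraction of CPU time (0.0 to 1.0) consumed by interrupt handling. */
double interrupt_overhead_fraction(double interrupts_per_sec,
                                   double cycles_per_interrupt,
                                   double cpu_hz) {
    return interrupts_per_sec * cycles_per_interrupt / cpu_hz;
}

/* Highest interrupt rate that stays within a given CPU budget
 * (for example 0.10 for "at most 10% of the CPU"). */
double max_interrupt_rate(double budget_fraction,
                          double cycles_per_interrupt,
                          double cpu_hz) {
    return budget_fraction * cpu_hz / cycles_per_interrupt;
}
```

For instance, at 1000 cycles per interrupt a 3 GHz CPU reaches a 10% budget at about 300,000 interrupts per second, which is why high-rate devices coalesce interrupts or fall back to polling.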
4. DMA (Direct Memory Access)¶
4.1 DMA Concept¶
Definition: The DMA controller transfers data directly between an I/O device and memory; the CPU only sets up the transfer and handles its completion.

DMA vs CPU-based Transfer Comparison:

CPU-based Transfer (Programmed I/O):

    Memory ◀───▶ CPU ◀───▶ I/O Device
            read      write

- CPU involvement for every byte
- CPU handles data movement
- High CPU time consumption

DMA Transfer:

    Memory ◀────────────▶ I/O Device
                 ▲
          DMA Controller ◀──── CPU (setup only)

- CPU handles only transfer setup
- DMA controller performs transfer
- Notify via interrupt when transfer completes
- CPU can perform other tasks
4.2 DMA Operation Process¶
DMA Transfer Process:

1. CPU configures DMA controller
   - Source address (memory or I/O)
   - Destination address
   - Transfer size (bytes)
   - Transfer direction (read/write)
   - Transfer mode (block/cycle stealing)
2. CPU issues DMA start command
3. DMA controller requests bus (Bus Request)
   - Sends HOLD signal to CPU
4. CPU grants bus (Bus Grant)
   - Yields bus control with HLDA signal
   - CPU performs tasks that don't use the bus
5. DMA controller transfers data
   - Memory ◀────▶ I/O device over the bus, under DMA control
6. When transfer completes
   - Return bus
   - Generate interrupt to CPU
7. CPU handles completion
   - Check status
   - Set up next transfer if needed
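The division of labor in this process (the CPU fills in a descriptor, the controller moves the data, and a completion notification follows) can be modeled in plain C. This is a toy, host-side model under stated assumptions: `struct dma_request`, `dma_run`, and `mark_done` are hypothetical names, the copy runs synchronously rather than in parallel with the CPU, and a callback stands in for the completion interrupt.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* What the CPU programs in step 1. */
struct dma_request {
    const uint8_t *src;              /* source address                  */
    uint8_t       *dst;              /* destination address             */
    size_t         count;            /* transfer size in bytes          */
    void (*on_complete)(void *ctx);  /* stands in for the completion IRQ */
    void          *ctx;              /* passed to on_complete           */
};

/* Toy "DMA controller": moves the block, then signals completion.
 * Real hardware would do this while the CPU runs other code. */
void dma_run(const struct dma_request *req) {
    memcpy(req->dst, req->src, req->count);
    if (req->on_complete != NULL) {
        req->on_complete(req->ctx);
    }
}

/* Example completion handler: sets a flag the CPU can check later. */
void mark_done(void *ctx) {
    *(int *)ctx = 1;
}
```

The point of the model is the interface rather than the copy: after filling in the descriptor, the CPU-side code touches the data only once completion has been signaled.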
4.3 DMA Controller Structure¶
DMA Controller:

Control Registers
- Command Register: DMA operation mode settings
- Mode Register: transfer direction, mode (single/block/demand)
- Status Register: completion status, request status

Channel 0
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Address Reg │ │ Count Reg   │ │ Page Reg    │
│ 0x0000      │ │ 1024        │ │ 0x00        │
└─────────────┘ └─────────────┘ └─────────────┘
Channel 1
  ... (same structure)
... Channel 2, 3, ...
4.4 DMA Transfer Modes¶
1. Block Transfer (Block/Burst Mode):

    │ CPU usage │      DMA: exclusive bus usage      │ CPU usage │

- Transfer entire block at once
- Fastest transfer
- CPU may wait a long time for the bus
- Suitable for large data
2. Cycle Stealing:

    │CPU│DMA│CPU│DMA│CPU│DMA│CPU│DMA│CPU│DMA│CPU│

- Transfer one word/byte at a time
- CPU and DMA alternate bus usage
- Minimal CPU impact
- Slower transfer speed
3. Demand Transfer (Demand Mode):
- Transfer whenever the device is ready (based on the DREQ signal)
- Adapts to device speed
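The difference between burst mode and cycle stealing comes down to who owns the bus in each cycle. The sketch below prints that ownership as a string under a deliberately simple model: one bus cycle per word, and in cycle-stealing mode the CPU gets exactly one cycle between consecutive DMA cycles. The enum and function names are illustrative only.

```c
#include <stddef.h>

enum dma_mode { DMA_BURST, DMA_CYCLE_STEALING };

/* Writes the bus owner for each cycle of a transfer into out:
 * 'D' for a DMA cycle, 'C' for a CPU cycle. The result is always
 * NUL-terminated when cap > 0 and truncated if it does not fit.
 * Returns the number of cycles written. */
size_t bus_timeline(enum dma_mode mode, size_t words, char *out, size_t cap) {
    size_t n = 0;
    for (size_t w = 0; w < words; w++) {
        /* In cycle stealing, the CPU gets the bus between DMA cycles. */
        if (mode == DMA_CYCLE_STEALING && w > 0 && n + 1 < cap) {
            out[n++] = 'C';
        }
        if (n + 1 < cap) {
            out[n++] = 'D';
        }
    }
    if (cap > 0) {
        out[n] = '\0';
    }
    return n;
}
```

For four words the burst timeline is `DDDD` (the CPU is locked out for four consecutive cycles), while cycle stealing gives `DCDCDCD`: the transfer takes longer overall, but the CPU never waits more than one cycle.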
4.5 DMA Programming Example¶
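// DMA disk read example (simplified; legacy ISA 8237 DMA controller, channel 2)
// outb() is the port I/O helper used in section 2.3; issue_disk_read_command()
// and signal_dma_complete() are assumed to be provided elsewhere.
#include <stddef.h>
#include <stdint.h>

#define DMA_CHANNEL      2
#define DMA_ADDR_REG     0x04  // Channel 2 address
#define DMA_COUNT_REG    0x05  // Channel 2 count
#define DMA_PAGE_REG     0x81  // Channel 2 page
#define DMA_MODE_REG     0x0B
#define DMA_MASK_REG     0x0A
#define DMA_FLIPFLOP_REG 0x0C

// DMA transfer setup
// Returns 0 on success, -1 if the controller cannot handle the request
int setup_dma_read(void* buffer, size_t count) {
    // Assumes identity-mapped memory: the pointer value is the physical address
    uintptr_t addr = (uintptr_t)buffer;
    // The 8237 reaches only the first 16 MB, moves at most 64 KB per
    // transfer, and cannot cross a 64 KB boundary
    if (count == 0 || count > 0x10000 || addr > 0xFFFFFF ||
        (addr & 0xFFFF) + count > 0x10000) {
        return -1;
    }
    // 1. Mask DMA channel (disable)
    outb(DMA_MASK_REG, DMA_CHANNEL | 0x04);
    // 2. Reset flip-flop
    outb(DMA_FLIPFLOP_REG, 0);
    // 3. Set mode (single mode, device-to-memory, channel 2)
    outb(DMA_MODE_REG, 0x46);
    // 4. Set address (lower 16 bits, low byte first)
    outb(DMA_ADDR_REG, addr & 0xFF);
    outb(DMA_ADDR_REG, (addr >> 8) & 0xFF);
    // 5. Set page (address bits 16-23)
    outb(DMA_PAGE_REG, (addr >> 16) & 0xFF);
    // 6. Set count (count - 1, low byte first)
    outb(DMA_COUNT_REG, (count - 1) & 0xFF);
    outb(DMA_COUNT_REG, ((count - 1) >> 8) & 0xFF);
    // 7. Unmask DMA channel (enable)
    outb(DMA_MASK_REG, DMA_CHANNEL);
    return 0;
}

// Disk read command + DMA
int read_disk_dma(void* buffer, uint32_t sector, uint16_t count) {
    // Set up DMA (512 bytes per sector)
    if (setup_dma_read(buffer, (size_t)count * 512) != 0) {
        return -1;
    }
    // Issue disk read command to disk controller
    issue_disk_read_command(sector, count);
    // CPU can perform other tasks
    // Interrupt occurs when transfer completes
    return 0;
}

// DMA completion interrupt handler
void dma_complete_handler(void) {
    // Handle transfer completion:
    // check status and mark the buffer as available to its consumer
    signal_dma_complete();
}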
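The legacy ISA DMA controller programmed above keeps only a 16-bit address counter per channel (the page register supplies the upper bits and does not increment), so a single transfer cannot cross a 64 KiB physical boundary. One way to deal with a larger or misaligned buffer is to split it into chunks. The helpers below are a sketch with hypothetical names and operate purely on addresses, so they can be checked on any host.

```c
#include <stddef.h>
#include <stdint.h>

#define DMA_BOUNDARY 0x10000u  /* 64 KiB */

/* Largest number of bytes that can be transferred starting at phys_addr
 * without crossing a 64 KiB boundary, capped at `remaining`. */
size_t dma_chunk_size(uint32_t phys_addr, size_t remaining) {
    size_t to_boundary = DMA_BOUNDARY - (phys_addr & (DMA_BOUNDARY - 1));
    return remaining < to_boundary ? remaining : to_boundary;
}

/* Number of separate DMA transfers needed to move `total` bytes. */
size_t dma_chunk_count(uint32_t phys_addr, size_t total) {
    size_t chunks = 0;
    while (total > 0) {
        size_t n = dma_chunk_size(phys_addr, total);
        phys_addr += (uint32_t)n;
        total -= n;
        chunks++;
    }
    return chunks;
}
```

For example, an 8 KiB read into a buffer at physical address 0x1F000 has to be issued as two transfers of 4 KiB each, because the boundary at 0x20000 falls in the middle of the buffer.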
5. Bus Architecture¶
5.1 Types of Buses¶
System Bus Structure (classic two-chip chipset layout):

CPU
└── Front-Side Bus (or QPI/UPI)
    └── Memory Controller Hub (Northbridge)
        ├── Memory Bus ── DRAM
        ├── PCIe x16 ──── GPU
        └── DMI
            └── I/O Controller Hub (Southbridge)
                ├── SATA ───── HDD/SSD
                ├── USB ────── USB Devs
                ├── PCIe x1 ── NIC
                └── Audio ──── Codec

On current processors the memory controller and the x16 PCIe lanes are integrated into the CPU package, and the remaining I/O hub is usually called the PCH (Platform Controller Hub).
5.2 Bus Characteristics¶
Bus Components:

Data Bus
- Data transfer lines
- Width: 8, 16, 32, 64 bits
- Bidirectional
Address Bus
- Memory/I/O addressing
- Width: 20, 32, 36, 40+ bits
- Unidirectional (CPU → device)
Control Bus
- Control signal transfer
- Read, Write, IRQ, DMA request, etc.
- Bidirectional
Major Bus Standards:
┌──────────────────┬────────────────────┬─────────────────┬──────────────────┐
│ Bus              │ Bandwidth          │ Purpose         │ Features         │
├──────────────────┼────────────────────┼─────────────────┼──────────────────┤
│ PCIe 4.0 x16     │ ~32 GB/s/direction │ GPU, high-speed │ Serial, lanes    │
│ PCIe 5.0 x16     │ ~64 GB/s/direction │ Next-gen GPU    │ 2x PCIe 4.0 rate │
│ SATA III         │ ~600 MB/s          │ Storage         │ Serial           │
│ NVMe (PCIe 4 x4) │ ~7 GB/s            │ High-speed SSD  │ Low latency      │
│ USB 3.2 Gen2     │ ~1.25 GB/s         │ Peripherals     │ Universal        │
│ Thunderbolt 4    │ ~5 GB/s            │ High-speed      │ Daisy-chain      │
└──────────────────┴────────────────────┴─────────────────┴──────────────────┘
PCIe links are full duplex; the often-quoted ~64 GB/s (PCIe 4.0 x16) and ~128 GB/s (PCIe 5.0 x16) figures add both directions together.
5.3 Bus Arbitration¶
Bus arbitration prevents collisions when multiple devices share a bus:
1. Centralized Arbitration:

    Devices 0-3 ── REQ ──▶ Bus Arbiter (central arbiter) ── GRANT ──▶ selected device

- Each device has its own REQ line to the arbiter
- The arbiter receives the requests
- The arbiter issues the GRANT signal by priority
2. Priority Schemes:
- Fixed priority: Each device has a fixed priority
- Round robin: Fair allocation in rotation
- Dynamic priority: Adjusted based on usage patterns
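The first two priority schemes are small enough to sketch directly. In the code below (illustrative names, four devices assumed), pending requests are a bitmask with bit `d` set when device `d` asserts REQ, and each function returns the device that receives GRANT, or -1 if nobody is requesting.

```c
#include <stdint.h>

#define NUM_DEVICES 4

/* Fixed priority: the lowest-numbered requesting device always wins. */
int arbitrate_fixed(uint8_t requests) {
    for (int d = 0; d < NUM_DEVICES; d++) {
        if (requests & (1u << d)) {
            return d;
        }
    }
    return -1;
}

/* Round robin: the search starts just after the device that was granted
 * last, so every requester is served within NUM_DEVICES grants.
 * Pass last_granted = -1 before the first grant. */
int arbitrate_round_robin(uint8_t requests, int last_granted) {
    for (int i = 1; i <= NUM_DEVICES; i++) {
        int d = (last_granted + i) % NUM_DEVICES;
        if (requests & (1u << d)) {
            return d;
        }
    }
    return -1;
}
```

With devices 1 and 3 both requesting continuously, fixed priority grants device 1 every time and starves device 3, whereas round robin alternates between the two.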
6. I/O Interfaces¶
6.1 I/O Addressing¶
1. Isolated I/O (Port-Mapped I/O):

  Memory Address Space        I/O Address Space
  ┌──────────────────┐        ┌──────────────────┐
  │ 0x0000_0000      │        │ 0x0000           │
  │                  │        │                  │
  │ Memory           │        │ I/O Ports        │
  │                  │        │                  │
  │ 0xFFFF_FFFF      │        │ 0xFFFF           │
  └──────────────────┘        └──────────────────┘

- Uses separate address space
- Uses IN, OUT instructions
- Used in x86 architecture

Example:
  outb(0x3F8, data);   // Write to port 0x3F8
  data = inb(0x3F8);   // Read from port 0x3F8
2. Memory-Mapped I/O (MMIO):

  Unified Address Space
  ┌──────────────────────────┐
  │ 0x0000_0000              │
  │ System Memory (RAM)      │
  │ 0x7FFF_FFFF              │
  ├──────────────────────────┤
  │ 0x8000_0000              │
  │ I/O Device Registers     │
  │ (accessed like memory)   │
  │ 0xFFFF_FFFF              │
  └──────────────────────────┘

- Access I/O with normal memory (load/store) instructions
- Mainly used in ARM, RISC-V, etc.
- Most devices in modern PCs also use MMIO

Example:
  volatile uint32_t* reg = (volatile uint32_t*)0xFE200000;
  *reg = value;    // Write to I/O register
  value = *reg;    // Read from I/O register
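Because MMIO registers are reached through ordinary pointers, the `volatile` qualifier is what stops the compiler from caching or removing accesses. The accessors below are a minimal sketch with illustrative names; on a host they can be pointed at a plain variable standing in for a device register, which is how the behavior can be checked without hardware. Real drivers typically also need memory barriers and an uncached mapping, which this sketch leaves out.

```c
#include <stdint.h>

/* Every call performs exactly one 32-bit store to the register. */
static inline void mmio_write32(volatile uint32_t *reg, uint32_t value) {
    *reg = value;
}

/* Every call performs exactly one 32-bit load from the register. */
static inline uint32_t mmio_read32(const volatile uint32_t *reg) {
    return *reg;
}

/* Read-modify-write helper: set the bits in mask, keep the others. */
static inline void mmio_set_bits32(volatile uint32_t *reg, uint32_t mask) {
    mmio_write32(reg, mmio_read32(reg) | mask);
}

/* Poll until any bit in mask is set, giving up after max_polls reads.
 * Returns 1 if a bit was seen, 0 on timeout. */
static inline int mmio_wait_bits32(const volatile uint32_t *reg,
                                   uint32_t mask, unsigned max_polls) {
    for (unsigned i = 0; i < max_polls; i++) {
        if (mmio_read32(reg) & mask) {
            return 1;
        }
    }
    return 0;
}
```

Bounding the poll count keeps a missing or hung device from turning a status wait into an infinite loop.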
6.2 Device Driver Structure¶
Device Driver Structure:

Driver Entry Points
  init()              - Driver initialization
  open()              - Open device
  close()             - Close device
  read()              - Read data
  write()             - Write data
  ioctl()             - Control commands
  interrupt_handler() - Interrupt handling

Driver Internal Data
- Device state
- Buffers
- Wait queues
- Configuration info
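Linux device driver example (the file_operations table maps system calls on the device file to driver entry points; .release is invoked when the last reference to the open file is closed):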
static struct file_operations my_fops = {
.owner = THIS_MODULE,
.open = my_open,
.release = my_close,
.read = my_read,
.write = my_write,
.unlocked_ioctl = my_ioctl,
};
7. Modern I/O Systems¶
7.1 NVMe (Non-Volatile Memory Express)¶
NVMe Architecture:

Traditional SATA/AHCI vs NVMe:

SATA/AHCI:
- Single command queue (depth 32)
- Designed for HDD era
- High latency

NVMe:
- Up to 64K queues, up to 64K commands per queue
- Optimized for SSD
- Low latency
- Direct PCIe connection

                   CPU / Driver
          ┌──────────────┼──────────────┐
          ▼              ▼              ▼
     Submit Q 0     Submit Q 1     Submit Q N
          └──────────────┼──────────────┘
                         ▼
                  NVMe Controller
          ┌──────────────┼──────────────┐
          ▼              ▼              ▼
    Complete Q 0   Complete Q 1   Complete Q N
7.2 USB System¶
USB Architecture:

USB Generation Speeds:
┌─────────┬──────────────────────────────────┐
│ USB 2.0 │ 480 Mbps (High Speed)            │
│ USB 3.0 │ 5 Gbps (SuperSpeed)              │
│ USB 3.1 │ 10 Gbps (SuperSpeed+)            │
│ USB 3.2 │ 20 Gbps (SuperSpeed USB 20Gbps)  │
│ USB4    │ 40 Gbps                          │
└─────────┴──────────────────────────────────┘

USB Transfer Types:
┌─────────────┬─────────────────────────────────────┐
│ Control     │ Setup, control (small data)         │
│ Bulk        │ Large data (storage devices)        │
│ Interrupt   │ Small, periodic (keyboard, mouse)   │
│ Isochronous │ Real-time, periodic (audio, video)  │
└─────────────┴─────────────────────────────────────┘

USB Topology:

Host Controller
└── Root Hub
    ├── Hub
    │   ├── Device
    │   └── Device
    ├── Device
    └── Device
7.3 I/O Virtualization¶
I/O Virtualization Techniques:

1. Emulation:
   Guest OS → Hypervisor → Physical Device (I/O trap and emulation)
   - Slow, high compatibility
2. Para-virtualization (virtio):
   Guest OS (virtio driver) → Hypervisor → Device
   - Optimized virtual interface
   - Guest OS modification required (virtio drivers must be installed in the guest)
3. Direct Device Assignment (VFIO):
   Guest OS → Physical Device (direct access; memory isolation via IOMMU)
   - Native performance
   - Device cannot be shared
4. SR-IOV (Single Root I/O Virtualization):
   One physical device is split into multiple virtual functions (VFs); each VM accesses its own independent VF
   - Native performance, and the device can still be shared
8. Practice Problems¶
Basic Problems¶
1. Explain the differences between polling, interrupt, and DMA methods.
2. What is the role of the interrupt vector table?
3. What are the 3 pieces of information that must be configured in a DMA controller?
Intermediate Problems¶
4. Choose the appropriate I/O method for each scenario:
   - (a) Keyboard input processing
   - (b) Reading a 10 MB file from disk
   - (c) High-speed network packet processing (10 Gbps)
5. What is an "Interrupt Storm" in interrupt-based I/O?
6. Compare the advantages and disadvantages of memory-mapped I/O vs port-mapped I/O.
Advanced Problems¶
7. Calculate CPU efficiency under the following conditions:
   - CPU clock: 3 GHz
   - Disk transfer rate: 500 MB/s
   - DMA block size: 4 KB
   - DMA completion interrupt processing time: 1000 cycles
8. Explain USB's 4 transfer types and give device examples suitable for each.
9. Explain how SR-IOV improves I/O performance in virtualized environments.
Answers
1. I/O Method Comparison:
   - Polling: CPU continuously checks status; simple but wastes CPU time
   - Interrupt: Device notifies the CPU on completion; CPU-efficient but has per-interrupt overhead
   - DMA: Direct memory transfer; efficient for bulk data but more complex
2. Interrupt Vector Table:
   - Maps interrupt numbers to handler addresses
   - The CPU jumps to the corresponding handler when an interrupt occurs
3. DMA Configuration Info:
   - Source/destination address
   - Transfer size (bytes)
   - Transfer direction (read/write)
4. Appropriate I/O Methods:
   - (a) Interrupt (slow, asynchronous)
   - (b) DMA (bulk block transfer)
   - (c) DMA + polling or NAPI (high-speed, many packets)
5. Interrupt Storm:
   - Situation where interrupts arrive so often that the CPU does little besides handling them
   - Normal tasks cannot make progress
   - Solution: interrupt coalescing, switching to polling (NAPI)
6. I/O Addressing Comparison:
   - Port-mapped: Separate address space, requires dedicated instructions, saves memory address space
   - Memory-mapped: Unified address space, uses normal instructions, needs care with caching (device registers are normally mapped uncached)
7. CPU Efficiency Calculation:
   - Transfer rate 500 MB/s, block 4 KB (taken as 4,000 bytes) → 125,000 transfers/sec
   - 1000 cycles of interrupt handling per transfer
   - Total interrupt cycles per second: 125,000 × 1000 = 125,000,000
   - CPU efficiency: 1 - (125M / 3G) = 1 - 0.042 = 95.8%
8. USB Transfer Types:
   - Control: Device setup, status checking (all USB devices)
   - Bulk: Large data (USB drives, printers)
   - Interrupt: Small periodic data (keyboard, mouse)
   - Isochronous: Real-time data (webcam, USB audio)
9. SR-IOV Principle:
   - Creates multiple virtual functions (VFs) on one physical device
   - Each VM directly accesses its dedicated VF
   - Native performance without hypervisor intervention on the data path
   - IOMMU ensures memory isolation
Next Steps¶
- 18_Parallel_Processing_Multicore.md - Multicore architecture and parallel programming
References¶
- Computer Organization and Design (Patterson & Hennessy)
- Operating System Concepts (Silberschatz et al.)
- NVMe Specification
- USB Specification
- Linux Device Drivers (Corbet, Rubini, Kroah-Hartman)