Asynchronous Handshaking Protocol
---
Peripheral Devices Reference
| Device | Type | Interface | Typical Data Rate | Notes |
|---|---|---|---|---|
| Keyboard | Input | USB / PS/2 | ~10 B/s | Human-speed, interrupt-driven |
| Mouse | Input | USB | ~1 KB/s | Movement and button events |
| Monitor | Output | HDMI / DisplayPort | 1–8 GB/s | 4K@60Hz at 24 bits/pixel requires ~12 Gbps (~1.5 GB/s) |
| Printer | Output | USB / Ethernet | 1–10 MB/s | Slow compared to storage |
| Scanner | Input | USB | 10–100 MB/s | Resolution-dependent |
| HDD | Storage | SATA | 100–200 MB/s | Mechanical; sequential fast |
| SSD (SATA) | Storage | SATA | 500 MB/s | Flash; random access fast |
| SSD (NVMe) | Storage | PCIe 4.0 x4 | 5–7 GB/s | Direct PCIe lane; very fast |
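
The ~12 Gbps figure for a 4K@60Hz monitor can be checked with a short calculation. This is a rough sketch that assumes 24 bits per pixel and ignores blanking intervals and link encoding overhead; the function name is illustrative only.

```python
def raw_video_gbps(width, height, refresh_hz, bits_per_pixel=24):
    """Uncompressed pixel data rate in Gbit/s (active pixels only)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


uhd_gbps = raw_video_gbps(3840, 2160, 60)
print(f"4K@60Hz: {uhd_gbps:.2f} Gbps = {uhd_gbps / 8:.2f} GB/s")
```

The result is just under 12 Gbps, or about 1.5 GB/s, before any protocol overhead is added.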
---
Programmed I/O (Polling / Busy-Wait)
The CPU continuously checks (polls) the device status register until the device is ready:
RTL Sequence:
1. CPU writes command to the I/O device's control register
2. CPU reads the device's status register
3. If status bit = BUSY: go to step 2 (busy-wait loop)
4. If status bit = READY: CPU reads/writes the data register
Drawback: the CPU does no useful work while it spins in the polling loop, so with slow devices nearly all of its cycles are wasted.
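The polling sequence above can be sketched as a small simulation. This is only an illustration: `SimulatedDevice`, its register methods, and the `"READ"` command are hypothetical stand-ins for real hardware registers, and the device simply stays BUSY for a fixed number of status reads.

```python
class SimulatedDevice:
    """Hypothetical device that reports BUSY for a fixed number of status reads."""
    BUSY, READY = 0, 1

    def __init__(self, busy_reads, data):
        self._remaining = busy_reads
        self._data = data
        self.control = None

    def write_control(self, command):
        self.control = command              # CPU issues the command

    def read_status(self):
        if self._remaining > 0:
            self._remaining -= 1
            return self.BUSY
        return self.READY

    def read_data(self):
        return self._data


def programmed_io_read(device):
    """Busy-wait until the device is READY; return (data, wasted_polls)."""
    device.write_control("READ")
    wasted_polls = 0
    while device.read_status() == SimulatedDevice.BUSY:
        wasted_polls += 1                   # CPU does nothing useful here
    return device.read_data(), wasted_polls


data, wasted = programmed_io_read(SimulatedDevice(busy_reads=1000, data=0x41))
print(hex(data), wasted)                    # 0x41 1000
```

Every iteration counted in `wasted_polls` is a status read during which the CPU could have been running another task.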
---
Interrupt-Driven I/O
Instead of busy-waiting, the CPU:
- Issues command to I/O device
- Continues executing other processes
- When device completes, it sends an interrupt request (IRQ) to CPU
- CPU finishes the current instruction and saves its context (PC and registers pushed to the stack)
- CPU jumps to Interrupt Service Routine (ISR) via interrupt vector table
- ISR transfers data; signals completion
- CPU restores context and resumes interrupted program
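The steps above can be sketched as a toy simulation. Everything here is an assumption for illustration: the `Device` class, the `IRQ_KEYBOARD` number, and the idea that one loop iteration equals one instruction. Real hardware saves far more context than a single program counter.

```python
IRQ_KEYBOARD = 1  # hypothetical IRQ number


class Device:
    """Hypothetical device that raises its IRQ after `ready_after` CPU steps."""

    def __init__(self, irq, ready_after, data):
        self.irq = irq
        self.ready_after = ready_after
        self.data = data
        self.irq_pending = False

    def tick(self):
        if self.ready_after > 0:
            self.ready_after -= 1
            if self.ready_after == 0:
                self.irq_pending = True     # device signals completion


def run_program(program_length, device):
    received, log = [], []

    def keyboard_isr():
        received.append(device.data)        # ISR transfers the data
        device.irq_pending = False          # acknowledge the interrupt

    vector_table = {device.irq: keyboard_isr}

    pc = 0
    useful_steps = 0
    while pc < program_length:
        pc += 1                             # execute one instruction of other work
        useful_steps += 1
        device.tick()
        if device.irq_pending:              # checked after each instruction completes
            saved_pc = pc                   # save context
            log.append(("irq", saved_pc))
            vector_table[device.irq]()      # jump to the ISR via the vector table
            pc = saved_pc                   # restore context and resume
    return received, useful_steps, log


received, useful_steps, log = run_program(
    10, Device(irq=IRQ_KEYBOARD, ready_after=3, data="A")
)
print(received, useful_steps, log)
```

Unlike the polling case, all ten loop iterations are useful work; the device costs the CPU only the single ISR call.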
---
DMA (Direct Memory Access)
For high-speed bulk transfers (disk, network), even interrupt-driven I/O is too slow, because the CPU must service an interrupt for every byte or word transferred. A DMA controller transfers data directly between the device and memory without the CPU moving each unit of data:
- CPU programs DMA controller (source address, destination address, byte count, direction)
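- DMA controller requests the memory bus; the CPU grants it and the DMA controller takes over as bus master
- DMA controller transfers the entire data block to/from memory
- DMA controller sends a single interrupt to the CPU when the transfer is complete
- CPU services the completion interrupt and continues, now with the data in place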
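A minimal sketch of this sequence, assuming a device-to-memory transfer only (the direction parameter is omitted for brevity). `DMAController`, `program`, and `run` are hypothetical names, and bus arbitration is not modeled.

```python
class DMAController:
    """Hypothetical DMA controller copying a device block into memory."""

    def __init__(self, memory):
        self.memory = memory
        self.interrupts_raised = 0
        self.source = b""
        self.dest_addr = 0
        self.count = 0

    def program(self, source, dest_addr, count):
        # CPU sets up the transfer: source, destination address, byte count
        self.source, self.dest_addr, self.count = source, dest_addr, count

    def run(self):
        # Controller moves the whole block without per-byte CPU involvement
        end = self.dest_addr + self.count
        self.memory[self.dest_addr:end] = self.source[:self.count]
        self.interrupts_raised += 1         # one interrupt for the whole block


memory = bytearray(16)
disk_block = bytes(range(8))                # stand-in for data read from a disk

dma = DMAController(memory)
dma.program(source=disk_block, dest_addr=4, count=8)
dma.run()

print(memory.hex(), dma.interrupts_raised)
```

Moving the same 8 bytes with interrupt-driven I/O at one byte per interrupt would take 8 interrupts; here the CPU sees one.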
---
Comparison: Programmed I/O vs Interrupt I/O vs DMA
| Feature | Programmed I/O | Interrupt-Driven I/O | DMA |
|---|---|---|---|
| CPU involvement | 100% (busy-waiting) | Only during ISR | Only to program DMA + final interrupt |
| CPU overhead | Very High | Medium | Low |
| Speed | Limited by CPU loop | Better (async) | Fast (memory-speed) |
| Transfer unit | Byte or word | Byte or word | Block (KB–MB) |
| Hardware | Simplest | Needs interrupt controller | Needs DMA controller chip |
| Best use case | Very simple/embedded systems | Low-speed devices (keyboard) | High-speed devices (disk, NIC) |
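
To make the "CPU involvement" row concrete, the sketch below counts how many times the CPU must act to move one block under each scheme. The numbers are assumptions chosen for illustration: one byte per transfer unit and a made-up average of 100 status polls per byte for programmed I/O.

```python
def cpu_events(block_bytes, unit_bytes=1, polls_per_unit=100):
    """Rough count of CPU actions needed to move one block (illustrative only)."""
    units = block_bytes // unit_bytes
    return {
        # each unit costs the wasted status polls plus the data access itself
        "programmed_io": units * (polls_per_unit + 1),
        # one ISR invocation per unit
        "interrupt_driven": units,
        # one setup of the DMA controller plus one completion interrupt
        "dma": 2,
    }


counts = cpu_events(4096)
for scheme, n in counts.items():
    print(f"{scheme:>16}: {n}")
```

The exact values depend entirely on the assumed poll count and unit size; the point is the difference in order of magnitude between the three columns.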
---
Memory-Mapped I/O vs Isolated (Port) I/O
| Feature | Memory-Mapped I/O | Isolated (Port) I/O |
|---|---|---|
| Address space | I/O registers share memory address space | Separate I/O address space |
| Instructions | Standard LOAD/STORE | Special IN/OUT instructions |
| Address bits | Some memory addresses reserved for I/O | Full memory space available |
| Architecture | ARM, MIPS (memory-mapped only) | x86 (supports both) |
| Advantage | Simpler software; can use all addressing modes | Doesn't consume memory address space |
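
The difference between the two schemes can be sketched with two toy bus models. The class names, the `0x1000` device base address, and the port number are all hypothetical; the point is only that memory-mapped registers are reached through ordinary load/store addresses, while port I/O uses a separate address space and separate operations.

```python
class MemoryMappedBus:
    """Device registers occupy a reserved slice of the memory address space."""

    def __init__(self, ram_size, device_base, num_regs):
        self.ram = bytearray(ram_size)
        self.base = device_base
        self.regs = bytearray(num_regs)

    def _is_device(self, addr):
        return self.base <= addr < self.base + len(self.regs)

    def load(self, addr):
        return self.regs[addr - self.base] if self._is_device(addr) else self.ram[addr]

    def store(self, addr, value):
        if self._is_device(addr):
            self.regs[addr - self.base] = value   # same instruction, different target
        else:
            self.ram[addr] = value


class PortMappedBus:
    """Separate I/O address space reached only through in_/out operations."""

    def __init__(self, ram_size, num_ports):
        self.ram = bytearray(ram_size)
        self.ports = bytearray(num_ports)

    def load(self, addr):
        return self.ram[addr]

    def store(self, addr, value):
        self.ram[addr] = value

    def in_(self, port):
        return self.ports[port]

    def out(self, port, value):
        self.ports[port] = value


mm = MemoryMappedBus(ram_size=0x2000, device_base=0x1000, num_regs=4)
mm.store(0x1000, 0x55)          # ordinary store lands in a device register
mm.store(0x0010, 0x07)          # ordinary store lands in RAM

pm = PortMappedBus(ram_size=0x2000, num_ports=0x100)
pm.store(0x10, 7)               # memory address 0x10
pm.out(0x10, 9)                 # port 0x10: same number, different address space
print(mm.load(0x1000), mm.load(0x0010), pm.load(0x10), pm.in_(0x10))
```

In the memory-mapped model, RAM addresses `0x1000` to `0x1003` are shadowed by the device, which is the "some memory addresses reserved for I/O" cost listed in the table.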
---
Study Deep: NVMe SSDs and PCIe
Traditional storage used the SATA interface (max ~600 MB/s). NVMe SSDs connect to the CPU over PCIe lanes:
- PCIe 4.0 x4: up to 7 GB/s sequential read
- PCIe 5.0 x4: up to 14 GB/s sequential read
- Much lower access latency: on the order of 2–10 µs vs ~0.1 ms (100 µs) for a SATA SSD
This bypasses legacy I/O bottlenecks (AHCI, SATA controller) entirely and narrows the gap to DRAM for sequential workloads, although DRAM still offers several times the bandwidth and far lower latency.
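The bandwidth ceilings quoted above follow from the link signalling rates. The sketch below uses the standard per-lane rates (6 Gbit/s for SATA III with 8b/10b encoding, 16 GT/s for PCIe 4.0 and 32 GT/s for PCIe 5.0 with 128b/130b encoding); the function name is illustrative, and packet and protocol overhead is ignored, so real drives land somewhat below these values.

```python
def link_bandwidth_gb_s(giga_transfers_per_s, payload_bits, encoded_bits, lanes=1):
    """Usable link bandwidth in GB/s after line encoding, before protocol overhead."""
    return giga_transfers_per_s * payload_bits / encoded_bits / 8 * lanes


sata3 = link_bandwidth_gb_s(6, 8, 10)                    # 8b/10b encoding
pcie4_x4 = link_bandwidth_gb_s(16, 128, 130, lanes=4)    # 128b/130b encoding
pcie5_x4 = link_bandwidth_gb_s(32, 128, 130, lanes=4)

print(f"SATA III:    {sata3:.2f} GB/s")
print(f"PCIe 4.0 x4: {pcie4_x4:.2f} GB/s")
print(f"PCIe 5.0 x4: {pcie5_x4:.2f} GB/s")
```

This gives roughly 0.6, 7.9, and 15.8 GB/s, which is why the "up to" figures for drives sit just below those link limits.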
📝 Exam Tips:
- Programmed I/O = CPU wastes time polling; simplest hardware
- Interrupt I/O = CPU free while the device works; needs interrupt controller (PIC/APIC)
- DMA = bulk transfer without CPU; needs DMA controller
- Memory-mapped I/O: I/O registers appear as memory addresses, so regular load/store instructions (e.g. MOV on x86) access them