1.1 Why I/O Needs Its Own Architecture
Peripherals differ from the CPU in speed (a keyboard delivers a few bytes per second, many orders of magnitude below what the CPU can consume), data format, and electrical behaviour, so devices connect through interface units containing data, status, and control registers. There are two ways to address those registers: memory-mapped I/O (registers occupy ordinary memory addresses, so any load/store instruction can touch them) vs isolated (I/O-mapped) I/O (dedicated IN/OUT instructions and a separate address space). Know this distinction cold. Then the real question: how does data actually move? Three escalating techniques follow.
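The addressing distinction can be made concrete with a toy model. This is only a sketch: the class names, the address `0xFF00`, and the port number `0x60` are invented for illustration and do not describe any real machine.

```python
# Toy model: one device register reached in two different ways.
# All names, addresses and port numbers here are hypothetical.

class Device:
    """Interface unit reduced to a single data register."""
    def __init__(self):
        self.data = 0


class MemoryMappedBus:
    """One address space: RAM and device registers share load/store."""
    DEVICE_ADDR = 0xFF00  # assumed address claimed by the device

    def __init__(self, device):
        self.ram = {}
        self.device = device

    def store(self, addr, value):
        if addr == self.DEVICE_ADDR:
            self.device.data = value      # an ordinary store reaches the device
        else:
            self.ram[addr] = value

    def load(self, addr):
        if addr == self.DEVICE_ADDR:
            return self.device.data
        return self.ram.get(addr, 0)


class IsolatedBus:
    """Two address spaces: load/store see RAM, IN/OUT see ports."""
    DEVICE_PORT = 0x60  # assumed port number

    def __init__(self, device):
        self.ram = {}
        self.ports = {self.DEVICE_PORT: device}

    def store(self, addr, value):
        self.ram[addr] = value            # never reaches a device

    def load(self, addr):
        return self.ram.get(addr, 0)

    def out(self, port, value):           # models a dedicated OUT instruction
        self.ports[port].data = value

    def in_(self, port):                  # models a dedicated IN instruction
        return self.ports[port].data
```

In the isolated model, address `0x60` in RAM and port `0x60` are unrelated locations, which is exactly why separate instructions are needed.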
1.2 Programmed I/O (Polling)
The CPU runs a loop that reads the device's status flag until the device reports ready, then transfers one word:
POLL: read status register
if READY = 0 goto POLL ; busy-wait ("spin")
transfer one word
if more words goto POLL
Cost analysis (the "why it's bad"): a device delivering 100 bytes/s leaves a 1 GHz CPU 10⁹ / 100 = 10⁷ cycles per byte, so it spins for ~10 million cycles per byte, doing nothing useful. Acceptable only for trivial controllers or for very fast dedicated transfers, where the wait is short. The CPU is tied to the device's timetable.
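The polling loop and its cost can be simulated in a few lines. This is a minimal sketch under stated assumptions: `SlowDevice`, its `period` parameter, and the one-tick-per-cycle timing are hypothetical stand-ins for real hardware.

```python
class SlowDevice:
    """Hypothetical device that needs `period` CPU cycles per word."""
    def __init__(self, data, period):
        self.data = list(data)
        self.period = period
        self.countdown = period

    def tick(self):
        """Advance the device by one CPU cycle."""
        if self.countdown > 0:
            self.countdown -= 1

    @property
    def ready(self):
        """Status flag: READY = 1 once the countdown has expired."""
        return self.countdown == 0

    def read(self):
        """Hand over one word and start preparing the next."""
        self.countdown = self.period
        return self.data.pop(0)


def poll_transfer(device):
    """Programmed I/O: spin on the status flag, then move one word."""
    received, wasted_cycles = [], 0
    while device.data:                  # "if more words goto POLL"
        while not device.ready:         # busy-wait ("spin")
            device.tick()
            wasted_cycles += 1
        received.append(device.read())  # transfer one word
    return received, wasted_cycles


def cycles_per_byte(cpu_hz, device_bytes_per_s):
    """CPU cycles that elapse for each byte the device delivers."""
    return cpu_hz / device_bytes_per_s
```

Calling `cycles_per_byte(1_000_000_000, 100)` reproduces the 10⁷ figure from the cost analysis; `poll_transfer` returns both the data and the number of cycles burned while waiting.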
1.3 Interrupt-Driven I/O
Invert control: the CPU starts the operation and continues other work; the device interrupts when ready. The CPU finishes the current instruction, saves state (PC, flags), runs the ISR (interrupt service routine) to transfer the word, restores state and resumes.
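The save/service/restore sequence can be sketched as a toy CPU model. Everything here is an illustrative assumption: the `CPU` class, the `ISR_ADDR` constant, and treating one `step()` call as one instruction are not taken from any particular architecture.

```python
ISR_ADDR = 0x1000  # hypothetical address of the service routine


class CPU:
    """Toy CPU: each step() call executes one instruction of the main program."""
    def __init__(self):
        self.pc = 0
        self.flags = 0
        self.irq_pending = False
        self.device_data = None   # data register of the interrupting device
        self.stack = []
        self.buffer = []          # words collected by the ISR
        self.work_done = 0        # useful instructions executed

    def raise_irq(self, word):
        """Device side: latch a word and assert the interrupt line."""
        self.device_data = word
        self.irq_pending = True

    def step(self):
        # 1. Finish the current instruction first.
        self.pc += 1
        self.work_done += 1
        # 2. Interrupts are only recognised at the instruction boundary.
        if self.irq_pending:
            self.irq_pending = False
            self.stack.append((self.pc, self.flags))   # save state
            self.pc = ISR_ADDR                         # jump to the ISR
            self.isr()
            self.pc, self.flags = self.stack.pop()     # restore and resume

    def isr(self):
        """Interrupt service routine: transfer exactly one word."""
        self.buffer.append(self.device_data)
```

Between interrupts the CPU keeps incrementing `work_done`, which is the whole point of inverting control: no cycles are spent watching a status flag.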
Interrupt taxonomy:
| Classification | Types |
|---|---|
| By source | Hardware (device signal) vs Software (INT instruction, syscalls) vs Exceptions/Traps (divide-by-zero, page fault) |
| By maskability | Maskable (can be disabled via IEN/flag) vs Non-maskable NMI (power failure — cannot be ignored) |
| By vectoring | Vectored (device supplies its ISR address/vector) vs Non-vectored (fixed location, CPU polls to find the source) |
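
Two rows of the table, maskability and vectoring, can be illustrated together. This is a hedged sketch: the vector numbers `0x21` and `0x02`, the handler names, and the `deliver` function are made up for the example.

```python
# Vector table: vector number -> handler. Numbers and handlers are hypothetical.
def keyboard_isr():
    return "keyboard"


def nmi_isr():
    return "nmi"


VECTOR_TABLE = {
    0x21: keyboard_isr,
    0x02: nmi_isr,
}


def deliver(vector, interrupts_enabled, maskable=True):
    """Return the handler's result, or None if the request is held off."""
    if maskable and not interrupts_enabled:
        return None                   # maskable request ignored while IEN = 0
    return VECTOR_TABLE[vector]()     # vectored: go straight to the right ISR
```

With interrupts disabled, `deliver(0x21, False)` is held off, while `deliver(0x02, False, maskable=False)` still runs its handler, mirroring the NMI row.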
1.4 Priorities: Who Gets Served First?
Multiple devices can interrupt simultaneously; priority resolution is the exam's favourite corner:
- Software polling: ISR checks devices in a fixed order — order = priority; cheap but slow.
- Daisy chaining (serial hardware): the interrupt-acknowledge line threads device to device:
CPU --INTACK--> PI [Device 1] PO --> PI [Device 2] PO --> PI [Device 3] PO
(Device 1 = highest priority, Device 3 = lowest priority)
Rule: a requesting device that receives PI = 1 blocks the acknowledge (PO = 0)
and puts its vector address (VAD) on the bus;
a non-requesting device passes PI through (PO = PI).
Priority = position in the chain. Pros: trivially cheap, one line. Cons: fixed priority, far devices can starve, a broken device breaks the chain.
- Parallel priority: each device gets its own request line into an interrupt register; a mask register enables/disables lines individually; a priority encoder outputs the highest active line's vector. Fast and flexible, more hardware.
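
The two hardware schemes can be sketched as small functions. The function names and the convention that index 0 / bit 0 is the highest priority are assumptions of this sketch, not something the notes prescribe.

```python
def daisy_chain(requests):
    """Propagate the acknowledge down the chain.

    requests[i] is True if device i (0 = closest to the CPU) is requesting.
    Returns the index of the device that places its VAD on the bus, or None.
    """
    pi = 1                       # CPU asserts INTACK into the first device
    winner = None
    for i, requesting in enumerate(requests):
        if pi == 1 and requesting:
            winner = i           # this device drives its VAD onto the bus
            po = 0               # and blocks the acknowledge
        else:
            po = pi              # otherwise PI passes through unchanged
        pi = po                  # the next device's PI is this device's PO
    return winner


def parallel_priority(request_bits, mask_bits):
    """Priority encoder over masked request lines (bit 0 = highest priority).

    Returns the index of the highest-priority enabled request, or None.
    """
    active = request_bits & mask_bits      # mask register disables lines
    if active == 0:
        return None
    lowest_set_bit = active & -active      # isolate the lowest set bit
    return lowest_set_bit.bit_length() - 1
```

For example, with devices 1 and 2 both requesting, `daisy_chain([False, True, True])` selects device 1; clearing bit 1 of the mask in `parallel_priority(0b0110, 0b1101)` lets device 2 through instead, which is the flexibility the daisy chain lacks.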
1.5 DMA: Direct Memory Access
Even interrupt-driven I/O drags every byte through the CPU. For block devices (disk, network), the DMA controller (DMAC) transfers data memory↔device directly. Programming model: CPU writes the DMAC's address register (buffer start), word-count register, and mode; the DMAC then requests the bus (BR/HOLD), the CPU grants it (BG/HLDA) after floating its buses, data flows without CPU involvement, and one interrupt fires at completion (word count = 0).
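The programming model above can be sketched as a toy controller. The `DMAC` and `Bus` classes are hypothetical, the mode register is omitted, and the sketch requests the bus once per word (a cycle-stealing style), which is only one of several possible modes.

```python
class Bus:
    """Tracks bus ownership so the BR/BG handshake is observable."""
    def __init__(self):
        self.owner = "CPU"
        self.grants = 0

    def request(self):
        """BR/HOLD from the DMAC, answered by BG/HLDA from the CPU."""
        self.owner = "DMAC"
        self.grants += 1

    def release(self):
        self.owner = "CPU"


class DMAC:
    """Hypothetical DMA controller moving words from a device into memory."""
    def __init__(self, memory):
        self.memory = memory
        self.address = 0         # address register: next memory location
        self.word_count = 0      # word-count register: words still to move
        self.done_irq = False    # completion interrupt line

    def program(self, start, count):
        """CPU side: the only per-block work the CPU does."""
        self.address, self.word_count, self.done_irq = start, count, False

    def run(self, device_words, bus):
        """DMAC side: move the block, then raise a single interrupt."""
        for word in device_words:
            if self.word_count == 0:
                break
            bus.request()                        # CPU floats its buses
            self.memory[self.address] = word     # device -> memory, no CPU
            bus.release()
            self.address += 1
            self.word_count -= 1
        if self.word_count == 0:
            self.done_irq = True                 # one interrupt at completion
```

The CPU touches the controller once in `program`; everything in `run` happens without it, and `done_irq` is the only signal it receives afterwards.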
| DMA Mode | Behaviour | Best For |
|---|---|---|
| Burst (block) | DMAC seizes the bus for the whole block; CPU stalls on memory | fastest devices (disk) |
| Cycle stealing | DMAC takes one bus cycle at a time between CPU cycles | balanced sharing |
| Transparent | DMAC uses only cycles the CPU doesn't need | zero CPU slowdown, slowest transfer |
Numeric contrast: transferring 4 KB by interrupts at ~20 µs overhead per byte costs 4096 × 20 µs ≈ 82 ms of CPU time; by DMA, a handful of register writes (address, word count, mode) + 1 completion interrupt ≈ microseconds of CPU time. Cycle-stealing bandwidth math: a device at 1 MB/s (10⁶ bytes/s) stealing one 100 ns memory cycle per byte consumes 10⁶ × 100 ns = 0.1 s of bus time per second = 10% of memory bandwidth.
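Both calculations are simple products, which a short script makes easy to re-check with other numbers. The function names are illustrative, and 1 MB/s is taken as 10⁶ bytes/s as in the text.

```python
def interrupt_cpu_time(n_bytes, overhead_s_per_byte):
    """CPU time spent when every byte costs one interrupt."""
    return n_bytes * overhead_s_per_byte


def stolen_bandwidth_fraction(bytes_per_s, cycle_s):
    """Fraction of each second the bus is held by cycle stealing."""
    return bytes_per_s * cycle_s


cpu_time = interrupt_cpu_time(4 * 1024, 20e-6)      # about 0.082 s
fraction = stolen_bandwidth_fraction(1e6, 100e-9)   # about 0.10
```

The same `stolen_bandwidth_fraction` call, with a different transfer rate and cycle time, handles any cycle-stealing bandwidth question of this shape.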
🎯 Exam Focus
- Compare programmed I/O, interrupt-driven I/O and DMA on CPU involvement, speed and hardware cost.
- Differentiate memory-mapped I/O from isolated I/O.
- Classify interrupts: maskable vs non-maskable, vectored vs non-vectored, trap vs hardware interrupt — one example each.
- Explain daisy-chain priority interrupt logic with the PI/PO signal rules. What are its two main drawbacks?
- Describe the sequence of a DMA transfer, naming the DMAC registers and the BR/BG handshake.
- A disk transfers at 2 MB/s using cycle stealing; each byte steals one 50 ns cycle. What fraction of memory bandwidth does it consume?