A rigorous, implementation-level guide to address translation, from foundational virtual memory concepts through AI/ML accelerator memory systems. It targets systems engineers, OS developers, hardware architects, and ML infrastructure teams.
Chapters 1–9 form a complete foundation covering paging, faults, reclaim, and optimizations.
Chapters 4, 5, 10, 15, 16 cover translation hardware, IOMMUs, and advanced TLB design.
Chapters 11–14 address GPU and accelerator memory, LLM serving, and ML-based optimization.
Chapter 6 covers the full protection model; Chapters 5 and 12 cover device and multi-tenant isolation.
| Processor Family | Key Structures Covered |
|---|---|
| x86-64 (Intel / AMD) | CR3, PML4/PDPT/PD/PT, PCID, INVPCID, INVLPG, KPTI, SGX, VT-d, EPT, AMD NPT |
| ARM64 (ARMv8 / v9) | TTBR0/TTBR1_EL1, ASID, TLBI, TrustZone, Stage-2 (IPA→PA), SMMUv3 |
| RISC-V | satp, Sv39/Sv48/Sv57, ASID, SFENCE.VMA, VMID in hgatp, G-stage translation |
| GPU / AI Accelerators | NVIDIA UVM, NVLink/NVSwitch peer-to-peer, TPU HBM, Intel Gaudi2, PagedAttention |
Each chapter is a self-contained HTML file.
Open any chapter directly in a browser, print to PDF, or host as GitHub Pages.
Content cites peer-reviewed literature, processor architecture manuals, and production-system papers.
Speculative claims about proprietary implementations are avoided.