Add a layer to the toolchain. A stack-based virtual machine. You write the VM translator that consumes VM bytecode and emits RV32I-Lite assembly. By end of week your translator handles stack arithmetic plus the four memory segments (local, argument, this, that).
Reading
- Chapter prose (primary). draft-chapters/ch7-vm-i-prose.md
- Petzold weave anchors. Ch 17 Automation (returning visit, pp. 209 + 212); Ch 22 The Operating System p. 328 (returning visit, bootstrap loader); Ch 24 Languages High and Low p. 354 (returning visit, "ALGOL... seminal language, the direct ancestor..."). All three are returning visits; the threads start to weave together
- Cross-chapter handouts. VM segment cheat sheet
Lecture
lectures/ch7-vm-i-lecture.md. 3 hours. Key arc:
- Why a VM. The Jack-equivalent language (Ch 9-11) is easier to compile to a stack machine than to RV32I-Lite directly. The VM is the intermediate layer
- Stack arithmetic. push 3; push 4; add leaves 7 on top of the stack. The VM operations are simple; the translator turns them into RV32I-Lite instructions that manipulate the stack pointer
- Memory segments. The VM exposes named regions (local, argument, this, that, static, constant, pointer, temp). The translator maps each named region to a base-address-plus-offset pattern in RV32I-Lite
- The translator is a one-pass pattern matcher. Each VM op has a fixed RV32I-Lite expansion
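The "fixed expansion per VM op" idea can be sketched in a few lines of Python. This is a minimal illustration, not the academy's vm-translator module: the function names are hypothetical, and the emitted assembly assumes that sp serves as the VM stack pointer (pointing at the next-free slot), that t0/t1 are free scratch registers, and that the li pseudo-instruction is available in RV32I-Lite.

```python
# Hypothetical expansion templates: one fixed assembly snippet per VM op.
# Assumptions: sp is the VM stack pointer and points at the next-free slot,
# t0/t1 are scratch registers, and `li` is accepted by the assembler.
POP_T1 = ["addi sp, sp, -4", "lw t1, 0(sp)"]
POP_T0 = ["addi sp, sp, -4", "lw t0, 0(sp)"]
PUSH_T0 = ["sw t0, 0(sp)", "addi sp, sp, 4"]


def translate_line(vm_line):
    """Expand one VM command into a list of assembly lines."""
    op, *args = vm_line.split()
    # Accept the lecture shorthand "push 3" as well as "push constant 3".
    if op == "push" and len(args) in (1, 2) and args[:-1] in ([], ["constant"]):
        return [f"li t0, {int(args[-1])}"] + PUSH_T0
    if op == "add" and not args:
        # Pop the right operand, pop the left operand, add, push the sum.
        return POP_T1 + POP_T0 + ["add t0, t0, t1"] + PUSH_T0
    raise ValueError(f"unsupported VM command: {vm_line!r}")


def translate(program):
    """One pass over a ';'-separated VM program, no lookahead or backtracking."""
    asm = []
    for line in program.split(";"):
        line = line.strip()
        if line:
            asm.extend(translate_line(line))
    return asm


if __name__ == "__main__":
    print("\n".join(translate("push 3; push 4; add")))
```

Because every op maps to a template that never depends on its neighbours, the translator needs no state beyond the current line, which is what makes a single pass sufficient at this stage.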
Figure 8.1. The same push 5; push 7; add walked across four panels of stack state. The amber cell in each panel is the one your translator's emitted instructions just wrote to. The sp marker ascends from 0x00010030 to 0x00010038 and back down to 0x00010034: each push writes then sp += 4, the add pops twice and pushes once (net sp - 4). Per cross-chapter-vm-segment-cheat-sheet.md, sp always points at the next-free slot, one past the topmost occupied word. Reuse this picture when Lab 7.1 first asks you to predict the SP-value column.
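The SP-value column can be predicted with a short simulation. The sketch below assumes only what the caption states: 4-byte words, a stack that starts at 0x00010030, and an sp that points at the next-free slot. The function name and the tuple encoding of the ops are illustrative choices.

```python
STACK_BASE = 0x00010030  # first sp value shown in the figure
WORD = 4                 # bytes per stack slot


def trace_sp(ops):
    """Run push/add ops and return (list of sp values, value on top of stack)."""
    memory = {}
    sp = STACK_BASE
    trace = [sp]
    for op in ops:
        if op[0] == "push":
            memory[sp] = op[1]        # write first...
            sp += WORD                # ...then bump sp to the next-free slot
        elif op[0] == "add":
            sp -= WORD
            right = memory[sp]        # first pop
            sp -= WORD
            left = memory[sp]         # second pop
            memory[sp] = left + right
            sp += WORD                # one push: net effect is sp - 4
        else:
            raise ValueError(f"unsupported op: {op!r}")
        trace.append(sp)
    return trace, memory[sp - WORD]


if __name__ == "__main__":
    trace, top = trace_sp([("push", 5), ("push", 7), ("add",)])
    print([f"0x{value:08X}" for value in trace], top)
```

The four entries in the returned trace correspond to the four panels: the initial state, after each push, and after the add.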
Lab exercises
Five labs in worksheets/ch7/.
- lab-7.5-vm-stack-machine.md, calibrate against the VM Stack Machine simulator (Tier-1 companion; pure-browser; the 7.5 numbering slots before 7.1-7.4 as the recommended primer)
- lab-7.1-stack-arithmetic-translator.md (Tier-2)
- lab-7.2-memory-segment-translator.md (Tier-2)
- lab-7.3-end-to-end-on-student-silicon.md (Tier-2)
- lab-7.4-hand-vs-translator-reconciliation.md (Tier-2)
Plan for ~6 hours of lab (the simulator companion adds ~75 minutes on top of the original budget).
Independent practice
- Re-read Petzold Ch 17 + Ch 22 + Ch 24 for the returning-visit theses. Notice that the three chapters are starting to interlock in your mental model the way they do in Petzold's book
- Update your Toolchain Diary. Week 8 introduces: VM bytecode notation, stack-pointer arithmetic, the segment-base-plus-offset addressing pattern, the academy's vm-translator Python module
Where the segments live in the memory map
Ch 3's byte-addressable RAM had no regions. Ch 7's VM hands you eight named segments. Four of them (local, argument, this, that) are reached through base pointers (LCL, ARG, THIS, THAT) stored at fixed absolute addresses; one (temp) is itself a block of fixed slots; and the remaining three (static, constant, pointer) live in .data or are inlined by the translator.
Figure 8.2. Same memory-map strip you saw in Ch 3 §Where this RAM lives. The amber block is now the focus: by end of Ch 7 your translator emits code that reads LCL_addr from 0x00010000 and uses it as the base for local 0, local 1, etc. The eight temp slots at 0x00010010..0x0001002C are direct (no indirection); the four pointer slots above them are the indirection step. Pin this picture during Lab 7.2.
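The difference between the indirect and direct segments can be sketched as an address-generation helper. Only the LCL slot at 0x00010000 and the temp range 0x00010010..0x0001002C come from the text above; the ARG, THIS, and THAT slot addresses are assumed here to follow in consecutive words, and the function name, the choice of t0, and the use of li are illustrative assumptions.

```python
# Stated in the text: LCL lives at 0x00010000, temp spans 0x00010010..0x0001002C.
# Assumed for illustration: ARG/THIS/THAT occupy the next three words.
POINTER_SLOT = {
    "local": 0x00010000,
    "argument": 0x00010004,
    "this": 0x00010008,
    "that": 0x0001000C,
}
TEMP_BASE = 0x00010010
TEMP_SLOTS = 8
WORD = 4


def address_into_t0(segment, index):
    """Emit assembly that leaves the address of `segment index` in t0."""
    if index < 0:
        raise ValueError("segment index must be non-negative")
    if segment in POINTER_SLOT:
        # Indirect: load the base pointer from its fixed slot, then add the
        # offset (must fit addi's 12-bit signed immediate in this sketch).
        return [
            f"li t0, 0x{POINTER_SLOT[segment]:08X}",
            "lw t0, 0(t0)",
            f"addi t0, t0, {WORD * index}",
        ]
    if segment == "temp":
        # Direct: the slot address is known at translation time.
        if index >= TEMP_SLOTS:
            raise ValueError("temp has only eight slots")
        return [f"li t0, 0x{TEMP_BASE + WORD * index:08X}"]
    raise ValueError(f"segment not covered by this sketch: {segment!r}")


if __name__ == "__main__":
    print(address_into_t0("local", 1))
    print(address_into_t0("temp", 7))
```

The extra lw in the pointer-based case is the indirection step: the translator knows where the base pointer is stored, but not the value it will hold at run time.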
Architecture comparison sidebar
Stack-based VMs (JVM, Python bytecode, WebAssembly) are easier to compile to but typically slower to interpret than register-based designs (Dalvik on Android, LLVM IR). The trade-off: stack VMs have shorter bytecode (no register operand fields); register VMs need fewer instructions per high-level operation. CSA-101 uses a stack-based VM because it pairs cleanly with the recursive-descent compiler in Ch 9-10.
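To make the trade-off concrete, here is a toy comparison of a = (b + c) * d in both styles. The two instruction formats are invented for illustration and do not correspond to any real VM's encoding; the point is only the instruction count versus operand fields per instruction.

```python
import operator

BINOPS = {"add": operator.add, "mul": operator.mul}

# a = (b + c) * d in a made-up stack format: arithmetic ops carry no operands.
stack_code = [
    ("push", "b"), ("push", "c"), ("add",),
    ("push", "d"), ("mul",), ("pop", "a"),
]

# The same statement in a made-up three-operand register format.
register_code = [
    ("add", "t", "b", "c"),
    ("mul", "a", "t", "d"),
]


def operand_fields(code):
    """Total operand fields across all instructions."""
    return sum(len(instr) - 1 for instr in code)


def run_stack(code, env):
    """Interpret the stack format against a copy of the variable environment."""
    env, stack = dict(env), []
    for op, *args in code:
        if op == "push":
            stack.append(env[args[0]])
        elif op == "pop":
            env[args[0]] = stack.pop()
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(BINOPS[op](left, right))
    return env


def run_register(code, env):
    """Interpret the register format against a copy of the variable environment."""
    env = dict(env)
    for op, dst, lhs, rhs in code:
        env[dst] = BINOPS[op](env[lhs], env[rhs])
    return env


if __name__ == "__main__":
    print(len(stack_code), operand_fields(stack_code))
    print(len(register_code), operand_fields(register_code))
```

Both programs compute the same result; the stack version spends more instructions, while the register version spends more operand fields on each one.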
Reflection prompts
- The VM adds a layer to the toolchain. Why is that a good thing? When would adding more layers make the toolchain worse?
- Stack operations require no register names in the bytecode. Why doesn't every VM use a stack architecture?
- The four memory segments (local, argument, this, that) are named after their pedagogical role. What would these be called in a production language runtime?
What's next
Week 9 finishes the VM. Program flow (labels and conditional jumps) plus function calls (with the full calling-convention protocol). After this week the VM can express any computable program; it is the first time the toolchain is Turing-complete from the source-language side.