Classroom Glossary Public page

Week 8: VM I — Stack Arithmetic and Memory Segments

955 words

The assembler you wrote in Week 6 speaks RV32I-Lite instructions. The compiler you will write in Weeks 10-12 speaks VirtusLang source code. Between them is a gap: the compiler cannot emit RV32I-Lite assembly without knowing too much about registers, memory layout, and calling conventions. The solution is a virtual machine — an abstract intermediate language whose instructions operate on a stack. The VM translator bridges the gap: it reads VM bytecode and emits RV32I-Lite assembly, so the compiler can target the VM without caring about the underlying hardware.


Reading

  • Petzold Ch 17 ("Automation," pp. 241-258): The chapter describes how programmers graduated from hand-managing addresses to delegating that work to tools. This week's VM translator is the next step up that ladder — not just address management, but instruction selection. (~18 pages)
  • Petzold Ch 22 ("Operating Systems," pp. 295-320) first visit: Petzold describes how an OS manages a process's stack. The VM's software stack in this week is a direct implementation of what Petzold describes at the conceptual level. (~25 pages, read pp. 295-310 for this week)

Lecture

3 hours. Key arc:

Why a stack machine. A stack machine needs only one implicit operand for most instructions: the top of the stack. add pops two values, adds them, and pushes the result. No register allocation. No operand encoding. The compiler can emit push x, push y, add without knowing anything about the target CPU's register file. The VM translator then decides how to implement add on RV32I-Lite — and RV32I-Lite has eight registers ready for exactly this purpose.

The VM language subset for this week.

// Stack arithmetic
push constant 7    // push the literal value 7
push constant 8
add                // pop two, push their sum
pop local 0        // pop to local[0]
push local 0       // push local[0]

Supported commands this week:

  • push constant N — push the integer N
  • push/pop local N — access the local variable segment
  • push/pop argument N — access the argument segment
  • push/pop static N — access the static variable segment
  • push/pop this N, push/pop that N — access object fields / array elements
  • push/pop temp N — temp segment (indices 0-7, mapped to BRAM offsets)
  • add, sub, neg, eq, gt, lt, and, or, not

Memory segment layout. Each VM function has five addressable segments:

Segment RV32I-Lite mapping Base pointer register
local BRAM region allocated at function entry x18 (s2)
argument Caller's push area, below frame x19 (s3)
this Object base (set by OS before method call) x20 (s4)
that Array base (set by pointer push/pop) x21 (s5)
static Fixed BRAM region, globally addressed Absolute address
temp Fixed slots 0-7 in BRAM Absolute address
pointer Sets this (index 0) or that (index 1) Direct register write

The constant segment is virtual: push constant N emits LI x10, N / ADDI sp, sp, -4 / SW x10, 0(sp). No BRAM read.

Translating a push/pop.

def translate_push(segment, index):
    """Emit RV32I-Lite assembly for: push <segment> <index>"""
    
    if segment == 'constant':
        return [
            f"    LI x10, {index}",      # load constant
            f"    ADDI sp, sp, -4",       # grow stack
            f"    SW x10, 0(sp)",         # push
        ]
    
    base_reg = {
        'local':    'x18',
        'argument': 'x19',
        'this':     'x20',
        'that':     'x21',
    }.get(segment)
    
    if base_reg:
        return [
            f"    LW x10, {index*4}({base_reg})",   # load from segment[index]
            f"    ADDI sp, sp, -4",
            f"    SW x10, 0(sp)",
        ]
    
    if segment == 'temp':
        temp_base = 0x0300   # BRAM offset for temp region
        return [
            f"    LW x10, {temp_base + index*4}(x0)",
            f"    ADDI sp, sp, -4",
            f"    SW x10, 0(sp)",
        ]
    
    if segment == 'static':
        static_base = 0x0400
        return [
            f"    LW x10, {static_base + index*4}(x0)",
            f"    ADDI sp, sp, -4",
            f"    SW x10, 0(sp)",
        ]
    
    raise ValueError(f"Unknown segment: {segment}")

def translate_pop(segment, index):
    """Emit RV32I-Lite assembly for: pop <segment> <index>"""
    
    base_reg = {
        'local':    'x18',
        'argument': 'x19',
        'this':     'x20',
        'that':     'x21',
    }.get(segment)
    
    if base_reg:
        return [
            f"    LW x10, 0(sp)",                    # pop
            f"    ADDI sp, sp, 4",
            f"    SW x10, {index*4}({base_reg})",    # store to segment[index]
        ]
    
    if segment == 'temp':
        temp_base = 0x0300
        return [
            f"    LW x10, 0(sp)",
            f"    ADDI sp, sp, 4",
            f"    SW x10, {temp_base + index*4}(x0)",
        ]
    
    if segment == 'static':
        static_base = 0x0400
        return [
            f"    LW x10, 0(sp)",
            f"    ADDI sp, sp, 4",
            f"    SW x10, {static_base + index*4}(x0)",
        ]
    
    raise ValueError(f"Unknown segment: {segment}")

Translating arithmetic. add pops two operands and pushes their sum:

def translate_arithmetic(command):
    """Emit RV32I-Lite assembly for stack arithmetic."""
    
    if command == 'add':
        return [
            "    LW x10, 0(sp)",     # pop y
            "    ADDI sp, sp, 4",
            "    LW x11, 0(sp)",     # pop x (now top of stack)
            "    ADD x10, x11, x10", # x + y
            "    SW x10, 0(sp)",     # push result (sp unchanged for net pop-pop-push)
        ]
    
    if command == 'sub':
        return [
            "    LW x10, 0(sp)",     # pop y
            "    ADDI sp, sp, 4",
            "    LW x11, 0(sp)",
            "    SUB x10, x11, x10", # x - y
            "    SW x10, 0(sp)",
        ]
    
    if command == 'neg':
        return [
            "    LW x10, 0(sp)",
            "    NEG x10, x10",       # pseudo: SUB x10, x0, x10
            "    SW x10, 0(sp)",
        ]
    
    if command == 'and':
        return [
            "    LW x10, 0(sp)",
            "    ADDI sp, sp, 4",
            "    LW x11, 0(sp)",
            "    AND x10, x11, x10",
            "    SW x10, 0(sp)",
        ]
    
    if command == 'or':
        return [
            "    LW x10, 0(sp)",
            "    ADDI sp, sp, 4",
            "    LW x11, 0(sp)",
            "    OR x10, x11, x10",
            "    SW x10, 0(sp)",
        ]
    
    if command == 'not':
        return [
            "    LW x10, 0(sp)",
            "    XORI x10, x10, -1",  # bitwise NOT
            "    SW x10, 0(sp)",
        ]
    
    raise ValueError(f"Unknown arithmetic command: {command}")

Comparison commands. eq pushes -1 (all ones = true in the VM) if the two popped values are equal, 0 otherwise:

def translate_comparison(command, label_counter):
    """Emit comparison command. Uses branch instructions; requires unique labels."""
    
    true_label  = f"CMP_TRUE_{label_counter}"
    end_label   = f"CMP_END_{label_counter}"
    
    branch_map = {'eq': 'BEQ', 'gt': 'BGT', 'lt': 'BLT'}
    branch = branch_map[command]
    
    return [
        "    LW x10, 0(sp)",            # pop y
        "    ADDI sp, sp, 4",
        "    LW x11, 0(sp)",            # pop x
        f"   {branch} x11, x10, {true_label}",
        "    LI x10, 0",               # false
        f"   JAL x0, {end_label}",
        f"{true_label}:",
        "    LI x10, -1",              # true = 0xFFFFFFFF
        f"{end_label}:",
        "    SW x10, 0(sp)",           # push result (sp unchanged: net pop-pop-push)
    ]

The full translator structure.

class VMTranslator:
    def __init__(self, output_filename):
        self.output = []
        self.label_counter = 0
        self.output_filename = output_filename
    
    def translate_file(self, vm_source: str):
        for line in vm_source.splitlines():
            line = line.strip()
            if not line or line.startswith('//'):
                continue
            self.output.append(f"// {line}")   # preserve VM instruction as comment
            parts = line.split()
            cmd = parts[0]
            
            if cmd == 'push':
                self.output.extend(translate_push(parts[1], int(parts[2])))
            elif cmd == 'pop':
                self.output.extend(translate_pop(parts[1], int(parts[2])))
            elif cmd in ('add', 'sub', 'neg', 'and', 'or', 'not'):
                self.output.extend(translate_arithmetic(cmd))
            elif cmd in ('eq', 'gt', 'lt'):
                self.output.extend(translate_comparison(cmd, self.label_counter))
                self.label_counter += 1
            else:
                raise ValueError(f"Unknown VM command: {cmd}")
    
    def write(self):
        with open(self.output_filename, 'w') as f:
            f.write('\n'.join(self.output))

Lab exercises

Four labs in labs/lab-8.md. Plan for ~4 hours.

  • Lab 8.1. Write toolchain/vm-translator/translator.py handling push/pop constant/local/argument/static/temp and all nine arithmetic/comparison commands.
  • Lab 8.2. Translate SimpleAdd.vm (push constant 7, push constant 8, add) and verify the emitted assembly produces 15 in simulation.
  • Lab 8.3. Translate StackTest.vm (a larger arithmetic test covering all 9 commands). Verify against the reference output in tests/vm/StackTest.expected.
  • Lab 8.4. Write two paragraphs in your Toolchain Diary: "What does the VM translator spare the compiler from knowing?" Specific claims required; general answers earn no credit.

Independent practice

  • Read Petzold Ch 17, pp. 241-258. The key passage: Petzold's description of the transition from hand-coded machine language to assembler to higher-level abstractions. The VM translator is the next step he does not name explicitly but describes structurally.
  • Compare the VM segment layout table above against the JVM's frame structure (operand stack, local variables, constant pool). Note two things they share and one structural difference.

Architecture comparison sidebar

Stack model vs flat address model vs register machine.

Py6502v's programs in CSA-102 used a flat address model: variables lived at fixed absolute BRAM addresses known at compile time. There was no stack segment; "local variables" were just named memory locations. This works for a non-recursive, single-threaded program. It fails for any program where two function invocations need independent copies of the same variable.

The VM's stack model solves this: each function invocation gets its own frame, allocated dynamically at the top of the stack. The caller pushes arguments; the callee allocates locals on the same stack. When the callee returns, its frame is popped. Two recursive calls to the same function have independent frames on the stack.

RV32I-Lite has enough registers (x18-x21 as base pointers, x10-x17 as argument/result passing) to implement this stack model without spilling base pointers to memory on every frame entry. The 6502's hardware stack (page 1, 0x0100-0x01FF) is 256 bytes deep — barely enough for one level of function nesting in a complex program. RV32I-Lite's software stack is bounded only by available BRAM.

The x86-64 ABI's red zone is a hybrid: the 128-byte region below rsp is reserved for leaf-function scratch space, skipping the push/pop overhead for functions that never call anything else. RV32I-Lite's eight caller-saved registers (x10-x17) serve the same purpose without a formal red zone.


Reflection prompts

  1. The comparison commands (eq, gt, lt) use unique labels for each comparison site. What would happen if two comparison commands in the same translation unit shared the same label? Write a test case that exposes the bug.
  2. The static segment is mapped to a fixed BRAM region starting at 0x0400. What happens if two different .vm files use static 0? Is this a bug in the translator or a constraint the caller must manage? What does the nand2tetris book say about multi-file static segment naming?
  3. The VM translator emits three instructions for push constant N. A peephole optimizer could recognize when the pushed constant is immediately consumed by an add or sub and fold it into a single ADDI. Write the pattern match for this optimization. (This is a forward promise to CSA-201.)

What's next

Week 9 extends the VM translator with the second half of the VM language: program flow (label, goto, if-goto) and function calls (function, call, return). The function-call protocol is the contract between caller and callee; getting it right is the prerequisite for the compiler in Week 10.