Week 2: Privileged ISA + ECALL Trap

You wrote Virtus OS v1 for a CPU that had only one mode. Every instruction ran at machine level. There was no concept of "this code is the OS" vs "this code is a user program." This week you fix that.


Reading

Required. Petzold, CODE, Ch 22 ("The Operating System"). Petzold traces the origin of the supervisor mode to timesharing mainframes of the 1960s: the IBM 7094, the Compatible Time-Sharing System at MIT. The insight that led to supervisor mode was simple and necessary. When multiple users share a machine, one user's program must not be able to overwrite another's memory, halt the machine, or reconfigure the hardware. The hardware solution was a mode bit: in privileged mode, all instructions are allowed; in user mode, a class of instructions traps to the OS instead. Read Ch 22 as the historical context for the ECALL instruction you will implement this week.

Required. Waterman and Asanovic, RISC-V ISA Manual Volume II: Privileged Architecture, sections 1.1 (privilege levels), 3.1 (machine-level CSRs: mstatus, mtvec, mepc, mcause, mtval, mscratch), 3.1.12 (machine trap handling). Read the trap delivery flow carefully: when an ECALL executes, the CPU saves the current PC to mepc, writes the cause code to mcause (value 8 for ECALL from U-mode, 11 for ECALL from M-mode), jumps to the address in mtvec, and switches to M-mode. MRET reads mepc back into PC and restores the mode.

Required. Bryant and O'Hallaron, CSAPP, Chapter 8, sections 8.1-8.2 (exceptions and system calls). The CSAPP treatment is x86_64-specific but the concepts transfer directly: an exception is a transfer of control to the OS in response to a processor event. Section 8.2 shows how Linux uses syscall (the x86_64 equivalent of ECALL) to gate user-mode access to kernel services.


Lecture: The Trap Mechanism

Why mode separation exists

Your CSA-101 CPU ran everything at machine level. The Virtus OS v1 kernel and the user application were both in the same address space, with no hardware enforcement separating them. A buggy user program could corrupt kernel data. A malicious user program could execute csrw mtvec, <attacker_address> to redirect all traps to attacker code.

The RISC-V privileged architecture adds a mode bit to the CPU state. In M-mode (machine mode), all instructions and CSRs are accessible. In U-mode (user mode), privileged instructions cause an illegal-instruction exception. The transition from U to M happens only through the trap mechanism, giving the OS full control of the interface.
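
The check behind that illegal-instruction exception can be sketched in C as a software model. It relies on the CSR address convention in the privileged spec: address bits 9:8 give the lowest privilege level allowed to access the CSR, and bits 11:10 equal to 0b11 mark it read-only. The helper name csr_access_ok and the enum are hypothetical; this is not the Verilog you will write for privilege.v, only an illustration of the rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Privilege level encodings from the RISC-V privileged spec. */
enum priv { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };

/* Standard CSR addresses. */
#define CSR_MSTATUS   0x300u
#define CSR_MTVEC     0x305u
#define CSR_MSCRATCH  0x340u
#define CSR_MEPC      0x341u
#define CSR_MCAUSE    0x342u
#define CSR_MTVAL     0x343u
#define CSR_MVENDORID 0xF11u   /* machine-level, read-only */

/* Hypothetical helper: returns true when the access is legal and false
 * when the CPU should raise an illegal-instruction exception (mcause 2). */
static bool csr_access_ok(uint16_t csr, enum priv cur, bool is_write)
{
    assert(csr < 0x1000u);                        /* CSR addresses are 12 bits */
    unsigned min_priv  = (csr >> 8) & 0x3u;       /* bits 9:8: lowest allowed level */
    bool     read_only = ((csr >> 10) & 0x3u) == 0x3u;  /* bits 11:10 == 0b11 */

    if ((unsigned)cur < min_priv)
        return false;                             /* e.g. U-mode touching mtvec */
    if (is_write && read_only)
        return false;                             /* writes to read-only CSRs trap */
    return true;
}
```

With this model, csr_access_ok(CSR_MTVEC, PRIV_U, true) is false, which is exactly the attack from the previous section being turned into a trap.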

RISC-V defines three privilege levels: M (machine), S (supervisor), and U (user). CSA-201 implements U and M in Modules 2-5. S-mode (the supervisor level used by Linux kernel code) is an optional extension that Modules 6 and 7 add when you implement the full Sv32 MMU that requires it.

The six trap CSRs

Six machine-level CSRs control the trap mechanism:

mstatus (machine status register). The most important bits: MPP (bits 12:11) holds the privilege mode that was active when the trap occurred; MIE (bit 3) is the global machine-level interrupt enable. When a trap fires, the CPU copies MIE into MPIE (bit 7), clears MIE (disabling further interrupts), and writes the previous mode into MPP. MRET undoes this: it restores MIE from MPIE and returns to the mode held in MPP.
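
As a rough illustration, the mstatus updates on trap entry and on MRET can be modeled in C. The sketch also tracks MPIE (bit 7), which per the privileged spec holds the pre-trap value of MIE. The function names are hypothetical, only U and M modes are modeled, and other mstatus fields (such as MPRV) are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MSTATUS_MIE        (1u << 3)    /* global machine interrupt enable */
#define MSTATUS_MPIE       (1u << 7)    /* MIE value from before the trap */
#define MSTATUS_MPP_SHIFT  11
#define MSTATUS_MPP_MASK   (3u << MSTATUS_MPP_SHIFT)

/* Hypothetical model of what hardware does to mstatus on trap entry. */
static uint32_t mstatus_on_trap(uint32_t mstatus, unsigned cur_priv)
{
    assert(cur_priv == 0u || cur_priv == 3u);    /* U = 0, M = 3 in this sketch */

    if (mstatus & MSTATUS_MIE)                   /* MPIE <- MIE */
        mstatus |= MSTATUS_MPIE;
    else
        mstatus &= ~MSTATUS_MPIE;
    mstatus &= ~MSTATUS_MIE;                     /* MIE <- 0 */
    mstatus = (mstatus & ~MSTATUS_MPP_MASK)      /* MPP <- previous mode */
            | ((uint32_t)cur_priv << MSTATUS_MPP_SHIFT);
    return mstatus;
}

/* Hypothetical model of MRET: reports the mode to return to via new_priv. */
static uint32_t mstatus_on_mret(uint32_t mstatus, unsigned *new_priv)
{
    *new_priv = (mstatus & MSTATUS_MPP_MASK) >> MSTATUS_MPP_SHIFT;

    if (mstatus & MSTATUS_MPIE)                  /* MIE <- MPIE */
        mstatus |= MSTATUS_MIE;
    else
        mstatus &= ~MSTATUS_MIE;
    mstatus |= MSTATUS_MPIE;                     /* MPIE <- 1 */
    mstatus &= ~MSTATUS_MPP_MASK;                /* MPP <- U (least privileged) */
    return mstatus;
}
```

Feeding the result of mstatus_on_trap into mstatus_on_mret returns the original mode and interrupt-enable state, which is the round trip your waveform should show in Lab 2.1.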

mtvec (machine trap-vector base-address register). The trap handler address. Two modes: direct (all traps jump to BASE) and vectored (exceptions jump to BASE; interrupts jump to BASE + 4*cause). For CSA-201, use direct mode; set mtvec to the address of your assembly trap handler at boot time.
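
The two mtvec modes can be sketched as a small C function that computes the trap target. It assumes the spec's layout, with MODE in the low two bits (0 = direct, 1 = vectored) and a 4-byte-aligned BASE in the remaining bits. The name trap_target is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: where the PC goes when a trap is taken. */
static uint32_t trap_target(uint32_t mtvec, bool is_interrupt, uint32_t cause)
{
    uint32_t mode = mtvec & 0x3u;     /* MODE field: 0 = direct, 1 = vectored */
    uint32_t base = mtvec & ~0x3u;    /* BASE with the mode bits masked off */

    assert(mode <= 1u);               /* modes 2 and 3 are reserved */

    if (mode == 1u && is_interrupt)
        return base + 4u * cause;     /* vectored: one slot per interrupt cause */
    return base;                      /* direct mode, and all exceptions */
}
```

An ECALL is an exception, so it lands on BASE in either mode; only interrupts are spread across the vector table.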

mepc (machine exception program counter). The CPU writes the address of the instruction that caused the trap here. For ECALL, this is the address of the ECALL instruction itself. MRET loads PC from mepc unchanged, so the trap handler must add 4 to mepc before MRET to advance past the ECALL.

mcause (machine cause register). On RV32, the interrupt bit (bit 31) plus a 31-bit exception code. Codes relevant to CSA-201: 0 (instruction address misaligned), 2 (illegal instruction), 8 (ECALL from U-mode), 11 (ECALL from M-mode), 12 (instruction page fault), 13 (load page fault), 15 (store/AMO page fault).
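
A minimal C sketch of decoding mcause on RV32 follows. The helper names are hypothetical, and the name table covers only the codes listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bit 31 distinguishes interrupts from exceptions on RV32. */
static bool mcause_is_interrupt(uint32_t mcause)
{
    return (mcause >> 31) != 0u;
}

/* The low 31 bits carry the cause code. */
static uint32_t mcause_code(uint32_t mcause)
{
    return mcause & 0x7FFFFFFFu;
}

/* Hypothetical helper: human-readable name for the CSA-201 exception codes. */
static const char *mcause_name(uint32_t mcause)
{
    if (mcause_is_interrupt(mcause))
        return "interrupt";
    switch (mcause_code(mcause)) {
    case 0:  return "instruction address misaligned";
    case 2:  return "illegal instruction";
    case 8:  return "ECALL from U-mode";
    case 11: return "ECALL from M-mode";
    case 12: return "instruction page fault";
    case 13: return "load page fault";
    case 15: return "store/AMO page fault";
    default: return "other exception";
    }
}
```

A helper like mcause_name is handy for debug prints in the simulator testbench; the dispatch logic itself only needs the interrupt bit and the code.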

mtval (machine trap value). For illegal-instruction exceptions, holds the offending instruction word. For memory-access exceptions, holds the faulting address. For ECALL, holds zero.

mscratch (machine scratch register). One register reserved for the trap handler's exclusive use. Typically used to save a user-mode register value temporarily at the top of the trap handler before you can access the kernel stack.

ECALL: The syscall instruction

ECALL is a synchronous, precise exception: it fires immediately when the instruction reaches the execute stage, saving the exact PC. The calling convention for RISC-V system calls (compatible with Linux): a7 holds the syscall number; a0-a5 hold arguments; the return value is written into a0 by the kernel before MRET.

A minimal Virtus OS v2 syscall table for Module 2:

a7 value   Syscall     Description
64         SYS_WRITE   Write to output (a0 = fd, a1 = buf addr, a2 = length)
93         SYS_EXIT    Exit the current process (a0 = exit code)

Trap handler structure

The trap handler must save every general-purpose register (x1-x31; x0 is hardwired to zero and needs no saving) before calling any C code, because any C function might use any register. The standard pattern uses the kernel stack:

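.global _trap_handler
_trap_handler:
    # Assumes boot code stored the kernel stack top in mscratch.
    csrrw sp, mscratch, sp      # swap: sp = kernel stack, mscratch = user sp
    addi  sp, sp, -128          # make room for 32 * 4 bytes (slot 0 unused)
    sw    x1,   4(sp)           # save ra
    sw    x3,  12(sp)           # save gp
    # ... save x4-x31 ...
    sw    x31, 124(sp)
    csrr  t0, mscratch          # t0 is saved now, so it is free to use
    sw    t0,   8(sp)           # save user sp in the x2 slot

    csrr  a0, mcause            # pass cause to C handler
    csrr  a1, mepc              # pass faulting PC
    mv    a2, sp                # pass pointer to the saved registers
    call  trap_dispatch         # C function: dispatches by mcause

    # Advance mepc before restoring, so t0 is not clobbered afterwards.
    # Correct for ECALL only; other causes need their own mepc policy.
    csrr  t0, mepc
    addi  t0, t0, 4             # advance past ECALL instruction
    csrw  mepc, t0

    # restore all registers except sp
    lw    x1,   4(sp)
    lw    x3,  12(sp)
    # ... restore x4-x31 ...
    lw    x31, 124(sp)
    addi  sp, sp, 128           # sp = kernel stack top again
    csrrw sp, mscratch, sp      # swap back: sp = user sp, mscratch = kernel stack
    mret                        # return to U-mode at mepc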

The C trap_dispatch function reads a7 from the saved register file on the stack, looks up the syscall number in a table, calls the handler, and writes the return value into the saved a0 slot before returning.
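
A minimal sketch of such a trap_dispatch, runnable on a host machine, might look like the following. Several things here are assumptions rather than the course's reference code: the third argument (a pointer to the saved registers, which the assembly handler would have to pass in a2), the trap_frame layout (regs[n] holds xn, matching the n*4(sp) offsets in the listing), the user_mem and console arrays standing in for real memory and the output device, and the use of -1 as a generic error return. SYS_WRITE ignores the fd argument in this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SYS_WRITE      64u
#define SYS_EXIT       93u
#define CAUSE_ECALL_U  8u

/* Assumed layout: regs[n] is the saved value of xn (slot 0 unused). */
struct trap_frame { uint32_t regs[32]; };
enum { REG_A0 = 10, REG_A1 = 11, REG_A2 = 12, REG_A7 = 17 };

/* Host-side stand-ins for hardware (assumptions for this sketch). */
static uint8_t  user_mem[256];   /* pretend user address space */
static char     console[256];    /* pretend output device */
static uint32_t console_len;
static int      halted;
static uint32_t exit_code;

static uint32_t sys_write(struct trap_frame *f)
{
    uint32_t buf = f->regs[REG_A1];
    uint32_t len = f->regs[REG_A2];

    /* Reject buffers outside user memory or too large for the console. */
    if (buf > sizeof user_mem || len > sizeof user_mem - buf)
        return (uint32_t)-1;
    if (len > sizeof console - console_len)
        return (uint32_t)-1;
    memcpy(console + console_len, user_mem + buf, len);
    console_len += len;
    return len;                              /* bytes written */
}

static uint32_t sys_exit(struct trap_frame *f)
{
    halted = 1;                              /* the testbench would stop here */
    exit_code = f->regs[REG_A0];
    return 0;
}

typedef uint32_t (*syscall_fn)(struct trap_frame *);

/* Map a syscall number to its handler; NULL means "not implemented". */
static syscall_fn syscall_lookup(uint32_t num)
{
    switch (num) {
    case SYS_WRITE: return sys_write;
    case SYS_EXIT:  return sys_exit;
    default:        return NULL;
    }
}

void trap_dispatch(uint32_t mcause, uint32_t mepc, struct trap_frame *f)
{
    (void)mepc;                              /* unused in this sketch */
    assert(f != NULL);

    if (mcause != CAUSE_ECALL_U) {           /* anything else: stop the machine */
        halted = 1;
        return;
    }
    syscall_fn fn = syscall_lookup(f->regs[REG_A7]);
    /* The return value goes into the saved a0 slot, so the restore
     * sequence delivers it to the user program. */
    f->regs[REG_A0] = fn ? fn(f) : (uint32_t)-1;
}
```

Writing the result into the saved frame rather than returning it from C matters: the assembly restores a0 from the frame, so a plain C return value would be overwritten.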

Architecture Comparison Sidebar: Trap delivery mechanisms

Every architecture that supports OS-managed processes needs a controlled user-to-supervisor transition. The mechanisms differ significantly.

x86_64 SYSCALL/SYSRET. SYSCALL does not use a trap gate; it is a fast-path instruction that reads the target address from the LSTAR MSR and switches to ring 0, saving the user-mode RIP into RCX and RFLAGS into R11. SYSRET reverses. No stack switch happens in hardware; the kernel saves the user stack pointer and switches to the kernel stack in software. The call convention: RAX holds the syscall number; RDI, RSI, RDX, R10, R8, R9 hold arguments.

ARM64 SVC / ERET. SVC #imm (supervisor call) traps from EL0 (user) to EL1 (kernel). The immediate is typically 0 on Linux; the actual syscall number is in x8. ELR_EL1 saves the return address. ERET returns.

RISC-V ECALL / MRET. No immediate operand; the syscall number is in a7 by convention. On a trap, the hardware writes the ECALL's address to mepc, the cause code to mcause, and the previous mode to mstatus.MPP. MRET reverses. Cleaner than x86_64's MSR-based scheme; similar to ARM64 but with separate M/S/U levels giving finer control.

Why RISC-V has three levels while Linux uses two. Linux collapses M-mode into firmware (OpenSBI or BBL handles M-mode and jumps to the Linux kernel at S-mode). User code runs at U-mode. CSA-201 runs U/M only because there is no separate OS platform firmware; the Virtus OS v2 kernel IS the M-mode code. Modules 6-7 will add S-mode delegation.


Lab exercises

See labs/lab-2-ecall-trap.md for the full specification.

Lab 2.1: First user-to-supervisor transition. Implement privilege.v and trap_ctrl.v in your CSA-201 CPU branch. Write a minimal trap handler in assembly that:

  1. Saves registers x1-x31 to the kernel stack.
  2. Reads mcause to confirm the trap is an ECALL from U-mode, then dispatches on a7: SYS_WRITE (output the string from the user buffer) and SYS_EXIT (halt the simulation).
  3. Returns to the user program via MRET.

Write a user-mode program in assembly that calls SYS_WRITE twice (output "hello\n" and "world\n") then calls SYS_EXIT.

Run the combined kernel + user program under Verilator. Use SignalTap (or the Verilator waveform dump) to verify: the mode bit transitions U->M on ECALL and M->U on MRET. Record the cycle count for one complete round-trip (ECALL to MRET return to user code). This is your trap overhead baseline for Module 11's scheduler design.

Run the rv32mi-p-* riscv-tests suite against your Verilator sim. All machine-level trap tests must pass.


Independent practice

  1. Read Petzold Ch 22 as assigned. Identify the paragraph where Petzold describes the "system call" concept and its historical origin. Write a one-paragraph Toolchain Diary entry tracing the path from Petzold's timesharing mainframes to your ECALL implementation.

  2. Read the mstatus register description in RISC-V Privileged Architecture Volume II section 3.1.6. Draw the bit field layout for mstatus. Mark which bits your trap handler reads and writes. Why does the CPU automatically clear mstatus.MIE on trap entry? What would go wrong if it did not?

  3. Write a table comparing the register save/restore sequence for x86_64 SYSCALL (Linux calling convention), ARM64 SVC (Linux), and RISC-V ECALL (Linux/OpenSBI convention). How many registers does each convention require saving in the kernel entry point?

  4. Toolchain Diary entry: QEMU. Boot the riscv32-virt machine with your trap-handler binary using qemu-system-riscv32 -machine virt -nographic -bios none -kernel <your-binary>. Record the QEMU version, the command line, and what you observe. Note any differences between QEMU's behavior and your Verilator simulation. (The virt machine has different device memory maps; do not worry about matching QEMU hardware exactly. The goal is to see your trap code run on a second simulator.)


Reflection prompts

  1. The mscratch register exists so the trap handler can use it as a temporary before the kernel stack is accessible. Why does the trap handler need a temporary register before it has saved any registers? Trace through the first three instructions of the trap handler and explain what would go wrong if mscratch did not exist.

  2. The trap handler adds 4 to mepc before executing MRET, so execution resumes at mepc + 4 (past the ECALL) rather than at mepc (the ECALL itself). Why? What would happen if the trap handler returned to mepc without advancing?

  3. The RISC-V privileged spec defines two mtvec modes: direct (all traps to BASE) and vectored (exceptions to BASE, interrupts to BASE + 4*cause). CSA-201 uses direct mode. When would vectored mode be preferable? What does the kernel lose by using vectored mode?


What's next

Module 3 switches from hardware to the compiler. The trap mechanism you built this week provides the ECALL interface your compiler can now emit calls through. Module 3 adds a register allocator pass to your CSA-101 compiler: it assigns the 32 available registers to intermediate variables instead of always spilling to the stack. Lab 3.1 measures the emit-size reduction against the Lab 7.4 forward-promise from CSA-101.