CSA-110 Instructor Guide

For instructors and homeschool parents running CSA-110 with one or more students. CSA-110 assumes CSA-101 + CSA-102 graduates: students who have personally built a 6502 on a Tang Primer 25K and written a compiler toolchain that targets it. This guide names the pacing risks, the common stumbles, and how the 6502 background changes the difficulty curve.


Course shape at a glance

Item | Value
Total time | ~155 hours over 14 weeks
Weekly time | ~11 hours student time
Lecture per week | ~3 hours
Lab per week | ~5 hours
Reading + Toolchain Diary | ~3 hours
Audience | CSA-101 + CSA-102 graduates
Hardware | Tang Primer 25K carry-forward from CSA-101 (no new board)
Cost | $0 (board already owned from CSA-101)
Capstone | Working Virtus Console on RV32I-Lite

How CSA-110 differs from CSA-101

CSA-101 was the students' first time building a CPU, a toolchain, and an OS. CSA-110 is their second time. The architectural decisions are different (RISC vs 6502); the execution experience is familiar. Consequences:

  • Weeks 1-5 move faster. HDL workflow, FPGA synthesis flow, simulation loop — already known. Students can focus on the RV32I-Lite specifics instead of fighting tools. Budget the saved time in Weeks 10-13.
  • Architecture Comparison Sidebars are immediate. Every sidebar compares to the 6502 the student literally built. "The 6502 had 3 registers and variable-length encoding; RV32I-Lite has 8 registers and fixed 32-bit encoding" is not historical trivia — it is a difference the student can feel in the lab.
  • Forward references are a genuine surprise. The 6502's flat-address model let Py6502v assemble in one pass on most programs. The B-type offset encoding in RV32I-Lite requires two passes for forward references. This is the most pedagogically rich surprise in Weeks 1-7; do not resolve it early.
  • The write-up comparison (Section 4) is the most distinctive deliverable. A student who has built both stacks can write something no other student can. Coach them toward specificity.

Pacing risks per week

Week | Risk | Mitigation
1 | Student tries to port CSA-101 Verilog directly | Redirect: start from blank module; the NAND-primitive constraint is pedagogical
2 | BCD sidebar feels academic | Have the student write a BCD-addition test in 6502 assembly (from CSA-101 memory) and contrast it with RV32I-Lite's absence of BCD
3 | x0 hardwired-zero is confusing for students who remember the 6502 zero page | Name the distinction explicitly: "zero page" is an address; "x0" is a register hardwired to zero
4 | B-type split-immediate encoding: students encode it wrong the first time | This is expected and intended. The round-trip against riscv64-linux-gnu-as catches every error; the discrepancy is the lesson
5 | Highest-friction week again (same as CSA-101). Silicon bring-up with a new bitstream | Students know the FPGA workflow; the new decoder.v is the variable. Let them debug $display output before flashing
6 | Why two passes? Students know one-pass worked in Py6502v | Surface the forward-reference edge case explicitly before lab; it works better as a demonstration than as a surprise in the lab
7 | Symbol resolution feels like repeating Week 6 | The distinction: assembler resolves intra-file; linker resolves inter-file. The math-stub.vof exercise makes this concrete
8-9 | VM translation: students wrote a translator in CSA-102 | The Py6502v VM targeted 6502 registers; the RV32I-Lite VM uses the stack. The comparison in the Toolchain Diary is the payoff
10-11 | Recursive descent feels familiar from CSA-102 | It should; the structure is the same. Push harder on the RV32I-Lite calling convention and the three-register format's effect on codegen
12 | Multi-file compilation + OS-library calls | Provide a known-good 3-file test program; the student's job is to verify reproduction and note differences from CSA-102's flat model
13 | Virtus OS services: students may try to copy from CSA-101 | Permitted if clearly attributed; the graded deliverable is the Toolchain Diary entry noting what changed
14 | Capstone anxiety, but less than CSA-101 first-timers | The student has done this before. Remind them that the interesting deliverable is the CSA-101 vs CSA-110 comparison, not the program complexity

Common student stumbles

Weeks 1-3

  • "I'll just use my CSA-101 Verilog." Allowed for Week 1-3 reference; not allowed as the submission. Students who copy without understanding fail the metastability and simultaneous-read-write tests in Week 3.
  • x0 confusion. The 6502's accumulator (A) and index registers (X, Y) are named; x0 through x7 are numbered. Students sometimes try to use x0 as a general register. The hardwired-zero property trips them: a write to x0 is discarded, and the next read returns 0.

Weeks 4-5

  • B-type immediate mismatch. This is the most common lab failure in Week 4. The bit rearrangement (imm[12|10:5] in instruction bits 31:25, imm[4:1|11] in bits 11:7) is counterintuitive. The round-trip against riscv64-linux-gnu-as is the prescribed fix; do not skip it.
  • Decoder for 11 instruction types. Students who write case (opcode) without also switching on funct3 and funct7 miss the ADD/SUB distinction (both have opcode 0110011 and funct3 000; funct7[5] distinguishes them). The seeded-failure drill in Lab 5.5 catches this.
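The two Week 4-5 stumbles above can be checked on paper before the round-trip. The following is a minimal sketch, assuming RV32I-Lite keeps the standard RV32I B-type and R-type field positions; the helper names (encode_b_type, decode_b_offset, decode_add_sub) are hypothetical and are not taken from the course toolchain.

```python
# Sketch of B-type encoding and ADD/SUB decoding.
# Assumption: standard RV32I field positions; helper names are illustrative.

OPCODE_BRANCH = 0b1100011
OPCODE_OP = 0b0110011


def encode_b_type(funct3: int, rs1: int, rs2: int, offset: int) -> int:
    """Encode a B-type instruction; offset is a signed, even byte offset."""
    if offset % 2 != 0 or not -4096 <= offset <= 4094:
        raise ValueError(f"branch offset odd or out of range: {offset}")
    imm = offset & 0x1FFF  # 13-bit two's complement; bit 0 is never stored
    return (
        ((imm >> 12) & 0x1) << 31      # imm[12]   -> bit 31
        | ((imm >> 5) & 0x3F) << 25    # imm[10:5] -> bits 30:25
        | (rs2 & 0x1F) << 20
        | (rs1 & 0x1F) << 15
        | (funct3 & 0x7) << 12
        | ((imm >> 1) & 0xF) << 8      # imm[4:1]  -> bits 11:8
        | ((imm >> 11) & 0x1) << 7     # imm[11]   -> bit 7
        | OPCODE_BRANCH
    )


def decode_b_offset(inst: int) -> int:
    """Reassemble the signed byte offset from a B-type instruction word."""
    imm = (
        ((inst >> 31) & 0x1) << 12
        | ((inst >> 7) & 0x1) << 11
        | ((inst >> 25) & 0x3F) << 5
        | ((inst >> 8) & 0xF) << 1
    )
    return imm - 0x2000 if imm & 0x1000 else imm


def decode_add_sub(inst: int) -> str:
    """Classify an instruction word as add, sub, or other."""
    opcode = inst & 0x7F
    funct3 = (inst >> 12) & 0x7
    funct7 = (inst >> 25) & 0x7F
    if opcode != OPCODE_OP or funct3 != 0b000:
        return "other"
    # ADD and SUB share opcode and funct3; only funct7[5] differs.
    return "sub" if funct7 & 0b0100000 else "add"
```

For example, encode_b_type(0, 0, 0, 8) is the word for beq x0, x0, 8, and a student can compare it against the bytes the GNU assembler produces for the same line.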

Weeks 6-7

  • Forward reference edge case. Students who write their test programs without forward references won't discover the two-pass requirement. Require at least one test with a label used before it is defined (the test-forward-ref.s requirement in Lab 6.1).
  • Linker vs assembler responsibility. Students sometimes add relocation support to the assembler to avoid needing a linker. This is architecturally wrong; redirect them: the assembler produces relocatable objects; the linker fixes them.
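Why a forward reference forces a second pass can be shown in a few lines. This is a sketch under stated assumptions: the toy syntax ("label:" lines, "beq rs1, rs2, label"), the fixed 4-byte instruction size, and the function name assemble_branches are all illustrative and do not describe the course assembler.

```python
# Two-pass label resolution sketch (toy syntax, hypothetical names).

def assemble_branches(lines: list[str]) -> list[tuple[int, str, int]]:
    """Return (address, mnemonic, byte_offset) for each branch instruction."""
    # Pass 1: give every label an address; each instruction is 4 bytes.
    symbols: dict[str, int] = {}
    address = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text.endswith(":"):
            symbols[text[:-1]] = address
        else:
            address += 4

    # Pass 2: every label is known now, so forward references resolve.
    resolved = []
    address = 0
    for line in lines:
        text = line.strip()
        if not text or text.endswith(":"):
            continue
        mnemonic, _, operands = text.partition(" ")
        if mnemonic in ("beq", "bne"):
            target = operands.split(",")[-1].strip()
            resolved.append((address, mnemonic, symbols[target] - address))
        address += 4
    return resolved


program = [
    "loop:",
    "beq x1, x2, done",   # forward reference: 'done' is defined later
    "addi x1, x1, 1",
    "beq x0, x0, loop",   # backward reference
    "done:",
    "addi x3, x0, 0",
]
branches = assemble_branches(program)
```

A single-pass version would reach the first beq before done has an address, which is the failure a forward-reference test is meant to expose.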

Weeks 8-11

  • VM calling convention. The stack-based calling convention (push arguments, call, push return value, pop) is different from 6502's JSR + stack-pull. Students who muscle-memory the 6502 pattern will write a translator that works for simple calls but fails on recursive programs (Lab 9.4).
  • Codegen symbol-table scope. Class-level fields vs local variables. Same bug as CSA-101 Week 10-11; same fix (Lab 10.5 Ghidra comparison). If you ran CSA-101, you know where to look.

Weeks 12-13

  • Virtus OS compliance failures. The compliance suite is identical to CSA-101's. Students who did CSA-101 have an advantage here; watch for students who copy their CSA-101 OS implementation verbatim without adapting it to RV32I-Lite register conventions.
  • VCP integration. Same peripherals as CSA-101; same integration issues. The Peripheral IP Pack documentation covers both the 6502 and RV32I-Lite sides.

Toolchain Diary continuity

The Toolchain Diary pattern starts in Week 4 and runs through Week 14. By Week 14 the student should have ~30 entries. Many of these are CSA-102 tools revisited for RV32I-Lite; each such entry should note both the CSA-102 and CSA-110 encounter and record the difference.

Pattern per entry: tool name; what it does; when you used it; one surprise; how it compares to the CSA-102 encounter (if applicable). ~3-5 sentences per entry.

New tools that first appear in CSA-110 (not in CSA-101 or CSA-102):

  • riscv64-linux-gnu-as — the GNU RISC-V assembler; used for round-trip verification
  • riscv64-linux-gnu-objdump — disassembler; used against the student's own assembler output
  • Ghidra with RISC-V:LE:32:RV32I processor module — RISC-V disassembly
  • iverilog and vvp (carry-forward; entry notes CSA-110-specific usage patterns)

Grading rubric for the capstone

Three dimensions equally weighted (per CAPSTONE.md):

  1. The console works (33%). Bitstream flashes; capstone program boots; demo video shows real output.
  2. The toolchain reproduces (33%). A grader can rebuild program-built/program.bin byte-for-byte from toolchain/ + program/; the SHA-256 matches.
  3. The write-up is honest (33%). Five sections show actual reflection; Section 2 names real gaps; Sections 4-5 show specificity about the RV32I-Lite vs 6502 experience.

Use the capstone rubric template at worksheets/TEMPLATE-capstone-rubric.md. The CSA-110-specific addition: a fourth rubric dimension worth up to 10 bonus points for students who produce an exceptionally specific CSA-101-vs-CSA-110 comparison (Section 4 write-up).


When students get stuck

Behind by Week 6 (assembler): Almost always a forward-reference confusion or a B-type encoding error. The round-trip against riscv64-linux-gnu-as is the diagnostic; run it early.

Behind by Week 10 (compiler): The recursive-descent structure should be familiar from CSA-102. If not, have the student re-read their CSA-102 tokenizer. The structures are isomorphic; the difference is the code-generation target.

Behind by Week 12: Offer to extend the capstone delivery. The Virtus OS integration work is the same as CSA-101 Week 13; CSA-101 graduates who get to Week 12 rarely fail to finish.


Forward-pointers

  • CSA-201. Full RV32I; M extension; privileged ISA; MMU; PMP; register allocator; peephole. Direct continuation; the CSA-110 Week 14 capstone is the prerequisite for CSA-201 Week 1.
  • CON-101. Virtus Console as game-dev platform. Uses CSA-110's Virtus OS as the target OS.
  • RE-101. SB6141 reverse engineering. Uses CSA-110's Week 4 encoding fluency and Week 5 silicon instinct; the 32-bit instruction format makes RISC-V binaries easier to disassemble by hand than 6502 binaries.

Logistical notes

  • No new hardware required. Students carry their Tang Primer 25K forward from CSA-101. Emphasize this at enrollment — it reduces the friction of signing up.
  • Cohort pairing with CSA-101 cohorts. Because CSA-110 covers the same hardware (Tang Primer 25K) and the same OS destination (Virtus Console), it can share office-hours slots with CSA-101 in Weeks 1-7. After Week 7 the toolchain topics diverge.
  • Capstone grading turnaround. 10 days from submission.

Week 8: VM I — Stack Arithmetic

Opening hook. Start with a demonstration: write push constant 7; push constant 8; add on the board. Ask: "Where does the result live?" Students who did CSA-102 will say a named memory location. Correct them: it lives at the top of a hardware stack — a pointer-managed region of BRAM.

Pacing table.

Segment | Time | What to cover
Stack model + segment map | 40 min | Draw the stack diagram; name the six segment bases and their register anchors
Push/pop translation | 50 min | Walk through translate_push('constant', 7) line by line; build translate_pop together
Arithmetic and comparison | 40 min | Arithmetic: straightforward; comparison: true=-1 encoding needs explanation
Lab setup | 10 min | Confirm student can run translator.py SimpleAdd.vm before leaving

Three common issues.

  1. Comparison true = -1, not 1. Students who expect true = 1 will pass simple tests but fail when the result of a comparison is used in and/or. Explain: -1 is all bits set; this makes and and or work on boolean results without special cases.
  2. Temp segment absolute address. Students sometimes use an offset from x18 for temp. Redirect: temp is at absolute BRAM address 0x0300; it is not relative to any base pointer.
  3. Label counter for comparisons. Students who forget the label counter will have label collisions in programs with multiple comparisons. The symptom: the second eq instruction jumps to the label from the first eq. Require the seeded-error drill.
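The first and third issues above are easy to demonstrate away from the board. The sketch below assumes a label naming scheme (EQ_TRUE_n, EQ_END_n) and helper names that are hypothetical; it shows only the two ideas, a fresh label pair per comparison and the bitwise behavior of -1 as the true value.

```python
import itertools

VM_TRUE = -1   # all bits set in two's complement
VM_FALSE = 0

# One shared counter; every comparison takes the next id.
_label_ids = itertools.count()


def comparison_labels(op: str) -> tuple[str, str]:
    """Return a fresh (true_label, end_label) pair for one comparison."""
    n = next(_label_ids)
    return f"{op.upper()}_TRUE_{n}", f"{op.upper()}_END_{n}"


first_eq = comparison_labels("eq")
second_eq = comparison_labels("eq")   # must not reuse the first pair

# With true = -1, bitwise operators behave as boolean operators:
all_bits = VM_TRUE & 0xFFFFFFFF       # 0xFFFFFFFF as a 32-bit pattern
not_true = ~VM_TRUE                   # 0, which is VM_FALSE
# With true = 1, bitwise NOT does not produce false:
not_one = ~1                          # -2, still non-zero
```

Without the counter, both eq translations would emit the same label text, which is the collision described in the third issue.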

Petzold weave. Ch 17, "Automation," pp. 223-232. The VM is the automation layer: it lets the compiler write push constant 7 without knowing which register or memory address will hold the value. This is exactly Petzold's argument for why each abstraction layer is worth its complexity cost.

Lab 8 timing note. Lab 8.2 (SimpleAdd) takes 30-45 minutes. Lab 8.3 (StackTest) takes 60-90 minutes for students who encounter the comparison-encoding issue. Budget 30 minutes of buffer for that issue.


Week 9: VM II — Function Calls

Opening hook. Ask the student to trace what happens, instruction by instruction, when factorial(1) calls itself. Most students can answer for one level of recursion; almost none can answer correctly for two levels without drawing the stack. Draw it. The moment the student sees that each frame is independent is the moment the calling convention makes sense.

Pacing table.

Segment | Time | What to cover
Program flow (label/goto/if-goto) | 30 min | Mechanical; students often get this in one pass
The frame layout diagram | 40 min | Draw the full frame; label every slot; ask the student to label the second frame when factorial calls itself
translate_call / translate_function | 50 min | Code walkthrough; emphasize ARG repositioning formula derivation
translate_return | 40 min | Why x18 restore must be last; trace through the restore sequence

Three common issues.

  1. ARG repositioning off-by-one. The formula is sp + (5 + n_args) * 4. Students who write sp + 5 * 4 or sp + n_args * 4 will produce wrong ARGs for any non-trivial call. NestedCall.vm (Lab 9.3) catches this reliably.
  2. Return address saved from the wrong location. The return address is 5 slots above x18 (LCL), not 1 slot. Students who read the lecture pseudocode without tracing the frame diagram get this wrong. Require the frame layout drawing in the Toolchain Diary.
  3. x18 restored first instead of last. If x18 is restored before x19-x21, the restores of x21, x20, and x19 are addressed through the restored (caller's) x18 instead of the callee's frame, where the saved values actually live. The symptom: THIS and THAT have garbage values after return. Ask the student to hand-trace translate_return with x18 restored in each possible order.
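A toy model makes the first and third issues concrete. It is a sketch only: the slot layout (saved x21, x20, x19, x18 at 1 to 4 words above x18, return address at 5) is an assumption chosen to agree with "5 slots above x18", and the names arg_base and restore are hypothetical. The real frame layout is whatever the lecture diagram shows.

```python
WORD = 4  # bytes per stack slot


def arg_base(sp: int, n_args: int) -> int:
    """ARG repositioning formula as quoted in the notes above."""
    return sp + (5 + n_args) * WORD


def restore(mem: dict[int, int], regs: dict[str, int],
            order: list[str]) -> dict[str, int]:
    """Restore saved registers, always addressing the frame through x18."""
    slot = {"x21": 1, "x20": 2, "x19": 3, "x18": 4}  # assumed layout
    out = dict(regs)
    for reg in order:
        # 0xDEAD stands in for whatever garbage sits outside the frame.
        out[reg] = mem.get(out["x18"] + slot[reg] * WORD, 0xDEAD)
    return out


# Callee frame anchored at x18 = 0x100, holding the caller's saved values.
frame = {0x104: 0x700, 0x108: 0x600, 0x10C: 0x500, 0x110: 0x400}
callee = {"x18": 0x100, "x19": 0, "x20": 0, "x21": 0}

good = restore(frame, callee, ["x21", "x20", "x19", "x18"])  # x18 last
bad = restore(frame, callee, ["x18", "x21", "x20", "x19"])   # x18 first
```

In the bad ordering x18 already points at the caller's frame when x21, x20, and x19 are read, so they pick up values from the wrong addresses.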

Petzold weave. Ch 25, "The World Brain," pp. 361-378. The calling convention you implement this week is the lowest rung of the abstraction ladder Petzold describes. Read the passage aloud with the student: "Each layer takes for granted the services of the layer below." The VM calling convention is what the compiler takes for granted.

Lab 9 timing note. Lab 9.4 (recursive factorial) is the capstone of the week. Allow 90 minutes. Students who get the ARG repositioning wrong will get incorrect output for factorial(5) — specifically, a value that is not 120 and not obviously wrong (it may be 15 or 30, depending on the bug). The diagnostic is always the frame-layout drawing.


Week 10: Compiler I — Syntax Analysis

Opening hook. Show the student let x = a + b * c; and ask: "What does the compiler see?" Then show them what the tokenizer produces (a flat list of tokens) and what the parser produces (a tree). The jump from flat list to tree is the entire content of the week.

Pacing table.

Segment | Time | What to cover
Comment stripping + character classification | 30 min | Walk through _strip_comments regex; name the four character classes
The grammar: class/subroutine/statement | 60 min | Write the grammar on the board; circle the recursive rules
Recursive descent: one method per rule | 50 min | Live-code parse_class and parse_subroutine_dec together
Expression/term mutual recursion | 40 min | This is the hardest part; slow down; draw the call graph

Three common issues.

  1. Off-by-one in expression parsing. The expression rule is term (op term)*. Students who write while self.tok.peek().value in ops without checking has_more() will crash on end-of-file. Require the seeded-error drill (Lab 10.4).
  2. String literals not handled. Students skip string literals in the tokenizer because "Square.vl doesn't have any strings." Then Lab 11.3 (Counter.vl with Output.printInt) fails. Require a string-containing test early.
  3. Mutual recursion surprises. parse_expression calls parse_term, which calls parse_expression for parenthesized expressions. Students who are not expecting this are surprised when a deeply nested expression hits Python's recursion limit. Not a real issue for the programs in this course; mention it and move on.
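The first and third issues can be seen in one small parser. This is a minimal sketch: TokenStream is a hypothetical stand-in whose tokens are plain strings (the course tokenizer evidently returns objects with a .value attribute), and the operator set is an assumption.

```python
class TokenStream:
    """Minimal stand-in for the course tokenizer (hypothetical interface)."""

    def __init__(self, tokens: list[str]) -> None:
        self._tokens = tokens
        self._pos = 0

    def has_more(self) -> bool:
        return self._pos < len(self._tokens)

    def peek(self) -> str:
        return self._tokens[self._pos]

    def advance(self) -> str:
        token = self._tokens[self._pos]
        self._pos += 1
        return token


OPS = {"+", "-", "*", "/", "&", "|", "<", ">", "="}  # assumed operator set


def parse_term(tok: TokenStream):
    if not tok.has_more():
        raise SyntaxError("expected a term, found end of input")
    if tok.peek() == "(":
        tok.advance()
        inner = parse_expression(tok)  # mutual recursion happens here
        if not tok.has_more() or tok.advance() != ")":
            raise SyntaxError("expected ')'")
        return inner
    return tok.advance()


def parse_expression(tok: TokenStream):
    """expression := term (op term)*"""
    node = [parse_term(tok)]
    # has_more() must come first: peek() past the end would raise.
    while tok.has_more() and tok.peek() in OPS:
        op = tok.advance()
        node.append((op, parse_term(tok)))
    return node


flat = parse_expression(TokenStream(["a", "+", "b", "*", "c"]))
nested = parse_expression(TokenStream(["(", "a", "+", "b", ")", "*", "c"]))
```

Dropping the has_more() check from the while condition makes the first example raise an IndexError as soon as the last term has been consumed.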

Petzold weave. Ch 24, "Languages High and Low," pp. 337-360. The tokenizer and parser are the first two steps of the translation Petzold describes. The student is now implementing the machinery Petzold describes abstractly.

Lab 10 timing note. Lab 10.3 (parse Square.vl and diff against reference) is the gating exercise. Expect 60-90 minutes. Every student discovers at least one grammar rule they misread the first time; the diff is the diagnostic.


Week 11: Compiler II — Code Generation

Opening hook. Ask the student to write, by hand, the VM bytecode for let x = a + b; where x is a local variable and a, b are instance fields. Most students get the push/pop direction wrong for at least one variable. Correct them using the symbol table: the kind determines the segment; the index determines the offset.

Pacing table.

Segment | Time | What to cover
SymbolTable: two scopes, define, lookup | 40 min | Live-code the two-dict structure; demonstrate shadow behavior
CodeGen: compile_class and compile_subroutine | 50 min | Constructor/method preamble difference is the key point
compile_let / compile_if / compile_while | 50 min | If and while use the same negated-condition branch pattern
Subroutine calls: method vs function vs constructor | 40 min | The push pointer 0 before method calls; the call Memory.alloc 1 for constructors

Three common issues.

  1. Symbol-table scope bug. A local variable with the same name as a class field shadows the field correctly at lookup, but students often define them in the wrong scope (define a local in the class scope, or define a field in the subroutine scope). The symptom: this.x and x compile to the same segment. Ghidra (Lab 11.5) is the prescribed diagnostic.
  2. Void return value. Methods that return void still need a push constant 0; return in the VM. Students who omit the push leave garbage on the stack, which the caller then discards with pop temp 0. Correct in isolation; catastrophic in composition.
  3. Constructor preamble omitted. Students who forget push constant n_fields; call Memory.alloc 1; pop pointer 0 produce constructors that write this fields to address 0. The symptom: a segfault-equivalent (writing to BRAM address 0).
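The scope bug in the first issue is easier to discuss with a two-dict table in hand. The sketch below assumes the kind names (static, field, arg, var) and their segment mapping; the class and the helper vm_for_let_sum are illustrative, not the course's reference implementation.

```python
# Assumed mapping from declaration kind to VM segment.
KIND_TO_SEGMENT = {
    "static": "static",
    "field": "this",
    "arg": "argument",
    "var": "local",
}


class SymbolTable:
    """Two scopes: class-level and subroutine-level."""

    def __init__(self) -> None:
        self.class_scope: dict[str, tuple[str, str, int]] = {}
        self.sub_scope: dict[str, tuple[str, str, int]] = {}
        self._counts = {kind: 0 for kind in KIND_TO_SEGMENT}

    def start_subroutine(self) -> None:
        self.sub_scope = {}
        self._counts["arg"] = 0
        self._counts["var"] = 0

    def define(self, name: str, type_: str, kind: str) -> None:
        # The kind decides the scope; defining in the wrong dict is the bug.
        scope = self.class_scope if kind in ("static", "field") else self.sub_scope
        scope[name] = (type_, kind, self._counts[kind])
        self._counts[kind] += 1

    def lookup(self, name: str) -> tuple[str, int]:
        # Subroutine scope is searched first, so locals shadow fields.
        if name in self.sub_scope:
            _type, kind, index = self.sub_scope[name]
        else:
            _type, kind, index = self.class_scope[name]
        return KIND_TO_SEGMENT[kind], index


def vm_for_let_sum(table: SymbolTable, dest: str, left: str, right: str) -> list[str]:
    """VM lines for 'let dest = left + right;'."""
    lines = []
    for name in (left, right):
        segment, index = table.lookup(name)
        lines.append(f"push {segment} {index}")
    lines.append("add")
    segment, index = table.lookup(dest)
    lines.append(f"pop {segment} {index}")
    return lines


table = SymbolTable()
table.define("a", "int", "field")
table.define("b", "int", "field")
table.define("x", "int", "field")   # a field named x ...
table.start_subroutine()
table.define("x", "int", "var")     # ... shadowed by a local x
code = vm_for_let_sum(table, "x", "a", "b")
```

Here the local x wins the lookup, so the assignment targets the local segment rather than the field of the same name.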

Petzold weave. Ch 22, "The Operating System," pp. 295-332. The code generator emits calls to Memory.alloc for constructors: calls to an OS that has not been written yet. This is the moment the student understands why the OS and the compiler are designed together.

Lab 11 timing note. Lab 11.3 (Counter.vl end-to-end) is the first time the full stack runs: VirtusLang source → VM → assembly → simulation → correct output. Allow 90-120 minutes for students who need to debug the symbol table. Lab 11.5 (Ghidra) requires 30-45 minutes of setup if the student has not used Ghidra before.


Week 12: Compiler III — OS-Aware Compilation

Opening hook. Ask: "What happens when your compiler emits call Output.printInt 1 and there is no Output.printInt in the compiled VM?" Walk through the resolution chain: compiler emits the symbol; VM translator leaves it unresolved; assembler records it in the .reloc table; linker resolves it against the Virtus OS stdlib. This is the same kind of chain that resolves calls into libc.so.6 when a Linux binary is linked and loaded.

Pacing table.

Segment | Time | What to cover
Multi-file compilation: driver pattern | 30 min | compiler.py with glob; why file order matters (or doesn't)
OS call signatures and argument counting | 40 min | OS_SIGNATURES dict; trace an Output.printInt(x) call through the compiler
End-to-end pipeline: 3 files → binary | 50 min | Walk through the shell script; emphasize each stage's role
HDMI vs UART output path | 20 min | HDMI: framebuffer write; UART: character-art escape sequence

Three common issues.

  1. compiler.py hardcoded paths. Students who use ../virtus-os/ as a hardcoded path will fail the Tier 1 gate if the grader runs from a different directory. Require path arguments from the command line.
  2. Wrong argument count for OS calls. A student who miscounts arguments for Screen.drawRectangle (4 arguments) will produce a VM call with the wrong n_args. The linker resolves the symbol; the VM calling convention passes the wrong frame size; the result is a corrupted stack. This is a hard bug to find without the OS_SIGNATURES dict.
  3. Object file order sensitivity. The linker places sections in command-line order. If the student's program assumes Main.vof is linked before Ball.vof, a different order will produce a wrong binary. Require the student to sort input files explicitly.
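All three issues have small mechanical guards. The sketch below assumes that OS_SIGNATURES maps a qualified subroutine name to its argument count; the --os-dir flag and the function names are hypothetical, and only the two argument counts shown come from the notes above.

```python
import argparse

# Assumed shape: "Class.subroutine" -> number of arguments.
OS_SIGNATURES = {
    "Output.printInt": 1,
    "Screen.drawRectangle": 4,
}


def check_os_call(name: str, n_args: int) -> None:
    """Reject an OS call whose argument count contradicts the table."""
    expected = OS_SIGNATURES.get(name)
    if expected is not None and expected != n_args:
        raise ValueError(f"{name} expects {expected} argument(s), got {n_args}")


def build_arg_parser() -> argparse.ArgumentParser:
    """Take every path from the command line instead of hardcoding it."""
    parser = argparse.ArgumentParser(description="compile source files")
    parser.add_argument("sources", nargs="+", help="input source files")
    parser.add_argument("--os-dir", required=True, help="OS library directory")
    parser.add_argument("-o", "--output", required=True, help="output file")
    return parser


def link_order(paths: list[str]) -> list[str]:
    """Sort inputs so the result does not depend on command-line order."""
    return sorted(paths)


args = build_arg_parser().parse_args(
    ["Main.vl", "Ball.vl", "--os-dir", "/opt/virtus-os", "-o", "out.vm"]
)
ordered = link_order(args.sources)
```

The check raises at compile time, which is much earlier than the corrupted stack that a wrong n_args produces at run time.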

Petzold weave. Ch 25, "The World Brain," pp. 361-378, second visit. The complete abstraction ladder is now visible: source → tokens → tree → VM → assembly → binary → silicon. Petzold's hierarchy from relay to transistor to gate to chip to CPU to OS to language maps exactly onto what the student has built.

Lab 12 timing note. Lab 12.2 (compile and link Pong) is the validation exercise. Expect 60-90 minutes. Most time is spent debugging one stage of the pipeline; the diagnostic is to run each stage independently and verify its output before proceeding to the next.


Week 13: Virtus OS

Opening hook. Ask: "Every call Output.printInt your compiler has emitted for three weeks — who implements printInt?" The answer is: you do, this week.

Pacing table.

Segment | Time | What to cover
Math.multiply (shift-and-add) | 40 min | Trace through twoToThe table; explain the O(log N) vs O(N) complexity
Memory.alloc (first-fit free list) | 50 min | Draw the free-list structure; trace alloc, then deAlloc, then alloc again
Output (char bitmap, cursor) | 30 min | The font table + printChar are the whole story
Screen, Keyboard, GamePad, Sys | 40 min | These are simpler; spend more time on the VCP integration setup
Compliance suite | 20 min | Walk through the suite command and expected output

Three common issues.

  1. Math.divide with x < 0. The recursive divide implementation in the lecture handles positive x only. Students must handle the sign themselves, for example by dividing the Math.abs values and negating the quotient when the operand signs differ. The compliance suite tests Math.divide(-10, 3).
  2. Memory.alloc on a full heap. Students who do not call Sys.error(6) on allocation failure will return -1, which the caller will use as a valid pointer. The compliance suite tests this by exhausting the heap.
  3. Virtus OS copied verbatim from CSA-101 without RV32I-Lite adaptation. Permitted if clearly attributed. The most common adaptation failure is in the twoToThe table: on a 16-bit signed target the largest entry that fits is 16384 (2^14), because 32768 (2^15) overflows a 16-bit signed integer. Students who used a 16-bit target in CSA-101 already know this; students who used a wider integer may not.
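The shift-and-add multiply and the sign handling for divide can be prototyped in Python before the student writes the .vl versions. This is a sketch under assumptions: 32-bit signed words, quotients truncated toward zero, and function names of my choosing; the value the compliance suite expects for Math.divide(-10, 3) is not stated here.

```python
WORD_BITS = 32  # assumption: 32-bit signed words

# Only the powers of two that are positive in a signed word: 2^0 .. 2^30.
TWO_TO_THE = [1 << i for i in range(WORD_BITS - 1)]


def multiply(x: int, y: int) -> int:
    """Shift-and-add: one conditional addition per bit of y."""
    negative = (x < 0) != (y < 0)
    x, y = abs(x), abs(y)
    total = 0
    shifted = x
    for power in TWO_TO_THE:
        if power > y:
            break
        if y & power:
            total += shifted
        shifted += shifted  # shifted is now x * 2^(i+1)
    return -total if negative else total


def _divide_positive(x: int, y: int) -> int:
    """Recursive long division for x >= 0 and y > 0."""
    if y > x:
        return 0
    q = _divide_positive(x, y + y)
    if x - 2 * q * y < y:
        return q + q
    return q + q + 1


def divide(x: int, y: int) -> int:
    """Divide on magnitudes, then restore the sign (truncates toward zero)."""
    if y == 0:
        raise ZeroDivisionError("division by zero")
    negative = (x < 0) != (y < 0)
    q = _divide_positive(abs(x), abs(y))
    return -q if negative else q
```

The loop in multiply runs once per bit of y rather than once per unit of y, which is the O(log N) versus O(N) point in the pacing table. Python integers do not overflow, so the y + y step that can overflow on a fixed-width target is not exercised here.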

Petzold weave. Ch 22, "The Operating System," pp. 295-332, final visit. Read the Petzold passage on memory management with the student. Then show the Memory.alloc free-list diagram side by side with Petzold's description. The Virtus OS is a complete implementation of the OS that Petzold describes abstractly.

Lab 13 timing note. Lab 13.1 (Math.vl + Memory.vl compliance) is the gate: do not let the student proceed to Lab 13.2 until 13/13 pass. The compliance suite is fast (< 2 minutes per run); students can iterate quickly. Lab 13.4 (VCP integration) requires the hardware; schedule hardware office hours before the lab due date.


Week 14: Capstone

Opening hook. No lecture needed. This is the week the student ships what they built. Begin with one question: "Does your CPU produce sum-to-N = 55 in simulation?" If the answer is no, address it before anything else.

Pacing table.

Day | Instructor check-in | What to ask
1 | CPU health | "Sum-to-N = 55 in simulation?"
3 | Tier 1 gate dry run | "Run it from /tmp and paste me the sha256sum output"
5 | Write-up Section 4 | "Read me your first CSA-101 comparison point"
7 | Final submission | "Unzip it in a clean directory and run the gate"

Three common issues.

  1. Tier 1 gate SHA-256 mismatch due to non-deterministic output. An assembler or linker that iterates over sets instead of lists may emit symbols or sections in a different order on different runs. Require deterministic ordering everywhere.
  2. Demo video shows only a terminal with no program output. Redirect the student to record actual hardware output (UART or HDMI) rather than a terminal showing a build command.
  3. Section 4 write-up is generic. "RV32I-Lite is RISC" is not a comparison. Ask the student to cite a specific program, a specific instruction count, or a specific failure mode. If they cannot, have them open their Toolchain Diary and read their Week 4 or Week 9 entries — the specific observations are there.
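The non-determinism in the first issue usually comes from iterating over a Python set of strings, whose order can change between interpreter runs because string hashing is randomized by default. A minimal sketch of the fix, with a hypothetical output format and function name:

```python
def emit_symbol_table(symbols: set[str], addresses: dict[str, int]) -> bytes:
    """Serialize symbols in sorted order so every run emits identical bytes."""
    lines = [f"{name} {addresses[name]:08x}\n" for name in sorted(symbols)]
    return "".join(lines).encode("ascii")


addresses = {"main": 0x10, "loop": 0x0, "done": 0xC}
first = emit_symbol_table({"main", "loop", "done"}, addresses)
second = emit_symbol_table({"done", "loop", "main"}, addresses)
```

Wrapping each set iteration in sorted() is usually a one-line change per loop, and it removes this whole class of SHA-256 mismatch.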

Capstone Tier 1 gate check-in protocol.

Per student, in the last office hours before submission close:

  1. Have the student unzip their submission into /tmp/{name}-check/
  2. Run the gate command from toolchain/:
    python3 compiler/compiler.py ../program/Main.vl -o /tmp/main.vm && \
    python3 vm-translator/translator.py /tmp/main.vm -o /tmp/main.s && \
    python3 assembler/asm.py /tmp/main.s -o /tmp/main.vof && \
    python3 linker/linker.py /tmp/main.vof virtus-os/*.vof -o /tmp/prog.bin && \
    sha256sum /tmp/prog.bin
    
  3. Compare to program-built/program.bin.sha256
  4. If mismatch: ask the student to run cmp program-built/program.bin /tmp/prog.bin, which reports the first differing byte, and trace that byte back through the pipeline stages
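For students who would rather script the comparison, the hash check and the search for the first divergent byte can be sketched in Python. The function names are hypothetical; the byte strings would come from reading the reference binary and the rebuilt binary in binary mode.

```python
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    """Hex digest in the same form that sha256sum prints."""
    return hashlib.sha256(data).hexdigest()


def first_divergent_offset(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(expected, actual)):
        if a != b:
            return offset
    if len(expected) != len(actual):
        # One file is a strict prefix of the other.
        return min(len(expected), len(actual))
    return None


reference = bytes([0x13, 0x00, 0x00, 0x00, 0x63, 0x04, 0x00, 0x00])
rebuilt = bytes([0x13, 0x00, 0x00, 0x00, 0x63, 0x08, 0x00, 0x00])
offset = first_divergent_offset(reference, rebuilt)
```

Dividing the offset by 4 gives the index of the instruction word that differs, which points the student at a specific line of assembly.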

Instructor guide v0.2. Full Week 8-14 notes added 2026-05-30.