For instructors and homeschool parents running CSA-110 with one or more students. CSA-110 assumes CSA-101 + CSA-102 graduates: students who have personally built a 6502 on a Tang Primer 25K and written a compiler toolchain that targets it. This guide names the pacing risks, the common stumbles, and how the 6502 background changes the difficulty curve.
Course shape at a glance
| Item | Value |
|---|---|
| Total time | ~155 hours over 14 weeks |
| Weekly time | ~11 hours student time |
| Lecture per week | ~3 hours |
| Lab per week | ~5 hours |
| Reading + Toolchain Diary | ~3 hours |
| Audience | CSA-101 + CSA-102 graduates |
| Hardware | Tang Primer 25K carry-forward from CSA-101 (no new board) |
| Cost | $0 (board already owned from CSA-101) |
| Capstone | Working Virtus Console on RV32I-Lite |
How CSA-110 differs from CSA-101
CSA-101 was the students' first time building a CPU, a toolchain, and an OS. CSA-110 is their second time. The architectural decisions are different (RISC vs 6502); the execution experience is familiar. Consequences:
- Weeks 1-5 move faster. HDL workflow, FPGA synthesis flow, simulation loop — already known. Students can focus on the RV32I-Lite specifics instead of fighting tools. Budget the saved time in Weeks 10-13.
- Architecture Comparison Sidebars are immediate. Every sidebar compares to the 6502 the student literally built. "The 6502 had 3 registers and variable-length encoding; RV32I-Lite has 8 registers and fixed 32-bit encoding" is not historical trivia — it is a difference the student can feel in the lab.
- Forward references are a genuine surprise. The 6502's flat-address model let Py6502v assemble in one pass on most programs. The B-type offset encoding in RV32I-Lite requires two passes for forward references. This is the most pedagogically rich surprise in Weeks 1-7; do not resolve it early.
- The write-up comparison (Section 4) is the most distinctive deliverable. A student who has built both stacks can write something no other student can. Coach them toward specificity.
Pacing risks per week
| Week | Risk | Mitigation |
|---|---|---|
| 1 | Student tries to port CSA-101 Verilog directly | Redirect: start from blank module; the NAND-primitive constraint is pedagogical |
| 2 | BCD sidebar feels academic | Have the student write a BCD-addition test in 6502 assembly (from CSA-101 memory) and contrast it with RV32I-Lite's absence of BCD |
| 3 | x0 hardwired-zero is confusing for students who remember the 6502 zero page | Name the distinction explicitly: "zero page" is an address; "x0" is a register hardwired to zero |
| 4 | B-type split-immediate encoding: students encode it wrong the first time | This is expected and intended. The round-trip against riscv64-linux-gnu-as catches every error; the discrepancy is the lesson |
| 5 | Highest-friction week again (same as CSA-101). Silicon bring-up with a new bitstream | Students know the FPGA workflow; the new decoder.v is the variable. Let them debug $display output before flashing |
| 6 | Why two passes? Students know one-pass worked in Py6502v | Surface the forward-reference edge case explicitly before lab; it lands better as an instructor demonstration than as an unexplained failure in the lab |
| 7 | Symbol resolution feels like repeating Week 6 | The distinction: assembler resolves intra-file; linker resolves inter-file. The math-stub.vof exercise makes this concrete |
| 8-9 | VM translation: students wrote a translator in CSA-102 | The Py6502v VM targeted 6502 registers; the RV32I-Lite VM uses the stack. The comparison in the Toolchain Diary is the payoff |
| 10-11 | Recursive descent feels familiar from CSA-102 | It should; the structure is the same. Push harder on the RV32I-Lite calling convention and the three-register format's effect on codegen |
| 12 | Multi-file compilation + OS-library calls | Provide a known-good 3-file test program; the student's job is to verify reproduction and note differences from CSA-102's flat model |
| 13 | Virtus OS services: students may try to copy from CSA-101 | Permitted if clearly attributed; the graded deliverable is the Toolchain Diary entry noting what changed |
| 14 | Capstone anxiety, but less than CSA-101 first-timers | The student has done this before. Remind them that the interesting deliverable is the CSA-101 vs CSA-110 comparison, not the program complexity |
Common student stumbles
Weeks 1-3
- "I'll just use my CSA-101 Verilog." Allowed for Week 1-3 reference; not allowed as the submission. Students who copy without understanding fail the metastability and simultaneous-read-write tests in Week 3.
- x0 confusion. The 6502's accumulator (A) and index registers (X, Y) are named; x0 through x7 are numbered. Students sometimes try to use x0 as a general register. The hardwired-zero property trips them.
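The hardwired-zero property can be shown in a few lines before students meet it in HDL. The sketch below is a behavioral model in Python, not the course's Verilog; the RegisterFile class and its method names are hypothetical, and the register count of 8 follows the x0 through x7 naming above.

```python
class RegisterFile:
    """Behavioral model of a small register file with x0 hardwired to zero."""

    def __init__(self, count=8):
        self._regs = [0] * count

    def write(self, index, value):
        # Writes to x0 are silently dropped; other registers keep 32 bits.
        if index != 0:
            self._regs[index] = value & 0xFFFFFFFF

    def read(self, index):
        # x0 always reads as zero, regardless of any earlier write.
        return 0 if index == 0 else self._regs[index]


rf = RegisterFile()
rf.write(0, 1234)  # attempt to use x0 as a general register
rf.write(1, 1234)
print(rf.read(0), rf.read(1))
```

A student who expects the first write to stick sees the difference immediately when both registers are read back.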
Weeks 4-5
- B-type immediate mismatch. This is the most common lab failure in Week 4. The bit rearrangement (imm[12|10:5] in instruction bits 31:25, imm[4:1|11] in bits 11:7) is counterintuitive. The round-trip against riscv64-linux-gnu-as is the prescribed fix; do not skip it.
- Decoder for 11 instruction types. Students who write case (opcode) without also examining funct3 and funct7 miss the ADD/SUB distinction (both have opcode 0110011 and funct3 000; funct7[5] distinguishes them). The seeded-failure drill in Lab 5.5 catches this.
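For instructors who want a reference to check student encoders against, here is a minimal sketch of the B-type split-immediate packing. It assumes RV32I-Lite uses the standard RV32I B-type field layout (which the round-trip against the GNU assembler implies); the function names are hypothetical and are not taken from the course assembler.

```python
BRANCH_OPCODE = 0b1100011  # standard RV32I opcode for conditional branches


def encode_b_type(funct3, rs1, rs2, offset):
    """Pack a branch into a 32-bit word using the split B-type immediate."""
    if offset % 2 != 0:
        raise ValueError("branch offset must be even")
    if not -4096 <= offset <= 4094:
        raise ValueError("offset does not fit a 13-bit signed immediate")
    imm = offset & 0x1FFF  # two's-complement view of the 13-bit immediate
    return (
        ((imm >> 12) & 0x1) << 31     # imm[12]   -> bit 31
        | ((imm >> 5) & 0x3F) << 25   # imm[10:5] -> bits 30:25
        | (rs2 & 0x1F) << 20
        | (rs1 & 0x1F) << 15
        | (funct3 & 0x7) << 12
        | ((imm >> 1) & 0xF) << 8     # imm[4:1]  -> bits 11:8
        | ((imm >> 11) & 0x1) << 7    # imm[11]   -> bit 7
        | BRANCH_OPCODE
    )


def decode_b_offset(word):
    """Recover the signed byte offset from a B-type instruction word."""
    imm = (
        ((word >> 31) & 0x1) << 12
        | ((word >> 7) & 0x1) << 11
        | ((word >> 25) & 0x3F) << 5
        | ((word >> 8) & 0xF) << 1
    )
    return imm - 0x2000 if imm & 0x1000 else imm


print(hex(encode_b_type(0b000, 1, 2, 8)))  # beq x1, x2, +8
```

The encode/decode pair gives a quick self-check, but it does not replace the round-trip against riscv64-linux-gnu-as: a student who misreads the layout will usually misread it the same way in both directions.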
Weeks 6-7
- Forward reference edge case. Students who write their test programs without forward references won't discover the two-pass requirement. Require at least one test with a label used before it is defined (the test-forward-ref.s requirement in Lab 6.1).
- Linker vs assembler responsibility. Students sometimes add relocation support to the assembler to avoid needing a linker. This is architecturally wrong; redirect them: the assembler produces relocatable objects; the linker fixes them.
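The two-pass idea fits on one screen. The sketch below is a toy, not the course assembler: it assumes 4-byte instructions, labels written as name: on their own line, and # comments, and it only rewrites a trailing label operand into a PC-relative offset. The assemble function and its output tuples are hypothetical.

```python
def assemble(lines):
    """Resolve labels in two passes; each instruction occupies 4 bytes."""
    symbols = {}
    instructions = []

    # Pass 1: record every label's address without resolving anything.
    address = 0
    for raw in lines:
        line = raw.split("#")[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = address
        else:
            instructions.append((address, line))
            address += 4

    # Pass 2: all labels are known, so forward references resolve normally.
    resolved = []
    for address, line in instructions:
        mnemonic, *operands = line.replace(",", " ").split()
        if operands and operands[-1] in symbols:
            operands[-1] = str(symbols[operands[-1]] - address)
        resolved.append((address, mnemonic, operands))
    return resolved


program = [
    "beq x1, x2, done   # label used before it is defined",
    "addi x1, x1, 1",
    "done:",
    "addi x3, x0, 0",
]
for entry in assemble(program):
    print(entry)
```

A one-pass version would reach the beq before done has an address, which is exactly the failure a forward-reference test is meant to expose.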
Weeks 8-11
- VM calling convention. The stack-based calling convention (push arguments, call, push return value, pop) is different from the 6502's JSR + stack-pull. Students who apply the 6502 pattern from muscle memory will write a translator that works for simple calls but fails on recursive programs (Lab 9.4).
- Codegen symbol-table scope. Class-level fields vs local variables. Same bug as CSA-101 Week 10-11; same fix (Lab 10.5 Ghidra comparison). If you ran CSA-101, you know where to look.
Weeks 12-13
- Virtus OS compliance failures. The compliance suite is identical to CSA-101's. Students who did CSA-101 have an advantage here; watch for students who copy their CSA-101 OS implementation verbatim without adapting it to RV32I-Lite register conventions.
- VCP integration. Same peripherals as CSA-101; same integration issues. The Peripheral IP Pack documentation covers both the 6502 and RV32I-Lite sides.
Toolchain Diary continuity
The Toolchain Diary pattern starts in Week 4 and runs through Week 14. By Week 14 the student should have ~30 entries. Many of these are CSA-102 tools revisited for RV32I-Lite; each such entry should note both the CSA-102 and CSA-110 encounter and record the difference.
Pattern per entry: tool name; what it does; when you used it; one surprise; how it compares to the CSA-102 encounter (if applicable). ~3-5 sentences per entry.
New tools that first appear in CSA-110 (not in CSA-101 or CSA-102):
- riscv64-linux-gnu-as: the GNU RISC-V assembler; used for round-trip verification
- riscv64-linux-gnu-objdump: disassembler; used against the student's own assembler output
- Ghidra with RISC-V:LE:32:RV32I processor module: RISC-V disassembly
- iverilog and vvp (carry-forward; entry notes CSA-110-specific usage patterns)
Grading rubric for the capstone
Three dimensions equally weighted (per CAPSTONE.md):
- The console works (33%). Bitstream flashes; capstone program boots; demo video shows real output.
- The toolchain reproduces (33%). A grader can rebuild program-built/program.bin byte-for-byte from toolchain/ + program/; the SHA-256 matches.
- The write-up is honest (33%). Five sections show actual reflection; Section 2 names real gaps; Sections 4-5 show specificity about the RV32I-Lite vs 6502 experience.
Use the capstone rubric template at worksheets/TEMPLATE-capstone-rubric.md. The CSA-110-specific addition: a fourth rubric dimension worth up to 10 bonus points for students who produce an exceptionally specific CSA-101-vs-CSA-110 comparison (Section 4 write-up).
When students get stuck
Behind by Week 6 (assembler): Almost always a forward-reference confusion or a B-type encoding error. The round-trip against riscv64-linux-gnu-as is the diagnostic; run it early.
Behind by Week 10 (compiler): The recursive-descent structure should be familiar from CSA-102. If not, have the student re-read their CSA-102 tokenizer. The structures are isomorphic; the difference is the code-generation target.
Behind by Week 12: Offer to extend the capstone delivery. The Virtus OS integration work is the same as CSA-101 Week 13; CSA-101 graduates who get to Week 12 rarely fail to finish.
Forward-pointers
- CSA-201. Full RV32I; M extension; privileged ISA; MMU; PMP; register allocator; peephole. Direct continuation; the CSA-110 Week 14 capstone is the prerequisite for CSA-201 Week 1.
- CON-101. Virtus Console as game-dev platform. Uses CSA-110's Virtus OS as the target OS.
- RE-101. SB6141 reverse engineering. Uses CSA-110's Week 4 encoding fluency and Week 5 silicon instinct; the 32-bit instruction format makes RISC-V binaries easier to disassemble by hand than 6502 binaries.
Logistical notes
- No new hardware required. Students carry their Tang Primer 25K forward from CSA-101. Emphasize this at enrollment — it reduces the friction of signing up.
- Cohort pairing with CSA-101 cohorts. Because CSA-110 covers the same hardware (Tang Primer 25K) and the same OS destination (Virtus Console), it can share office-hours slots with CSA-101 in Weeks 1-7. After Week 7 the toolchain topics diverge.
- Capstone grading turnaround. 10 days from submission.
Week 8: VM I — Stack Arithmetic
Opening hook. Start with a demonstration: write push constant 7; push constant 8; add on the board. Ask: "Where does the result live?" Students who did CSA-102 will say a named memory location. Correct them: it lives at the top of a hardware stack — a pointer-managed region of BRAM.
Pacing table.
| Segment | Time | What to cover |
|---|---|---|
| Stack model + segment map | 40 min | Draw the stack diagram; name the six segment bases and their register anchors |
| Push/pop translation | 50 min | Walk through translate_push('constant', 7) line by line; build translate_pop together |
| Arithmetic and comparison | 40 min | Arithmetic: straightforward; comparison: true=-1 encoding needs explanation |
| Lab setup | 10 min | Confirm student can run translator.py SimpleAdd.vm before leaving |
Three common issues.
- Comparison true = -1, not 1. Students who expect true = 1 will pass simple tests but fail when the result of a comparison is used in and/or. Explain: -1 is all bits set; this makes and and or work on boolean results without special cases.
- Temp segment absolute address. Students sometimes use an offset from x18 for temp. Redirect: temp is at absolute BRAM address 0x0300; it is not relative to any base pointer.
- Label counter for comparisons. Students who forget the label counter will have label collisions in programs with multiple comparisons. The symptom: the second eq instruction jumps to the label from the first eq. Require the seeded-error drill.
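Two of these issues are easy to demonstrate without the board. The sketch below models 32-bit words in Python and shows one way the true = 1 encoding can go wrong under a bitwise and, plus a minimal label counter. The names vm_eq, vm_and, and LabelMaker are hypothetical, and the label format is an assumption rather than the course translator's.

```python
import itertools

MASK32 = 0xFFFFFFFF
TRUE = -1 & MASK32  # all 32 bits set: 0xFFFFFFFF
FALSE = 0


def vm_eq(a, b):
    """Comparison result using the true = -1 encoding."""
    return TRUE if a == b else FALSE


def vm_and(a, b):
    """Model the VM's and as a plain bitwise AND on 32-bit words."""
    return (a & b) & MASK32


class LabelMaker:
    """Hands out a fresh label per comparison so two eq commands never share one."""

    def __init__(self):
        self._counter = itertools.count()

    def fresh(self, stem):
        return f"{stem}_{next(self._counter)}"


flag = 2  # a nonzero value that is not itself a canonical boolean
print(vm_and(vm_eq(3, 3), flag) != 0)  # all-ones true keeps the nonzero flag
print((1 & flag) != 0)                 # a true = 1 encoding would lose it

labels = LabelMaker()
print(labels.fresh("EQ_TRUE"), labels.fresh("EQ_TRUE"))
```

Because the counter lives in one object shared by the whole translation, every comparison gets a distinct label even when the same command appears twice in a row.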
Petzold weave. Ch 17, "Automation," pp. 223-232. The VM is the automation layer: it lets the compiler write push constant 7 without knowing which register or memory address will hold the value. This is exactly Petzold's argument for why each abstraction layer is worth its complexity cost.
Lab 8 timing note. Lab 8.2 (SimpleAdd) takes 30-45 minutes. Lab 8.3 (StackTest) takes 60-90 minutes for students who encounter the comparison-encoding issue. Budget 30 minutes of buffer for that issue.
Week 9: VM II — Function Calls
Opening hook. Ask the student to trace what happens, instruction by instruction, when factorial(1) calls itself. Most students can answer for one level of recursion; almost none can answer correctly for two levels without drawing the stack. Draw it. The moment the student sees that each frame is independent is the moment the calling convention makes sense.
Pacing table.
| Segment | Time | What to cover |
|---|---|---|
| Program flow (label/goto/if-goto) | 30 min | Mechanical; students often get this in one pass |
| The frame layout diagram | 40 min | Draw the full frame; label every slot; ask the student to label the second frame when factorial calls itself |
| translate_call / translate_function | 50 min | Code walkthrough; emphasize ARG repositioning formula derivation |
| translate_return | 40 min | Why x18 restore must be last; trace through the restore sequence |
Three common issues.
- ARG repositioning off-by-one. The formula is sp + (5 + n_args) * 4. Students who write sp + 5 * 4 or sp + n_args * 4 will produce wrong ARGs for any non-trivial call. NestedCall.vm (Lab 9.3) catches this reliably.
- Return address saved from the wrong location. The return address is 5 slots above x18 (LCL), not 1 slot. Students who read the lecture pseudocode without tracing the frame diagram get this wrong. Require the frame layout drawing in the Toolchain Diary.
- x18 restored first instead of last. If x18 is restored before x19-x21, the restores of x21, x20, and x19 are addressed relative to the already-restored (caller's) x18 instead of the callee's frame, where the saved values live. The symptom: THIS and THAT have garbage values after return. Ask the student to hand-trace translate_return with x18 restored in each possible order.
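A quick table of the formula against its two usual mistakes helps when a student insists their version "works on my test". This sketch only evaluates the arithmetic from the first bullet; the function names and the sample stack pointer value are hypothetical.

```python
WORD = 4         # bytes per stack slot
SAVED_SLOTS = 5  # the constant 5 from the formula above


def arg_base(sp, n_args):
    """ARG repositioning as given above: sp + (5 + n_args) * 4."""
    return sp + (SAVED_SLOTS + n_args) * WORD


def arg_base_missing_args(sp, n_args):
    """Buggy variant: forgets the arguments."""
    return sp + SAVED_SLOTS * WORD


def arg_base_missing_saved(sp, n_args):
    """Buggy variant: forgets the five saved slots."""
    return sp + n_args * WORD


sp = 0x0400  # hypothetical stack pointer value
for n_args in range(3):
    print(
        n_args,
        hex(arg_base(sp, n_args)),
        hex(arg_base_missing_args(sp, n_args)),
        hex(arg_base_missing_saved(sp, n_args)),
    )
```

The first buggy variant agrees with the formula only when n_args is 0, which is why a zero-argument test program does not expose it.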
Petzold weave. Ch 25, "The World Brain," pp. 361-378. The calling convention you implement this week is the lowest rung of the abstraction ladder Petzold describes. Read the passage aloud with the student: "Each layer takes for granted the services of the layer below." The VM calling convention is what the compiler takes for granted.
Lab 9 timing note. Lab 9.4 (recursive factorial) is the capstone of the week. Allow 90 minutes. Students who get the ARG repositioning wrong will get incorrect output for factorial(5) — specifically, a value that is not 120 and not obviously wrong (it may be 15 or 30, depending on the bug). The diagnostic is always the frame-layout drawing.
Week 10: Compiler I — Syntax Analysis
Opening hook. Show the student let x = a + b * c; and ask: "What does the compiler see?" Then show them what the tokenizer produces (a flat list of tokens) and what the parser produces (a tree). The jump from flat list to tree is the entire content of the week.
Pacing table.
| Segment | Time | What to cover |
|---|---|---|
| Comment stripping + character classification | 30 min | Walk through _strip_comments regex; name the four character classes |
| The grammar: class/subroutine/statement | 60 min | Write the grammar on the board; circle the recursive rules |
| Recursive descent: one method per rule | 50 min | Live-code parse_class and parse_subroutine_dec together |
| Expression/term mutual recursion | 40 min | This is the hardest part; slow down; draw the call graph |
Three common issues.
- Off-by-one in expression parsing. The expression rule is term (op term)*. Students who write while self.tok.peek().value in ops without checking has_more() will crash on end-of-file. Require the seeded-error drill (Lab 10.4).
- String literals not handled. Students skip string literals in the tokenizer because "Square.vl doesn't have any strings." Then Lab 11.3 (Counter.vl with Output.printInt) fails. Require a string-containing test early.
- Mutual recursion surprises. parse_expression calls parse_term, which calls parse_expression for parenthesized expressions. Students who are not expecting this are surprised when a deeply-nested expression causes a Python recursion limit. Not a real issue for the programs in this course; mention it and move on.
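The has_more() guard and the expression/term mutual recursion can both be seen in a stripped-down parser. This is a sketch under simplifying assumptions: tokens are plain strings rather than objects with a .value field, the operator set is a guess at a typical one, and operators are applied left to right with no precedence. The Tokens class is hypothetical.

```python
import re

OPS = set("+-*/&|<>=")


class Tokens:
    """Minimal token stream with the has_more/peek/advance interface."""

    def __init__(self, source):
        self._toks = re.findall(r"\d+|\w+|[^\s\w]", source)
        self._pos = 0

    def has_more(self):
        return self._pos < len(self._toks)

    def peek(self):
        return self._toks[self._pos]

    def advance(self):
        tok = self._toks[self._pos]
        self._pos += 1
        return tok


def parse_expression(toks):
    """expression: term (op term)*"""
    node = parse_term(toks)
    # Checking has_more() first keeps peek() from running off the end of input.
    while toks.has_more() and toks.peek() in OPS:
        op = toks.advance()
        node = (op, node, parse_term(toks))
    return node


def parse_term(toks):
    """term: identifier | number | '(' expression ')'"""
    tok = toks.advance()
    if tok == "(":
        node = parse_expression(toks)  # mutual recursion back into expression
        if toks.advance() != ")":
            raise SyntaxError("expected ')'")
        return node
    return tok


print(parse_expression(Tokens("a + b * c")))
```

Dropping the has_more() check reproduces the end-of-file crash on any input whose last token is a term, which makes it a convenient seeded error.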
Petzold weave. Ch 24, "Languages High and Low," pp. 337-360. The tokenizer and parser are the first two steps of the translation Petzold describes. The student is now implementing the machinery Petzold describes abstractly.
Lab 10 timing note. Lab 10.3 (parse Square.vl and diff against reference) is the gating exercise. Expect 60-90 minutes. Every student discovers at least one grammar rule they misread the first time; the diff is the diagnostic.
Week 11: Compiler II — Code Generation
Opening hook. Ask the student to write, by hand, the VM bytecode for let x = a + b; where x is a local variable and a, b are instance fields. Most students get the push/pop direction wrong for at least one variable. Correct them using the symbol table: the kind determines the segment; the index determines the offset.
Pacing table.
| Segment | Time | What to cover |
|---|---|---|
| SymbolTable: two scopes, define, lookup | 40 min | Live-code the two-dict structure; demonstrate shadow behavior |
| CodeGen: compile_class and compile_subroutine | 50 min | Constructor/method preamble difference is the key point |
| compile_let / compile_if / compile_while | 50 min | If and while use the same negated-condition branch pattern |
| Subroutine calls: method vs function vs constructor | 40 min | The push pointer 0 before method calls; the call Memory.alloc 1 for constructors |
Three common issues.
- Symbol-table scope bug. A local variable with the same name as a class field shadows the field correctly at lookup, but students often define them in the wrong scope (define a local in the class scope, or define a field in the subroutine scope). The symptom: this.x and x compile to the same segment. Ghidra (Lab 11.5) is the prescribed diagnostic.
- Void return value. Methods that return void still need a push constant 0; return in the VM. Students who omit the push leave garbage on the stack, which the caller then discards with pop temp 0. Correct in isolation; catastrophic in composition.
- Constructor preamble omitted. Students who forget push constant n_fields; call Memory.alloc 1; pop pointer 0 produce constructors that write this fields to address 0. The symptom: a segfault-equivalent (writing to BRAM address 0).
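The scope bug is easiest to spot against a known-good two-dict table. The sketch below is one plausible shape, not the course's SymbolTable: the kind names (static, field, argument, local) and the start_subroutine method are assumptions.

```python
class SymbolTable:
    """Two scopes: class-level (static, field) and subroutine-level (argument, local)."""

    CLASS_KINDS = {"static", "field"}

    def __init__(self):
        self._class_scope = {}
        self._sub_scope = {}
        self._counts = {}

    def start_subroutine(self):
        # A new subroutine gets an empty scope and fresh argument/local indices.
        self._sub_scope = {}
        self._counts["argument"] = 0
        self._counts["local"] = 0

    def define(self, name, type_, kind):
        # The kind alone decides the scope; this is where the common bug hides.
        scope = self._class_scope if kind in self.CLASS_KINDS else self._sub_scope
        index = self._counts.get(kind, 0)
        self._counts[kind] = index + 1
        scope[name] = (type_, kind, index)

    def lookup(self, name):
        # Subroutine scope first, so a local shadows a field of the same name.
        return self._sub_scope.get(name) or self._class_scope.get(name)


table = SymbolTable()
table.define("x", "int", "field")
table.start_subroutine()
table.define("x", "int", "local")
print(table.lookup("x"))  # the local shadows the field inside this subroutine
table.start_subroutine()
print(table.lookup("x"))  # a new subroutine sees the field again
```

If a student's table returns the same entry for both lookups, the define call is writing to the wrong dict.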
Petzold weave. Ch 22, "The Operating System," pp. 295-332. The code generator emits calls to Memory.alloc for constructors: calls to an OS that has not been written yet. This is the moment the student understands why the OS and the compiler are designed together.
Lab 11 timing note. Lab 11.3 (Counter.vl end-to-end) is the first time the full stack runs: VirtusLang source → VM → assembly → simulation → correct output. Allow 90-120 minutes for students who need to debug the symbol table. Lab 11.5 (Ghidra) requires 30-45 minutes of setup if the student has not used Ghidra before.
Week 12: Compiler III — OS-Aware Compilation
Opening hook. Ask: "What happens when your compiler emits call Output.printInt 1 and there is no Output.printInt in the compiled VM?" Walk through the resolution chain: compiler emits the symbol; VM translator leaves it unresolved; assembler records it in the .reloc table; linker resolves it against the Virtus OS stdlib. This is the same kind of chain by which a Linux binary's library calls end up resolved against libc.so.6.
Pacing table.
| Segment | Time | What to cover |
|---|---|---|
| Multi-file compilation: driver pattern | 30 min | compiler.py with glob; why file order matters (or doesn't) |
| OS call signatures and argument counting | 40 min | OS_SIGNATURES dict; trace an Output.printInt(x) call through the compiler |
| End-to-end pipeline: 3 files → binary | 50 min | Walk through the shell script; emphasize each stage's role |
| HDMI vs UART output path | 20 min | HDMI: framebuffer write; UART: character-art escape sequence |
Three common issues.
- Hardcoded paths in compiler.py. Students who use ../virtus-os/ as a hardcoded path will fail the Tier 1 gate if the grader runs from a different directory. Require path arguments from the command line.
- Wrong argument count for OS calls. A student who miscounts arguments for Screen.drawRectangle (4 arguments) will produce a VM call with the wrong n_args. The linker resolves the symbol; the VM calling convention passes the wrong frame size; the result is a corrupted stack. This is a hard bug to find without the OS_SIGNATURES dict.
- Object file order sensitivity. The linker places sections in command-line order. If the student's program assumes Main.vof is linked before Ball.vof, a different order will produce a wrong binary. Require the student to sort input files explicitly.
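The argument-count check and the explicit sort are both a few lines each. The sketch below assumes OS_SIGNATURES maps a routine name to its argument count, which is one plausible shape for the dict; only the three counts mentioned in this guide are filled in, and emit_os_call and ordered_inputs are hypothetical helper names.

```python
# Hypothetical contents: only the counts named in this guide are listed.
OS_SIGNATURES = {
    "Output.printInt": 1,
    "Memory.alloc": 1,
    "Screen.drawRectangle": 4,
}


def emit_os_call(name, arg_exprs):
    """Return the VM call line, refusing to emit a call with the wrong n_args."""
    if name not in OS_SIGNATURES:
        raise KeyError(f"unknown OS routine: {name}")
    expected = OS_SIGNATURES[name]
    if len(arg_exprs) != expected:
        raise ValueError(
            f"{name} expects {expected} argument(s), got {len(arg_exprs)}"
        )
    return f"call {name} {expected}"


def ordered_inputs(paths):
    """Sort object files so link order does not depend on the shell or glob."""
    return sorted(paths)


print(emit_os_call("Screen.drawRectangle", ["x1", "y1", "x2", "y2"]))
print(ordered_inputs(["Main.vof", "Ball.vof"]))
```

Failing at compile time with a named routine and both counts turns a corrupted-stack hunt into a one-line error message.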
Petzold weave. Ch 25, "The World Brain," pp. 361-378, second visit. The complete abstraction ladder is now visible: source → tokens → tree → VM → assembly → binary → silicon. Petzold's hierarchy from relay to transistor to gate to chip to CPU to OS to language maps exactly onto what the student has built.
Lab 12 timing note. Lab 12.2 (compile and link Pong) is the validation exercise. Expect 60-90 minutes. Most time is spent debugging one stage of the pipeline; the diagnostic is to run each stage independently and verify its output before proceeding to the next.
Week 13: Virtus OS
Opening hook. Ask: "Every call Output.printInt your compiler has emitted for three weeks — who implements printInt?" The answer is: you do, this week.
Pacing table.
| Segment | Time | What to cover |
|---|---|---|
| Math.multiply (shift-and-add) | 40 min | Trace through twoToThe table; explain the O(log N) vs O(N) complexity |
| Memory.alloc (first-fit free list) | 50 min | Draw the free-list structure; trace alloc, then deAlloc, then alloc again |
| Output (char bitmap, cursor) | 30 min | The font table + printChar are the whole story |
| Screen, Keyboard, GamePad, Sys | 40 min | These are simpler; spend more time on the VCP integration setup |
| Compliance suite | 20 min | Walk through the suite command and expected output |
Three common issues.
- Math.divide with x < 0. The recursive divide implementation in the lecture handles positive x only. Students who do not handle the sign need to wrap in Math.abs guards. The compliance suite tests Math.divide(-10, 3).
- Memory.alloc on a full heap. Students who do not call Sys.error(6) on allocation failure will return -1, which the caller will use as a valid pointer. The compliance suite tests this by exhausting the heap.
- Virtus OS copied verbatim from CSA-101 without RV32I-Lite adaptation. Permitted if clearly attributed. The most common adaptation failure is the top entry of the twoToThe table: 2^15 = 32768 overflows a 16-bit signed integer (maximum 32767), so the largest power of two that fits is twoToThe[14] = 16384. Students who used a 16-bit target in CSA-101 already know this; students who used a wider integer may not.
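For the sign-handling issue, it helps to have a reference model to compare student output against. The sketch below uses one common recursive formulation of positive division wrapped in abs guards; it is not necessarily the lecture's exact code, and it assumes the quotient truncates toward zero, which should be checked against the compliance suite's expected value.

```python
def _divide_positive(x, y):
    """Recursive division for x >= 0 and y > 0."""
    if y > x:
        return 0
    q = _divide_positive(x, 2 * y)
    # q is the quotient for 2*y; decide whether one more y fits in the remainder.
    if x - 2 * q * y < y:
        return 2 * q
    return 2 * q + 1


def divide(x, y):
    """Signed division built from the positive-only routine plus abs guards.

    Assumption: the result truncates toward zero.
    """
    if y == 0:
        raise ZeroDivisionError("division by zero")
    q = _divide_positive(abs(x), abs(y))
    return -q if (x < 0) != (y < 0) else q


print(divide(-10, 3))
```

Note that Python's own // operator floors rather than truncates, so -10 // 3 is not a valid oracle for this routine under the stated assumption.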
Petzold weave. Ch 22, "The Operating System," pp. 295-332, final visit. Read the Petzold passage on memory management with the student. Then show the Memory.alloc free-list diagram side by side with Petzold's description. The Virtus OS is a complete implementation of the OS that Petzold describes abstractly.
Lab 13 timing note. Lab 13.1 (Math.vl + Memory.vl compliance) is the gate: do not let the student proceed to Lab 13.2 until 13/13 pass. The compliance suite is fast (< 2 minutes per run); students can iterate quickly. Lab 13.4 (VCP integration) requires the hardware; schedule hardware office hours before the lab due date.
Week 14: Capstone
Opening hook. No lecture needed. This is the week the student ships what they built. Begin with one question: "Does your CPU produce sum-to-N = 55 in simulation?" If the answer is no, address it before anything else.
Pacing table.
| Day | Instructor check-in | What to ask |
|---|---|---|
| 1 | CPU health | "Sum-to-N = 55 in simulation?" |
| 3 | Tier 1 gate dry run | "Run it from /tmp and paste me the sha256sum output" |
| 5 | Write-up Section 4 | "Read me your first CSA-101 comparison point" |
| 7 | Final submission | "Unzip it in a clean directory and run the gate" |
Three common issues.
- Tier 1 gate SHA-256 mismatch due to non-deterministic output. Assembler or linker that processes sets instead of lists may produce different output on different runs. Require deterministic ordering everywhere.
- Demo video shows only a terminal with no program output. Redirect the student to record actual hardware output (UART or HDMI) rather than a terminal showing a build command.
- Section 4 write-up is generic. "RV32I-Lite is RISC" is not a comparison. Ask the student to cite a specific program, a specific instruction count, or a specific failure mode. If they cannot, have them open their Toolchain Diary and read their Week 4 or Week 9 entries — the specific observations are there.
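The non-determinism issue usually comes down to iteration order. Python randomizes string hashes per process by default, so iterating a set of symbol names can yield a different order on each run; sorting before emitting removes the dependence. The sketch below uses a made-up symbol-table text format and made-up addresses purely for illustration.

```python
import hashlib


def serialize_symbols(symbols):
    """Emit one 'name address' line per symbol, in sorted order, as bytes."""
    lines = [f"{name} {address:08x}" for name, address in sorted(symbols.items())]
    return ("\n".join(lines) + "\n").encode("ascii")


# Same symbols, different insertion order (as could happen across two runs).
run_a = {"Main.main": 0x0000, "Ball.new": 0x0040, "Math.multiply": 0x0100}
run_b = {"Math.multiply": 0x0100, "Main.main": 0x0000, "Ball.new": 0x0040}

digest_a = hashlib.sha256(serialize_symbols(run_a)).hexdigest()
digest_b = hashlib.sha256(serialize_symbols(run_b)).hexdigest()
print(digest_a == digest_b)
```

The same rule applies anywhere the toolchain walks a collection on its way to emitting bytes: relocation entries, section lists, and input file lists.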
Capstone Tier 1 gate check-in protocol.
Per student, in the last office hours before submission close:
- Have the student unzip their submission into /tmp/{name}-check/
- Run the gate command from toolchain/:

      python3 compiler/compiler.py ../program/Main.vl -o /tmp/main.vm && \
      python3 vm-translator/translator.py /tmp/main.vm -o /tmp/main.s && \
      python3 assembler/asm.py /tmp/main.s -o /tmp/main.vof && \
      python3 linker/linker.py /tmp/main.vof virtus-os/*.vof -o /tmp/prog.bin && \
      sha256sum /tmp/prog.bin

- Compare to program-built/program.bin.sha256
- If mismatch: ask the student to run cmp program-built/program.bin /tmp/prog.bin, which reports the offset of the first differing byte, and trace the divergence from there
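When a shell is not convenient, the same mismatch check can be done in Python. This is a small helper sketch, not part of the course toolchain; the function names are hypothetical, and the caller is expected to read the two binaries into bytes first.

```python
import hashlib


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def first_divergent_byte(a, b):
    """Offset of the first differing byte, or None if the inputs are identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One input is a strict prefix of the other.
        return min(len(a), len(b))
    return None


expected = bytes([0x63, 0x84, 0x20, 0x00, 0x13])
rebuilt = bytes([0x63, 0x84, 0x20, 0x01, 0x13])
print(sha256_hex(expected) == sha256_hex(rebuilt))
print(first_divergent_byte(expected, rebuilt))
```

An offset that is a multiple of 4 points at a whole instruction word, which narrows the search to a single emitted instruction or relocation.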
Instructor guide v0.2. Full Week 8-14 notes added 2026-05-30.