You built a CPU. You built a compiler. You built an OS. This week you demonstrate it.
Reading
No new reading this week. Revisit:
CSA-101 week 14 (bridge talk notes). The "what's next" section from your CSA-101 capstone named the costs CSA-201 would recover. Compare that list against what you actually built. How many items did you close?
CAPSTONE.md (this course). Full rubric. Read it end to end before you begin integration work. Know the six Tier 1 gates before you start.
CONTINUATION.md (now primarily historical reference). The continuation note listed what week-7 through week-14 needed to deliver. Your build filled those files.
Lecture: Integration, Validation, and Delivery
What the capstone is asking
The capstone is not a new implementation task. It is an integration and demonstration task. Every component was built in Modules 1-13. The capstone asks: do they work together? Can you demonstrate each of the six Tier 1 gates on real silicon?
The integration work is non-trivial. In the lab, each module was developed and tested in isolation. The full system has more interactions than any single module test covered:
- The MMU's page-fault handler allocates a new page via Memory.alloc. Does the GC run correctly if triggered during a page-fault handler?
- The scheduler context-switches between two processes. Does SFENCE.VMA run correctly when switching from Process A's page table to Process B's?
- The SSD1306 driver uses I2C, which requires precise timing. Does the timer interrupt (scheduler tick) corrupt the I2C bit-bang timing if it fires mid-transaction?
These interactions are the integration bugs. Finding and fixing them is the real work of week 14.
The six-gate checklist
Work through the gates in order. Each gate depends on the previous ones being stable.
Gate 1: OS boots. Cold-power the DE10-Nano. Start the stopwatch. The OLED shows the boot message. Stop the stopwatch. Under 10 seconds? Gate 1 passes.
Common issues:
- The bitstream synthesizes from a clean Quartus project but flashing fails. Reprogram with the USB-Blaster II and confirm the programmer reports the bitstream 100% loaded.
- The boot message takes more than 10 seconds because DDR3 initialization (via HPS) is slow. Fix: display the boot message before HPS DDR3 initialization completes. The OLED I2C driver runs from on-chip BRAM and does not need DDR3.
Gate 2: U/S transition demonstrated. Add a SignalTap tap on the priv register (2-bit signal). Set trigger: priv == 2'b00 (U-mode). Run for 2 seconds. The capture should show: priv transitions from 2'b00 to 2'b11 (U->M) at the ECALL instruction, then 2'b11 to 2'b00 (M->U) at MRET.
Note: if you implemented S-mode (optional extension), the gate accepts U->M or U->S. Document which privilege levels your OS uses.
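Before you attach a capture to the write-up, it can help to check the sampled priv values mechanically. The sketch below is a host-side helper, not part of the course tooling: it assumes you have already exported the SignalTap samples of priv into a plain list of integers (the export step and the sample values here are hypothetical). It reports every transition and checks for a U-mode to trap-mode change followed by a return to U-mode.

```python
# Privilege encodings from the RISC-V privileged specification.
U_MODE, S_MODE, M_MODE = 0b00, 0b01, 0b11

def priv_transitions(samples):
    """Return (index, old, new) for every change in a list of priv samples."""
    return [
        (i, prev, cur)
        for i, (prev, cur) in enumerate(zip(samples, samples[1:]), start=1)
        if prev != cur
    ]

def shows_trap_round_trip(samples, trap_modes=(M_MODE, S_MODE)):
    """True if the capture has U -> trap mode, then later trap mode -> U."""
    entered = None
    for _, old, new in priv_transitions(samples):
        if old == U_MODE and new in trap_modes:
            entered = new
        elif entered is not None and old == entered and new == U_MODE:
            return True
    return False

# Hypothetical capture: U-mode, ECALL traps to M-mode, MRET returns to U-mode.
capture = [0b00, 0b00, 0b11, 0b11, 0b11, 0b00, 0b00]
print(priv_transitions(capture))        # [(2, 0, 3), (5, 3, 0)]
print(shows_trap_round_trip(capture))   # True
```

Because trap_modes includes S-mode, the same check accepts a U->S capture if you implemented the optional extension.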
Gate 3: Page-fault handler running. Write a 20-line user-mode test program: access an unmapped virtual address (e.g., 0x80000000, which is outside the initial page table). The page-fault handler should either: (a) map the page on demand and resume the instruction, or (b) log a message to the OLED and kill the process cleanly.
Common issue: the page-fault handler itself faults (it accesses an unmapped kernel address). The kernel page table must be complete before the user page table is installed.
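To see why a given address faults, split it into its page-table indices by hand. The sketch below assumes Sv32 paging (the two-level scheme defined for RV32) with 4 KiB pages; if your MMU uses a different scheme, the field widths change. The toy page table and the user text address 0x00010000 are illustrative assumptions, not values from CAPSTONE.md.

```python
PAGE_SHIFT = 12   # 4 KiB pages
VPN_BITS = 10     # Sv32: two 10-bit virtual page number fields

def sv32_split(va):
    """Split a 32-bit virtual address into (vpn1, vpn0, offset) per Sv32."""
    assert 0 <= va < (1 << 32), "Sv32 virtual addresses are 32 bits"
    offset = va & ((1 << PAGE_SHIFT) - 1)
    vpn0 = (va >> PAGE_SHIFT) & ((1 << VPN_BITS) - 1)
    vpn1 = (va >> (PAGE_SHIFT + VPN_BITS)) & ((1 << VPN_BITS) - 1)
    return vpn1, vpn0, offset

def is_mapped(root_table, va):
    """Walk a toy two-level table: dict of vpn1 -> dict of vpn0 -> page number."""
    vpn1, vpn0, _ = sv32_split(va)
    return vpn0 in root_table.get(vpn1, {})

# Hypothetical initial user page table: one text page at VA 0x00010000.
toy_table = {0: {0x10: 0x400}}

print(sv32_split(0x80000000))            # (512, 0, 0)
print(is_mapped(toy_table, 0x00010004))  # True
print(is_mapped(toy_table, 0x80000000))  # False: this access page-faults
```

The access to 0x80000000 selects root entry 512, which the toy table never populated, so the walk fails at the first level.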
Gate 4: PMP W^X enforced. Write a test program that calls SYS_MMAP to get a new user page (read/write, not executable). Write 4 bytes to it. Then attempt to branch to the written address. Verify that mcause=1 (instruction access fault) is raised on the fetch from the data page, before any of the written bytes execute. The SignalTap capture should show pmp_fault asserted at the fetch to the data page.
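The expected outcome of the Gate 4 test can be worked out on paper first. The following is a simplified model of a PMP permission check, not your RTL: regions are plain (base, size, permissions) tuples rather than real pmpaddr/pmpcfg encodings, and the two example regions and their addresses are made up for illustration. The permission bit positions and the mcause codes are the ones defined by the RISC-V privileged specification.

```python
PMP_R, PMP_W, PMP_X = 0x1, 0x2, 0x4   # pmpcfg permission bits

# mcause exception codes for access faults.
FAULT_CAUSE = {"fetch": 1, "load": 5, "store": 7}
NEEDED_BIT = {"fetch": PMP_X, "load": PMP_R, "store": PMP_W}

def pmp_check(regions, addr, access):
    """Return None if the access is allowed, else the mcause code it raises.

    The first matching region wins, mirroring PMP priority order. An
    address that matches no region is denied, as it is for U-mode when
    at least one PMP entry is implemented.
    """
    for base, size, perms in regions:
        if base <= addr < base + size:
            return None if perms & NEEDED_BIT[access] else FAULT_CAUSE[access]
    return FAULT_CAUSE[access]

def violates_wx(regions):
    """List regions that are both writable and executable."""
    return [r for r in regions if r[2] & PMP_W and r[2] & PMP_X]

# Hypothetical physical regions: user text (R+X) and the SYS_MMAP page (R+W).
regions = [
    (0x0001_0000, 0x4000, PMP_R | PMP_X),
    (0x0002_0000, 0x1000, PMP_R | PMP_W),
]

print(pmp_check(regions, 0x0002_0000, "store"))  # None: the write succeeds
print(pmp_check(regions, 0x0002_0000, "fetch"))  # 1: instruction access fault
print(violates_wx(regions))                      # []
```

If violates_wx ever returns a non-empty list for your real configuration, Gate 4 cannot pass regardless of what the test program does.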
Gate 5: Round-robin scheduler. Task A increments a counter and writes "TASK A:
Gate 6: SSD1306 OLED. The display must show the OS version string, both process names (Task A and Task B), and at least one live-changing value. Verify against the OLED; a dark or flickering display fails this gate.
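A dark display is often an addressing problem rather than a driver bug. The SSD1306 answers on 7-bit I2C address 0x3C or 0x3D depending on its address-select pin, and a bit-banged driver must put the shifted address byte on the wire. The sketch below works through that arithmetic; the function names are illustrative and do not come from your driver.

```python
SSD1306_BASE_ADDR = 0x3C   # 7-bit address with the select pin low

def ssd1306_addr(select_high):
    """7-bit I2C address for the given state of the address-select pin."""
    return SSD1306_BASE_ADDR | (1 if select_high else 0)

def i2c_address_byte(addr7, read=False):
    """First byte on the wire: 7-bit address shifted left, R/W flag in bit 0."""
    return (addr7 << 1) | (1 if read else 0)

for pin in (False, True):
    addr = ssd1306_addr(pin)
    print(hex(addr), hex(i2c_address_byte(addr)))
# 0x3c 0x78
# 0x3d 0x7a
```

If your driver sends 0x3C as the first byte instead of 0x78, the display never acknowledges and stays dark.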
Write-up advice
Section 1 (architecture): draw a real diagram of your address space layout. Physical addresses on the left: BRAM region, DDR3 region. Virtual addresses on the right: user text, user stack, user heap, kernel map. Label which PMP entries cover which physical regions.
Section 2 (what each gate required): be specific. "Gate 4 required adding u_pmp_fetch to the CPU's fetch pipeline stage and wiring pmpcfg_packed from the csrfile to the PMP unit" is better than "I implemented PMP."
Section 4 (what does not work): every real capstone has gaps. Common ones:
- ENC28J60 driver is initialized but does not successfully send/receive packets (SPI timing issue).
- The GC stop-the-world has a race condition: the scheduler tick fires during the mark phase.
- The FAT16 walker reads files but does not handle long filenames.
- The shadow stack CFI is implemented but not enabled because it interferes with the GC stack scan.
Name these clearly. A specific known limitation grades better than "everything works" that a grader then disproves.
The bridge talk
In CSA-101 week 14, the bridge talk previewed CSA-201. In CSA-201 week 14, the bridge talk points forward to the Part-II electives.
VCA-ARM-201. The same OS architecture you built this week, but targeting the DE10-Nano's HPS Cortex-A9 (ARMv7-A, 32-bit). The HPS has hardware floating-point, out-of-order execution, and a 32 KB L1 / 512 KB L2 cache hierarchy. Your Virtus OS v2 design maps almost directly.
VCA-EMB-201. Take the SSD1306, SD card, and ENC28J60 drivers you wrote and build a real embedded application: a network-connected data logger that reads from a sensor, writes to the SD card, and transmits over Ethernet. The driver skills from Module 12 are the direct prerequisite.
VCA-NET-201. The ENC28J60 driver you built is a raw Ethernet interface. VCA-NET-201 adds the full network stack above it: IP, TCP, UDP, ARP. The OS's driver architecture is the foundation.
VCA-X86-201. The same privilege mode separation, virtual memory, and PMP concepts you implemented in RISC-V, now applied to x86_64. The conceptual architecture is identical; the encoding is more complex.
Lab: Capstone integration
See CAPSTONE.md for the full specification. There is no separate lab file for Module 14; the capstone IS the lab.
Pre-submission checklist:
- All six Tier 1 gates demonstrated on DE10-Nano silicon (not just in simulation)
- SignalTap captures for Gates 2, 4 (mode transitions, PMP fault)
- OLED showing at least three live values simultaneously
- mcycle measurements present for: mul speedup, trap round-trip, context-switch, GC cycle
- Demo video recorded (3-5 minutes)
- Write-up: all six sections present, Section 4 (what doesn't work) is honest
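Raw mcycle deltas are easier to compare in the write-up once they are converted to time and ratios. The helper below is a minimal sketch of that arithmetic. The 50 MHz clock and every cycle count in the example are placeholder assumptions; substitute your own core clock and measurements. The width parameter covers the case where you read only the low 32 bits of the counter.

```python
def mcycle_delta(start, end, width=64):
    """Cycles elapsed between two mcycle reads, tolerating one counter wrap."""
    return (end - start) % (1 << width)

def cycles_to_us(cycles, clock_hz):
    """Convert a cycle count to microseconds at the given clock frequency."""
    return cycles * 1_000_000 / clock_hz

def speedup(baseline_cycles, improved_cycles):
    """Ratio of baseline cost to improved cost (greater than 1 means faster)."""
    return baseline_cycles / improved_cycles

CLOCK_HZ = 50_000_000   # assumed core clock; replace with your design's value

# Placeholder numbers, not measured results.
trap_cycles = mcycle_delta(0xFFFF_FFF0, 0x10, width=32)   # read wrapped once
print(trap_cycles)                          # 32
print(cycles_to_us(5000, CLOCK_HZ))         # 100.0
print(speedup(340, 34))                     # 10.0
```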
Common last-minute issues:
- USB-Blaster II driver not recognized on Linux: run sudo apt-get install libudev1 libudev-dev and re-plug.
- OLED doesn't display after power cycle: I2C address conflict (check whether A0 pin on SSD1306 module is grounded; selects between 0x3C and 0x3D).
- Timer interrupt fires during I2C transaction, corrupts bit-bang timing: disable interrupts (csrc mstatus, 8 to clear MIE) around I2C byte transmit sequences; re-enable after.
- Memory corruption after adding DDR3 heap: page table not updated to map new DDR3 physical pages; kernel accesses DDR3 address that PMP blocks.
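One detail in the interrupt-masking fix is easy to get wrong: re-enabling interrupts unconditionally after the byte will turn them on even if the caller had them off. The toy model below illustrates the save-and-restore pattern on a simulated mstatus register. The Hart class and the function names are hypothetical, for illustration only; on hardware the same idea is a CSR read-and-clear followed by a conditional set.

```python
MSTATUS_MIE = 1 << 3   # bit 3 of mstatus, i.e. the value 8 used to clear MIE

class Hart:
    """Toy model of one hart's mstatus register."""

    def __init__(self, mstatus=MSTATUS_MIE):
        self.mstatus = mstatus

    def read_and_clear(self, mask):
        """Clear the masked bits and return the previous register value."""
        old = self.mstatus
        self.mstatus &= ~mask
        return old

    def set_bits(self, mask):
        self.mstatus |= mask

def i2c_send_byte_atomic(hart, send_byte, value):
    """Mask interrupts for one byte, then restore MIE only if it was set."""
    saved = hart.read_and_clear(MSTATUS_MIE)
    try:
        send_byte(value)
    finally:
        hart.set_bits(saved & MSTATUS_MIE)

hart = Hart()
seen = []
# The fake send_byte records the byte and whether MIE was set during the send.
i2c_send_byte_atomic(hart, lambda v: seen.append((v, hart.mstatus & MSTATUS_MIE)), 0xAF)
print(seen)                          # [(175, 0)]
print(hart.mstatus & MSTATUS_MIE)    # 8
```

Keep the masked window to a single byte where you can: the longer MIE stays clear, the later the scheduler tick is serviced.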
Independent practice (integration week)
- Stress test Gate 5 (scheduler): create a third task, Task C, that does nothing except call SYS_YIELD in a loop. Verify that Task A and Task B's counter advancement rates are unchanged (Task C does not starve them and does not run more often than it should in round-robin).
- Measure the E2E GC overhead in the full system: allocate 200 objects from Task A's process, drop references to 100, and call SYS_GC. Record the time from SYS_GC entry to return, as seen by Task B's timer (Task B should observe a scheduling gap during stop-the-world GC).
- Toolchain Diary final entry: riscv32-unknown-elf-size. Run it on your final Virtus OS v2 kernel binary. Record the text, data, and bss sizes. Compare against Virtus OS v1 from CSA-101.
- Document one integration bug you found and fixed during capstone week. What was the symptom? What was the root cause? What was the fix? This is Section 5 material (what surprised you) but write it before the demo so you don't forget the details.
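For the Task C stress test, it helps to know what a correct round-robin scheduler should produce before you measure. The simulation below is a rough host-side model under stated assumptions: a fixed quantum of 10 time units, a yield that costs 1 unit, and no other overhead. None of these numbers come from your kernel. It suggests that Task A and Task B should stay equal to each other, and that their rates should drop only by the small cost of Task C's yields.

```python
from collections import Counter

def run_round_robin(tasks, total_ticks, quantum=10, yield_cost=1):
    """Simulate a round-robin scheduler for total_ticks time units.

    tasks maps a task name to "busy" (uses its whole quantum) or
    "yield" (calls SYS_YIELD at once, costing yield_cost units).
    Returns the time units each task spent running.
    """
    order = list(tasks)
    ran = Counter()
    now, turn = 0, 0
    while now < total_ticks:
        name = order[turn % len(order)]
        cost = quantum if tasks[name] == "busy" else yield_cost
        cost = min(cost, total_ticks - now)   # do not run past the window
        ran[name] += cost
        now += cost
        turn += 1
    return ran

WINDOW = 4620   # divisible by both round lengths (20 and 21 units)
two = run_round_robin({"A": "busy", "B": "busy"}, WINDOW)
three = run_round_robin({"A": "busy", "B": "busy", "C": "yield"}, WINDOW)

print(two["A"], two["B"])                   # 2310 2310
print(three["A"], three["B"], three["C"])   # 2200 2200 220
```

In this model Task C takes one turn per round and A and B each keep 10/21 of the CPU instead of 10/20. A much larger drop on hardware, or unequal rates for A and B, would point to a scheduler bug.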
Reflection prompts
- CSA-101 asked: "what does your toolchain do?" CSA-201 asks: "what does your OS protect and what does it cost?" Write a two-paragraph answer to the CSA-201 question based on your capstone experience. Be specific about which protection mechanisms are hardware-enforced (PMP, MMU) vs software-enforced (canaries, shadow stack) and what the cycle overhead of each is.
- The capstone requires Virtus OS v2 to be ~4,000 lines vs CSA-101's ~1,500. In CSA-201, which modules added the most lines? Which modules produced the most functionality per line of code? What does this suggest about the right level of abstraction for OS services?
- The bridge talk points forward to four Part-II electives. If you could only take one, which would it be and why? Your answer should reference something specific you built or measured in CSA-201 that makes you most curious about that direction.
Congratulations
You built a CPU from gates, a compiler from tokens, and an OS from trap handlers. The stack is complete. Every layer from logic gates to Ethernet packet is code you wrote or hardware you wired.
That is what CSA-201 was for.