Classroom Glossary Public page

Week 13: Virtus OS

880 words

Every call to Output.printInt, Screen.drawPixel, or Memory.alloc that your compiler has been emitting for three weeks resolves to a function you write this week. The Virtus OS is nine service classes written in VirtusLang (compiled with your own compiler), assembled with your assembler, and linked against user programs by your linker. When the Tang Primer 25K boots your Week 12 Pong binary with a complete Virtus OS, every layer of the stack is software you wrote — from the NAND gates in Week 1 to the game loop calling the OS.


Reading

  • Petzold Ch 22 ("Operating Systems," pp. 295-332) final visit: Read the full chapter. Petzold describes the complete operating system concept — memory management, I/O, process management. The Virtus OS implements a subset: memory management (Memory), I/O (Output, Screen, Keyboard, GamePad), and system control (Sys). The concepts are identical even though Virtus OS does not support multitasking. (~37 pages)

Lecture

3 hours. Key arc:

The nine services. The Virtus OS comprises nine required services:

Service Responsibility Key methods
Math Integer arithmetic multiply, divide, sqrt, abs, max, min
String String creation and manipulation new, dispose, length, charAt, appendChar, intValue, setInt
Array Heap-allocated arrays new, dispose
Memory Heap memory management alloc, deAlloc, peek, poke
Output Text output to display printChar, printString, printInt, println, moveCursor
Screen Pixel graphics drawPixel, drawLine, drawRectangle, drawCircle, clearScreen, setColor
Keyboard Keyboard input keyPressed, readChar, readLine, readInt
GamePad Gamepad input (Virtus-specific) buttonPressed, readAxes
Sys System control and timing halt, error, wait

Math.multiply — the canonical example. The correct implementation uses shift-and-add, not repeated addition, to get O(log N) complexity:

// Math.multiply (VirtusLang source)
// Returns x * y without using a hardware multiplier.
// This is the same algorithm your Lab 9.5 VM implementation used.
class Math {
    static Array twoToThe;  // twoToThe[i] = 2^i, precomputed

    function void init() {
        let twoToThe = Array.new(16);
        let twoToThe[0]  = 1;
        let twoToThe[1]  = 2;
        let twoToThe[2]  = 4;
        let twoToThe[3]  = 8;
        // ... (continue through twoToThe[15] = 16384)
        let twoToThe[15] = 16384;
        return;
    }

    function int multiply(int x, int y) {
        var int result, shiftedX, j;
        let result   = 0;
        let shiftedX = x;
        let j        = 0;
        while (j < 16) {
            // if (y & 2^j) != 0: add shiftedX to result
            if (~((y & twoToThe[j]) = 0)) {
                let result = result + shiftedX;
            }
            let shiftedX = shiftedX + shiftedX;  // shiftedX *= 2
            let j = j + 1;
        }
        return result;
    }

    function int divide(int x, int y) {
        var int q;
        if (y > x) { return 0; }
        let q = Math.divide(x, y + y);  // recursive: divide by 2y
        if (x - (q + q) * y < y) {     // remainder < y?
            return q + q;
        } else {
            return q + q + 1;
        }
    }

    function int sqrt(int x) {
        var int j, y, approx, approxSq;
        let y = 0;
        let j = 7;  // 2^7 = 128; works for 16-bit integers (sqrt(32767) < 182)
        while (~(j < 0)) {
            let approx = y + twoToThe[j];
            let approxSq = Math.multiply(approx, approx);
            if (~(approxSq > x) & (approxSq > 0)) {
                let y = approx;
            }
            let j = j - 1;
        }
        return y;
    }
}

Memory.alloc — heap management. The Virtus OS uses a simple free-list allocator:

class Memory {
    // Heap layout: BRAM addresses 0x0800 through 0x3FFF
    // Free list: singly-linked list of free blocks
    // Each block: [size, next-ptr, ...data...]
    static int heapBase;
    static int heapEnd;
    static int freeList;   // pointer to first free block

    function void init() {
        let heapBase = 2048;    // 0x0800
        let heapEnd  = 16383;   // 0x3FFF
        let freeList = heapBase;
        // Initial free block: entire heap
        let Memory.peek(freeList)     = heapEnd - heapBase - 1;  // size
        let Memory.peek(freeList + 1) = -1;                       // no next block
        return;
    }

    function int alloc(int size) {
        // First-fit: walk free list, find first block >= size+2
        var int block, prev, blockSize, next;
        let prev  = -1;
        let block = freeList;
        while (~(block = -1)) {
            let blockSize = Memory.peek(block);
            if (~(blockSize < size)) {
                // Found: split if large enough; else take whole block
                if (prev = -1) {
                    let freeList = Memory.peek(block + 1);
                } else {
                    do Memory.poke(prev + 1, Memory.peek(block + 1));
                }
                return block + 2;  // return address of data portion
            }
            let prev  = block;
            let block = Memory.peek(block + 1);
        }
        // No block found: OS error
        do Sys.error(6);
        return -1;
    }

    function void deAlloc(Array o) {
        // Return block to free list (prepend)
        var int block;
        let block = o - 2;                  // back up to block header
        do Memory.poke(block + 1, freeList);
        let freeList = block;
        return;
    }

    function int peek(int address) {
        // Direct memory read (implementation: VM 'push that N' after setting pointer)
        var Array memory;
        let memory = 0;       // base address 0
        return memory[address];
    }

    function void poke(int address, int value) {
        var Array memory;
        let memory = 0;
        let memory[address] = value;
        return;
    }
}

Output — character display. Output manages a character grid (80 columns × 25 rows for 640×200 screen). Each character is drawn as an 8×8 bitmap from a fixed font table:

class Output {
    static Array charMaps;   // character bitmaps; charMaps[c*8 + row] = 8 bits
    static int cursorCol, cursorRow;

    function void printChar(char c) {
        // Draw character bitmap at (cursorCol, cursorRow)
        var int row, bitmap, x, y;
        let row = 0;
        let x   = cursorCol * 8;
        let y   = cursorRow * 8;
        while (row < 8) {
            let bitmap = charMaps[c * 8 + row];
            // draw 8 pixels horizontally
            do Screen.drawHLine(x, y + row, bitmap);
            let row = row + 1;
        }
        // Advance cursor
        let cursorCol = cursorCol + 1;
        if (cursorCol = 80) {
            let cursorCol = 0;
            let cursorRow = cursorRow + 1;
        }
        return;
    }

    function void printInt(int i) {
        do Output.printString(String.intToString(i));
        return;
    }

    function void println() {
        let cursorCol = 0;
        let cursorRow = cursorRow + 1;
        return;
    }
}

GamePad — the Virtus OS extension. The Virtus peripheral IP pack exposes a gamepad register mapped at BRAM address 0x3FF0. GamePad.buttonPressed(n) reads the button state:

class GamePad {
    // Button register at 0x3FF0
    // Bit layout: [D-down | D-up | D-right | D-left | start | select | B | A]
    static int GP_BASE;

    function void init() {
        let GP_BASE = 16368;   // 0x3FF0
        return;
    }

    function boolean buttonPressed(int button) {
        return ~((Memory.peek(GP_BASE) & Math.twoToThe(button)) = 0);
    }
}

Compliance suite. The Virtus OS compliance suite tests each service:

# Run the compliance suite
python3 tests/os/run_compliance.py --os-dir virtus-os/ --cpu-sim vvp

# Expected output:
# Math:     PASS (8/8 tests)
# String:   PASS (6/6 tests)
# Array:    PASS (3/3 tests)
# Memory:   PASS (5/5 tests)
# Output:   PASS (4/4 tests)
# Screen:   PASS (5/5 tests)
# Keyboard: PASS (2/2 tests -- simulation uses UART input injection)
# GamePad:  PASS (2/2 tests -- simulation uses register mock)
# Sys:      PASS (3/3 tests)
# TOTAL:    38/38

CSA-101 vs CSA-110 Virtus OS. If you completed CSA-101, your Virtus OS was already written — targeting the 6502 register conventions. Porting it to RV32I-Lite changes: the frame layout for function calls (JALR vs JSR), the memory-mapped addresses for peripheral registers (same hardware, different BRAM map), and the integer width (32 bits vs 16 bits). The algorithms — shift-and-add multiply, free-list alloc, character-bitmap display — are identical.


Lab exercises

Three labs in labs/lab-13.md. Plan for ~6 hours.

  • Lab 13.1. Write virtus-os/Math.vl and virtus-os/Memory.vl. Run the Math and Memory compliance tests. Both must pass before proceeding.
  • Lab 13.2. Write virtus-os/String.vl, virtus-os/Array.vl, virtus-os/Output.vl, virtus-os/Screen.vl, virtus-os/Keyboard.vl, virtus-os/GamePad.vl, and virtus-os/Sys.vl. Run the full compliance suite. All 38 tests must pass.
  • Lab 13.3 (VCP integration). Verify at least one hardware-peripheral service on silicon: either GamePad.buttonPressed(0) returns true when button A is held (gamepad path), or Screen.drawPixel(320, 240) illuminates the center pixel on the HDMI display (HDMI path). Record the VCP verification in your Toolchain Diary.

Independent practice

  • Read Tanenbaum, Modern Operating Systems, Chapter 6 ("Deadlocks") §6.1-6.2. Your Memory.alloc implementation does not prevent deadlock (there is only one process, so deadlock cannot occur). Understand why Virtus OS does not need deadlock handling and write one paragraph in your Toolchain Diary describing the simplifying assumption.
  • Compare your Math.multiply to GCC's output for a 32-bit multiply on RV32I (no hardware multiplier). GCC produces a __mulsi3 library call — the same algorithm, generated rather than hand-written.

Architecture comparison sidebar

Virtus OS Memory.alloc vs Linux kmalloc vs C malloc.

Virtus OS uses a first-fit free-list allocator — the same algorithm Knuth describes in TAOCP Vol. 1 §2.5. First-fit is simple and correct; it is also the algorithm with worst-case O(N) allocation time and fragmentation that grows with use. For the Virtus OS use case (no long-running programs, small heap) this is acceptable.

Linux kernel's kmalloc uses a slab allocator for objects smaller than one page. Slabs are pre-allocated pools of fixed-size objects; allocation from a pool is O(1). Fragmentation is controlled because objects of the same size share a slab. The slab allocator is inappropriate for Virtus OS because it requires a much larger heap and more bookkeeping.

C library malloc (glibc) uses dlmalloc/ptmalloc: a segregated free-list with binned size classes, boundary tags for coalescing, and mmap for large allocations. This is the best-known general-purpose allocator; it balances fragmentation, throughput, and locality for real programs. The Virtus OS allocator will fragment badly for workloads that alternate between large and small allocations — this is the specific case the CSA-201 memory management module addresses.


Reflection prompts

  1. The Math.init() function precomputes powers of two into a heap array. What happens if Memory.alloc fails during Math.init because the heap is already fragmented? Is this possible on a freshly booted system? Write the boot sequence that prevents it.
  2. The compliance suite tests GamePad.buttonPressed using a register mock. What would a test miss by using a mock that a hardware test would catch? Name a specific failure mode that only the hardware reveals.
  3. Your Memory.deAlloc prepends the freed block to the free list without coalescing adjacent free blocks. After many alloc/deAlloc cycles, the heap will be fragmented into many small blocks. Write the coalescing logic that merges adjacent free blocks on dealloc.

What's next

Week 14 is the capstone week: no new technical content. It is logistics, polish, and the Tier 1 gate verification that proves the full stack runs end-to-end. The work you do in Week 14 is the same work that ships a product: integration, verification, documentation, and an honest record of what you built and what you left unfinished.