The assembler produces one object file per source file. The linker combines multiple object files into a single flat binary by resolving inter-file symbol references and patching the relocation entries the assembler left as placeholders. CSA-102's flat 6502 binary did not need a linker because everything assembled in one pass at known addresses. The RV32I-Lite stack separates assembler from linker because the standard-library functions live in separate object files.
Reading
- Petzold weave anchors. Ch 17 (fourth visit): Petzold's discussion of how the programmer graduated from hand-managing addresses to delegating that work to tools; Ch 22 (Operating Systems, p. 299) first citation: Petzold describes how the OS loads programs -- the loader is the runtime sibling of the linker; Ch 24 (Languages High and Low, p. 333) second citation: the abstraction ladder that separates source code from its address assignment. ~25 pages.
- VOF v1 spec. handouts/vof-v1-spec.md: re-read sections 3 (.symtab) and 4 (.reloc), which your linker processes.
Lecture
3 hours. Key arc:
The linking problem. Your Week-6 assembler emits main.vof for the main program. It contains a call to Output.printInt, which lives in output.vof (part of the Virtus OS stdlib). The call instruction in main.vof encodes a relative jump offset, but the assembler cannot compute this offset at assemble time because it does not know where output.vof will be placed in memory. Instead it emits a placeholder and records a relocation entry: "at offset X in .text, there is a JAL instruction that needs to be patched with the offset to symbol Output.printInt once the final layout is known."
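As a minimal sketch of what such a record could look like in memory, assuming hypothetical field names (offset, type, symbol) chosen to match the attributes the lecture's relocate() code reads. The authoritative field layout is whatever handouts/vof-v1-spec.md defines, and the "JAL" type tag here is an illustration only:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reloc:
    """One relocation entry (field names are assumptions, not the VOF v1 spec)."""
    offset: int   # byte offset of the instruction within this file's .text
    type: str     # hypothetical tag naming the instruction format to patch
    symbol: str   # name of the target symbol, resolved later by the linker

# `jal ra, 0`: opcode 0x6F, rd = x1, immediate left as zero (the placeholder).
JAL_RA_PLACEHOLDER = 0x000000EF

# "At offset 0x10 in .text there is a JAL that needs the offset to Output.printInt."
pending = Reloc(offset=0x10, type="JAL", symbol="Output.printInt")
print(pending)
```

The assembler's only job here is bookkeeping: it writes the placeholder word and remembers where it is, leaving the arithmetic to the linker.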
Symbol resolution. The linker reads all input VOF files and builds a global symbol table:
    class LinkerError(Exception):
        """Raised for link-time failures such as duplicate symbols."""

    class StaticLinker:
        def __init__(self):
            self.symbols = {}                 # name -> (vof_file, section, offset)
            self.vof_files = []
            self.load_address = 0x0000_0000   # start of text segment

        def load(self, vof_path: str):
            vof = parse_vof(vof_path)         # your VOF v1 parser
            self.vof_files.append(vof)
            for sym in vof.symtab:
                # Only global symbols are visible across files.
                if sym.is_global and sym.name in self.symbols:
                    raise LinkerError(f"Duplicate symbol: {sym.name}")
                if sym.is_global:
                    self.symbols[sym.name] = (vof, sym.section, sym.offset)

        def layout(self):
            # Assign final addresses: concatenate all .text sections, then .data
            addr = self.load_address
            for vof in self.vof_files:
                vof.text_base = addr
                addr += len(vof.text)
            for vof in self.vof_files:
                vof.data_base = addr
                addr += len(vof.data)
            self.total_size = addr - self.load_address
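The layout pass above can be checked by hand on a small case. The following sketch reimplements the same two loops as a free function and runs them on two stand-in objects. The section sizes (24 and 40 bytes of text, 8 and 4 bytes of data) are made-up numbers for illustration, and SimpleNamespace stands in for whatever parse_vof returns:

```python
from types import SimpleNamespace

def layout(vof_files, load_address=0):
    """Assign text_base/data_base: all .text sections first, then all .data."""
    addr = load_address
    for vof in vof_files:
        vof.text_base = addr
        addr += len(vof.text)
    for vof in vof_files:
        vof.data_base = addr
        addr += len(vof.data)
    return addr - load_address   # total image size in bytes

# Hypothetical section sizes, chosen only to make the arithmetic visible.
main = SimpleNamespace(text=bytearray(24), data=bytearray(8))
output = SimpleNamespace(text=bytearray(40), data=bytearray(4))

total = layout([main, output])
print(hex(main.text_base), hex(output.text_base))   # 0x0 0x18
print(hex(main.data_base), hex(output.data_base))   # 0x40 0x48
print(total)                                        # 76
```

Note that output's text lands at 0x18 only because main was listed first; reversing the input order changes every base address, which is why relocation has to wait until layout is done.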
Relocation patching. After layout, each .reloc entry can be resolved:
        def relocate(self):
            for vof in self.vof_files:
                for reloc in vof.reloc:
                    # reloc.symbol is the name of the target
                    if reloc.symbol not in self.symbols:
                        raise LinkerError(f"undefined symbol: {reloc.symbol}")
                    target_vof, section, offset = self.symbols[reloc.symbol]
                    if section == 'text':
                        target_addr = target_vof.text_base + offset
                    else:
                        target_addr = target_vof.data_base + offset
                    # Final address of the instruction being patched
                    patch_addr = vof.text_base + reloc.offset
                    # Compute the PC-relative offset for JAL or call pseudoinstruction
                    pc_relative = target_addr - patch_addr
                    # Patch the instruction word in the text section
                    # (vof.text must be a mutable bytearray for the slice assignment)
                    instr_word = int.from_bytes(vof.text[reloc.offset:reloc.offset+4], 'little')
                    patched = apply_reloc(instr_word, reloc.type, pc_relative)
                    vof.text[reloc.offset:reloc.offset+4] = patched.to_bytes(4, 'little')
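The patching loop above calls apply_reloc without defining it. Here is a minimal sketch for the JAL case only, assuming RV32I-Lite keeps the standard RV32I J-type immediate layout (imm[20|10:1|11|19:12] in bits 31:12) and that the relocation type is carried as the string "JAL". Both are assumptions; check them against the VOF v1 spec and your Week-6 encoder:

```python
def apply_reloc(instr_word: int, reloc_type: str, pc_relative: int) -> int:
    """Return instr_word with its J-type immediate set to pc_relative."""
    if reloc_type != "JAL":   # hypothetical type tag; only JAL is sketched here
        raise ValueError(f"unsupported relocation type: {reloc_type}")
    if pc_relative % 2 != 0:
        raise ValueError("JAL offset must be a multiple of 2")
    if not -(1 << 20) <= pc_relative < (1 << 20):
        raise ValueError("JAL target out of range (+/- 1 MiB)")

    imm = pc_relative & 0x1FFFFF              # 21-bit two's complement
    encoded = (
        (((imm >> 20) & 0x1) << 31)           # imm[20]    -> bit 31
        | (((imm >> 1) & 0x3FF) << 21)        # imm[10:1]  -> bits 30:21
        | (((imm >> 11) & 0x1) << 20)         # imm[11]    -> bit 20
        | (((imm >> 12) & 0xFF) << 12)        # imm[19:12] -> bits 19:12
    )
    # Keep opcode and rd (bits 11:0), replace the immediate field.
    return (instr_word & 0x00000FFF) | encoded

placeholder = 0x000000EF                      # jal ra, 0
print(hex(apply_reloc(placeholder, "JAL", 8)))    # 0x8000ef   (jal ra, +8)
print(hex(apply_reloc(placeholder, "JAL", -4)))   # 0xffdff0ef (jal ra, -4)
```

Masking out the old immediate before OR-ing in the new one means the result depends only on the opcode, rd, and the offset, not on whatever bits the placeholder held.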
The flat binary output. After resolution and patching, concatenate all .text sections, then all .data sections, into a single binary file: the input to the Tang Primer 25K BRAM initializer.
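A sketch of that concatenation step, written as a free function over stand-in objects. It assumes each parsed file exposes text and data as byte sequences (as the lecture code does) and that the BRAM initializer accepts a raw byte image; if your flow needs a hex or .mi format instead, that conversion is a separate step not shown here:

```python
from types import SimpleNamespace

def emit_binary(vof_files) -> bytes:
    """Concatenate every .text section, then every .data section, in input order."""
    image = bytearray()
    for vof in vof_files:
        image += vof.text
    for vof in vof_files:
        image += vof.data
    return bytes(image)

# Stand-ins for two already-relocated files (contents are illustrative only):
#   main: `jal ra, +4` (0x004000EF, little-endian) plus one data byte
#   util: `ret`        (0x00008067, little-endian), no data
main = SimpleNamespace(text=bytearray(b"\xEF\x00\x40\x00"), data=bytearray(b"\x2A"))
util = SimpleNamespace(text=bytearray(b"\x67\x80\x00\x00"), data=bytearray())

image = emit_binary([main, util])
print(image.hex())   # ef004000678000002a
```

The iteration order must match the order layout() used, otherwise the addresses baked into the patched instructions no longer describe where the bytes actually sit.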
Booting two files. The simplest possible two-file link: main.s calls a function defined in util.s. Assemble both. Link them. Boot the result on silicon. This is the first time two separately-assembled programs run as one.
The 6502 comparison. The Py6502v compiler in CSA-102 assembled everything into a single flat binary for the 6502's fixed address space. The stdlib was either copied into the output or linked at known fixed addresses using a simple symbol-patch pass. There was no VOF format and no relocation table because the 6502's address space was small enough to assign all addresses at assembly time. The RV32I-Lite's separation of assembly from linking is a prerequisite for the VM translator and compiler in later weeks, which need to emit calls to standard-library functions without knowing their final addresses.
Lab exercises
Five labs in labs/lab-7.md. Plan for ~5 hours.
- Lab 7.1. Write linker.py: the load(), layout(), relocate(), and emit_binary() methods. Use the VOF v1 spec for the data format.
- Lab 7.2. Link sum-to-N.vof against math-stub.vof (a stub for the Math service). Verify the output binary is identical to the manually-concatenated version from Lab 6.4.
- Lab 7.3. Boot the linked binary on the Tang Primer 25K and observe the sum-to-N result on UART. This is the first end-to-end run: source → assembler → linker → binary → silicon.
- Lab 7.4 (forward promise: register allocator). The linker currently places all text sections in the order the input files were specified. This means the code layout affects the jump distances in all branch and jump instructions. Write a two-paragraph note in your Toolchain Diary predicting what a register allocator in CSA-201 will need from the linker to place hot code paths near each other. This note is the "forward promise" that CSA-201 Module 3 closes.
- Lab 7.5. Deliberately break one relocation entry (change the symbol name in the reloc table to a non-existent symbol) and run the linker. Verify it raises LinkerError: undefined symbol. This is your first undefined-symbol debugging session; it will happen for real in Week 13 when the stdlib calls are wired in.
Independent practice
- Read Petzold Ch 22 (first 15 pages). Petzold describes the OS loader, which is the runtime version of the linker: it reads the binary, assigns the correct virtual addresses, and starts execution. Your linker is the static (build-time) sibling of this loader.
- Read Bryant and O'Hallaron, Computer Systems: A Programmer's Perspective, Chapter 7 (Linking), Sections 7.1-7.5. This is the most thorough treatment of static linking available at the undergraduate level; Sections 7.1-7.5 cover exactly what your linker does.
- Compare your linker's symbol table format against the ELF .symtab format. Note one thing ELF does that VOF v1 elides, and one thing they share exactly.
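As a starting point for that comparison, here is a sketch of one 32-bit ELF symbol table entry (Elf32_Sym) packed with Python's struct module. The field order and the STB_GLOBAL/STT_FUNC constants follow the ELF specification; the concrete values (name offset 1, address 0x18, size 40, section index 1) are invented for illustration:

```python
import struct

# Elf32_Sym, little-endian: st_name, st_value, st_size, st_info, st_other, st_shndx
ELF32_SYM = struct.Struct("<IIIBBH")

STB_GLOBAL = 1   # symbol binding: visible to other object files
STT_FUNC = 2     # symbol type: function

def pack_sym(name_off, value, size, bind, typ, shndx):
    """Pack one Elf32_Sym; binding and type share the single st_info byte."""
    st_info = (bind << 4) | (typ & 0xF)
    return ELF32_SYM.pack(name_off, value, size, st_info, 0, shndx)

raw = pack_sym(name_off=1, value=0x18, size=40,
               bind=STB_GLOBAL, typ=STT_FUNC, shndx=1)
st_name, st_value, st_size, st_info, st_other, st_shndx = ELF32_SYM.unpack(raw)

print(ELF32_SYM.size)                # 16
print(st_info >> 4, st_info & 0xF)   # 1 2
```

Set this next to section 3 of the VOF v1 spec and compare field by field: which of these six fields has a counterpart in your .symtab, and which does not?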
Architecture comparison sidebar
Static linking vs dynamic linking vs Py6502v's flat model.
CSA-102's Py6502v compiler linked everything statically into a flat binary for the 6502. "Statically" here means all library code was included in the output file and all addresses were resolved at build time. The binary was self-contained; no runtime loader had to resolve anything.
RV32I-Lite's CSA-110 linker is also a static linker: all code from all object files is included in the output binary and all addresses are resolved before the binary runs. The advantage over CSA-102's flat approach is that multiple object files can be assembled independently and linked later, so you can ship a stdlib as pre-assembled .vof files and link them with user code without recompiling the stdlib.
Dynamic linking (ELF shared libraries, Windows DLLs, macOS dylibs) defers some symbol resolution to load time or runtime. When a Linux program calls printf, it links against libc.so.6 but does not include printf's code in the executable. The dynamic linker resolves the printf symbol by searching the loaded shared libraries, either at load time or lazily on the first call. At runtime, each call to printf goes through a trampoline in the PLT (Procedure Linkage Table), which jumps through a GOT (Global Offset Table) entry that the dynamic linker has filled in with printf's address.
CSA-110 uses static linking because it produces simpler, more auditable binaries and avoids the runtime overhead of dynamic linking. CSA-201 introduces shared libraries as an advanced topic once you understand why static linking works.
Reflection prompts
- Your linker uses a dictionary to store global symbols. What happens if two object files export a symbol with the same name? Your Lab 7.1 linker should raise a LinkerError; Linux's ld likewise rejects duplicate strong definitions of a global symbol, reporting it as a link error rather than a compiler error. What design decision would allow two symbols to have the same name without conflict?
- The VOF v1 relocation table records each location that needs patching. What would happen if a patching operation was applied twice to the same location? (This can happen if the linker processes the same input file twice due to a bug.) Design a test for this bug.
- Lab 7.3 was the first time you booted code that two separate source files contributed. What would have happened at run time if the linker had placed the .text sections in the wrong order? Why doesn't the order matter for correctness here, even though it matters for branch range?
What's next
The hardware (Weeks 1-5) is done. The assembler and linker (Weeks 6-7) are done. Week 8 begins the VM translator: a translation layer that sits above assembly and below the compiler. The VM translator takes a stack-based bytecode (similar in spirit to JVM bytecode) and emits RV32I-Lite assembly. This is the same structural position as the Py6502v compiler in CSA-102, but now with a clean intermediate representation between the compiler front end and the assembly back end.