The bytes carry meaning. By the end of the week you can hand-encode ten RV32I-Lite instructions into 32-bit words, decode ten words back into mnemonics, write a complete sum-to-N program in assembly, reconcile your handwork against a real toolchain, and load your binary into Ghidra to see what a disassembler sees.
This chapter establishes the course's voice: the Ch 4 chapter prose is the canonical template every subsequent chapter follows.
Reading
- Chapter prose (primary). draft-chapters/ch4-machine-language-prose.md. The canonical voice template; read carefully
- Petzold weave anchors. Ch 17 Automation (first citation of the automation thread; introduces "let the bytes carry meaning"); Ch 19 Two Classic Microprocessors (opcode tables in hex; the 8080 MOV bit pattern 01dddsss shows that structure-through-encoding is universal). ~32 pages
- Cross-chapter handouts. RV32I-Lite encoding card, the instruction-set contract for everything from here forward
Lecture
lectures/ch4-machine-language-lecture.md. 3 hours. Key arc:
- The instruction is a 32-bit word. Different bits mean different things depending on the opcode
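- The four core RV32I instruction formats: R-type (register-register), I-type (register-immediate), S-type (store), U-type (upper immediate). Each format packs the bits differently; the B-type branch format that Lab 4.6 exercises is a variant of S-type with the immediate bits rearranged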
- RV32I-Lite is a teaching subset: 11 instructions plus 8 pseudo-instructions. You can hold the whole ISA in your head
- The assembly mnemonic is a one-to-one translation of the encoded word. Assembly is not magic; it is just a more readable spelling of the same bits
- Forward pointer to Ch 6: the assembler is the tool that does the translation for you
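To make "a more readable spelling of the same bits" concrete, here is a minimal Python sketch of R-type and I-type packing. It uses the standard RV32I field positions and the standard encodings of add and addi; that both instructions belong to the RV32I-Lite subset is an assumption, and the helper names encode_r and encode_i are hypothetical, not part of any course tool.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack an R-type word: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack an I-type word: imm[11:0] | rs1 | funct3 | rd | opcode."""
    if not -2048 <= imm <= 2047:
        raise ValueError("I-type immediate must fit in 12 signed bits")
    # Masking to 12 bits gives the two's-complement pattern for negatives.
    return (((imm & 0xFFF) << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


OP_REG = 0b0110011  # standard RV32I opcode for register-register ALU ops
OP_IMM = 0b0010011  # standard RV32I opcode for register-immediate ALU ops

# add x3, x1, x2
print(f"{encode_r(0b0000000, 2, 1, 0b000, 3, OP_REG):#010x}")  # 0x002081b3
# addi x1, x0, 5
print(f"{encode_i(5, 0, 0b000, 1, OP_IMM):#010x}")             # 0x00500093
# addi x1, x1, -1
print(f"{encode_i(-1, 1, 0b000, 1, OP_IMM):#010x}")            # 0xfff08093
```

Each shift amount is the low bit position of its field on the encoding card, so the function body reads as a transcription of the card row.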
Encoding card at a glance
Figure 4.1. The pinup the lectures and the labs both point at. Same content as the ASCII tables in the encoding card handout, drawn to scale so the bit-position regularities (opcode always at [6:0]; rs1 always at [19:15]; sign bit always at 31) read off the page. The amber field is the immediate Lab 4.1 exercises first.
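The regularities the figure highlights can be checked mechanically. The sketch below, in Python, pulls out the three fixed-position fields named above and sign-extends an I-type immediate. The helper names are hypothetical, and the two sample words are the standard RV32I encodings of addi x1, x0, 5 and addi x1, x1, -1 (assumed here to be in the RV32I-Lite subset).

```python
def bits(word, hi, lo):
    """Return bits [hi:lo] of a 32-bit word as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def fixed_fields(word):
    """Extract the fields whose positions do not depend on the format."""
    return {
        "opcode": bits(word, 6, 0),
        "rs1": bits(word, 19, 15),  # meaningful only in formats that have rs1
        "sign": bits(word, 31, 31),
    }


def i_immediate(word):
    """Sign-extend the 12-bit I-type immediate stored in bits [31:20]."""
    imm = bits(word, 31, 20)
    return imm - 0x1000 if imm & 0x800 else imm


print(fixed_fields(0x00500093))  # {'opcode': 19, 'rs1': 0, 'sign': 0}
print(i_immediate(0x00500093))   # 5
print(fixed_fields(0xFFF08093))  # {'opcode': 19, 'rs1': 1, 'sign': 1}
print(i_immediate(0xFFF08093))   # -1
```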
Lab exercises
Six labs in worksheets/ch4/. Lab 4.5 is the first Ghidra encounter of the course; Ghidra returns in Ch 6, Ch 6a, and Ch 10.
- lab-4.1-hand-encode-ten-instructions.md
- lab-4.2-hand-decode-ten-words.md
- lab-4.3-sum-to-n-assembly.md
- lab-4.4-real-toolchain-reconciliation.md, confirm your encoding matches riscv32-unknown-elf-as
- lab-4.5-ghidra-first-encounter.md, the first time the student sees a real disassembler
- lab-4.6-rv32i-encoder.md, Tier-1 companion: drive the RV32I-Lite encoder/decoder tool through R/I/S/B-format hand-encodes plus one hex decode; predict-then-verify each bit field.
Plan for ~6 hours of lab. Lab 4.6 is a Tier-1 calibration that pairs naturally with Lab 4.1 and Lab 4.2; many students find taking 4.6 between 4.1 and 4.3 makes the bit-field shuffles automatic by the time the sum-to-N program ships.
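The "bit-field shuffles" are the S-format and B-format immediates, which are split across the word instead of stored contiguously. A minimal Python sketch of both, using the standard RV32I layouts; the helper names are hypothetical, and it is an assumption that sw and bne are among the instructions the labs use.

```python
def encode_s(imm, rs2, rs1, funct3, opcode=0b0100011):
    """S-type: imm[11:5] goes to bits [31:25], imm[4:0] to bits [11:7]."""
    if not -2048 <= imm <= 2047:
        raise ValueError("S-type immediate must fit in 12 signed bits")
    imm &= 0xFFF
    return (((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode)


def encode_b(imm, rs2, rs1, funct3, opcode=0b1100011):
    """B-type: a 13-bit even offset; bit 0 is implied zero and not stored."""
    if imm % 2 or not -4096 <= imm <= 4094:
        raise ValueError("B-type offset must be even and fit in 13 signed bits")
    imm &= 0x1FFF
    return ((((imm >> 12) & 0x1) << 31)    # imm[12]   -> bit 31 (sign)
            | (((imm >> 5) & 0x3F) << 25)  # imm[10:5] -> bits [30:25]
            | (rs2 << 20) | (rs1 << 15) | (funct3 << 12)
            | (((imm >> 1) & 0xF) << 8)    # imm[4:1]  -> bits [11:8]
            | (((imm >> 11) & 0x1) << 7)   # imm[11]   -> bit 7
            | opcode)


# sw x2, 8(x1)
print(f"{encode_s(8, 2, 1, 0b010):#010x}")   # 0x0020a423
# bne x1, x0, -8  (branch back two instructions)
print(f"{encode_b(-8, 0, 1, 0b001):#010x}")  # 0xfe009ce3
```

Predict each field by hand first, then compare; the comments on encode_b list where every immediate bit lands.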
Independent practice
- Read Petzold Ch 17. This is the first of six visits to Ch 17 across the course; future weeks return for different theses
- Read Petzold Ch 19 pp. 271-272. Notice the 8080 MOV encoding 01dddsss. The structure-through-encoding insight is universal across ISAs
- Update your Toolchain Diary. Week 4 introduces: hand-encoding by reading the encoding card, xxd for hex dumps, riscv32-unknown-elf-as, riscv32-unknown-elf-objdump, and Ghidra (your first professional reverse-engineering tool)
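One thing worth noting in the diary: RISC-V stores instructions little-endian, so the bytes in a hex dump appear in the reverse order of the word you wrote by hand. A small Python sketch of the conversion in both directions; the helper names are hypothetical, and the sample word is the standard RV32I encoding of addi x1, x0, 5.

```python
import struct


def words_to_bytes(words):
    """Serialize 32-bit instruction words in little-endian byte order."""
    return b"".join(struct.pack("<I", w) for w in words)


def bytes_to_words(data):
    """Recover 32-bit words from a little-endian byte string."""
    return [w for (w,) in struct.iter_unpack("<I", data)]


image = words_to_bytes([0x00500093])
print(" ".join(f"{b:02x}" for b in image))          # 93 00 50 00
print([f"{w:#010x}" for w in bytes_to_words(image)])  # ['0x00500093']
```

A raw hex dump of the assembled code shows the bytes in the first order; a disassembly listing typically prints the reassembled word.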
Architecture comparison sidebar
RV32I-Lite has 11 instructions plus 8 pseudo-instructions. Full RV32I has 47 instructions by the older count that includes FENCE.I and the six CSR instructions (40 in the current ratified base). x86_64 has thousands (the legacy encoding is variable-length and accumulated over 40 years). ARMv8 sits between (~400 instructions in the base ISA). The RV32I-Lite count is a teaching choice; CSA-201 expands you to full RV32I, where you meet the rest of the ladder.
Reflection prompts
- The four instruction formats pack bits differently. Why didn't the RV32I designers pick one format and pad?
- Petzold's 8080 has 01dddsss for MOV; RV32I-Lite has different bit positions for register fields. Both work; both ship. What does this say about the freedom designers have when they pick an encoding?
- Hand-encoding by reading the encoding card is slow and error-prone. Why did the course just spend two hours making you do it?
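For the second prompt, it can help to see how little code the 8080 pattern needs. A minimal Python sketch of decoding 01dddsss, using the 8080's standard register codes; the function name is hypothetical. The one irregularity is 0x76, which fits the pattern but is HLT, not MOV M,M.

```python
# 8080 register codes 000..111; M means "memory at address HL".
REGS_8080 = ["B", "C", "D", "E", "H", "L", "M", "A"]


def decode_8080_mov(byte):
    """Decode one 8080 opcode byte if it matches 01dddsss, else return None."""
    if (byte >> 6) != 0b01:
        return None
    if byte == 0x76:  # the 01 110 110 slot is reused for HLT
        return "HLT"
    ddd = (byte >> 3) & 0b111  # destination register field
    sss = byte & 0b111         # source register field
    return f"MOV {REGS_8080[ddd]},{REGS_8080[sss]}"


print(decode_8080_mov(0x78))  # MOV A,B
print(decode_8080_mov(0x41))  # MOV B,C
print(decode_8080_mov(0x76))  # HLT
```

Two 3-bit fields in one byte on the 8080, 5-bit fields at fixed positions in a 32-bit word on RV32I: the same shift-and-mask idea, with different choices of where the fields live.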
What's next
Week 5 combines the hardware from weeks 1-3 with the instruction set from week 4. You wire the instruction decoder; you integrate it with your ALU and register file; you synthesize the full CPU to a Tang Primer 25K bitstream; you flash it; you run the sum-to-N program you wrote this week on silicon you designed yourself.