Week 4: Machine Language

The bytes carry meaning. By the end of the week you can hand-encode ten RV32I-Lite instructions into 32-bit words, decode ten words back into mnemonics, write a complete sum-to-N program in assembly, reconcile your handwork against a real toolchain, and load your binary into Ghidra to see what a disassembler sees.

This chapter establishes the course's voice: the Ch 4 chapter prose is the canonical template that every subsequent chapter follows.


Reading

  • Chapter prose (primary). draft-chapters/ch4-machine-language-prose.md. The canonical voice template; read carefully
  • Petzold weave anchors. Ch 17 Automation (first citation of the automation thread; introduces "let the bytes carry meaning"); Ch 19 Two Classic Microprocessors (opcode tables in hex; the 8080 MOV bit pattern 01dddsss shows that structure-through-encoding is universal). ~32 pages
  • Cross-chapter handouts. RV32I-Lite encoding card, the instruction-set contract for everything from here forward

Lecture

lectures/ch4-machine-language-lecture.md. 3 hours. Key arc:

  • The instruction is a 32-bit word. Different bits mean different things depending on the opcode
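  • The four core RV32I instruction formats: R-type (register-register), I-type (register-immediate), S-type (store), U-type (upper immediate). Each format packs the bits differently; the B and J strips on the encoding card are variants of S and U with the immediate bits reordered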
  • RV32I-Lite is a teaching subset: 11 instructions plus 8 pseudo-instructions. You can hold the whole ISA in your head
  • The assembly mnemonic is a one-to-one translation of the encoded word. Assembly is not magic; it is just a more readable spelling of the same bits
  • Forward pointer to Ch 6: the assembler is the tool that does the translation for you
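The "same bits, different spelling" point can be sketched in a few lines of Python. This is a minimal illustration, not the course's tooling: the helper names (`encode_r`, `encode_s`, `fields`) are made up for this example, the field positions are the standard RV32I ones, and it assumes `add` and `sw` are among the eleven RV32I-Lite instructions, which this page does not list.

```python
# Sketch only: helper names are hypothetical; field positions follow standard RV32I.

def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack an R-type word: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

def encode_s(imm, rs2, rs1, funct3, opcode):
    """Pack an S-type word: the 12-bit immediate is split around the register fields."""
    imm &= 0xFFF                      # keep the low 12 bits (two's complement)
    imm_hi = imm >> 5                 # imm[11:5] goes to bits [31:25]
    imm_lo = imm & 0x1F               # imm[4:0] goes to bits [11:7]
    return (imm_hi << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (imm_lo << 7) | opcode

def fields(word):
    """Slice a 32-bit word at the fixed R-type field boundaries."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
        "funct7": (word >> 25) & 0x7F,
    }

add_word = encode_r(0b0000000, 2, 1, 0b000, 3, 0b0110011)  # add x3, x1, x2
sw_word = encode_s(8, 2, 1, 0b010, 0b0100011)              # sw x2, 8(x1)

print(f"add x3, x1, x2 -> 0x{add_word:08x}")  # 0x002081b3
print(f"sw  x2, 8(x1)  -> 0x{sw_word:08x}")   # 0x0020a423
print(fields(add_word))
```

Note that rs1 and rs2 sit at the same bit positions in both formats; only the bits around them change meaning, which is what lets a decoder read the register fields before it knows the format.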

Encoding card at a glance

RV32I-Lite encoding card. Six 32-bit-wide bit-field strips: R, I, S, B (real in RV32I-Lite) and U, J (reserved for CSA-201). The I-format immediate field is amber-highlighted because the worked example `addi x1, x0, 5` lives in that field.

Figure 4.1. The pinup the lectures and the labs both point at. Same content as the ASCII tables in the encoding card handout, drawn to scale so the bit-position regularities (opcode always at [6:0]; rs1 always at [19:15]; sign bit always at 31) read off the page. The amber field is the immediate Lab 4.1 exercises first.
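The worked example `addi x1, x0, 5` can be checked against the card with a short Python sketch. The function name `encode_i` is hypothetical, and the opcode and funct3 values are the standard RV32I ones for `addi`; treat this as a way to check handwork, not as the course's reference encoder.

```python
# Sketch only: encode_i is a made-up helper; values follow the standard RV32I I-type layout.

def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack an I-type word: imm[11:0] | rs1 | funct3 | rd | opcode."""
    if not -2048 <= imm <= 2047:
        raise ValueError("I-type immediate must fit in 12 signed bits")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

word = encode_i(imm=5, rs1=0, funct3=0b000, rd=1, opcode=0b0010011)  # addi x1, x0, 5

bits = f"{word:032b}"
# Print the word split at the I-type field boundaries: imm | rs1 | funct3 | rd | opcode
print(bits[0:12], bits[12:17], bits[17:20], bits[20:25], bits[25:32])
print(f"0x{word:08x}")  # 0x00500093
```

The first group printed is the amber immediate field: `000000000101` is 5, and its leftmost bit is bit 31 of the word, the sign bit.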

Lab exercises

Six labs in worksheets/ch4/. Lab 4.5 is the first Ghidra encounter of the course; Ghidra returns in Ch 6, Ch 6a, and Ch 10.

Plan for ~6 hours of lab. Lab 4.6 is a Tier-1 calibration that pairs naturally with Lab 4.1 and Lab 4.2; many students find taking 4.6 between 4.1 and 4.3 makes the bit-field shuffles automatic by the time the sum-to-N program ships.

Independent practice

  • Read Petzold Ch 17. This is the first of six visits to Ch 17 across the course; future weeks return for different theses
  • Read Petzold Ch 19 pp. 271-272. Notice the 8080 MOV encoding 01dddsss. The structure-through-encoding insight is universal across ISAs
  • Update your Toolchain Diary. Week 4 introduces: hand-encoding by reading the encoding card, xxd for hex dumps, riscv32-unknown-elf-as, riscv32-unknown-elf-objdump, and Ghidra (your first professional reverse-engineering tool)
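One thing worth noting in the diary when you compare hand-encoded words with a hex dump: RISC-V instructions are stored little-endian, so the bytes appear in the file in the reverse of the order you write the word. A small Python sketch, assuming your binary is a raw image of 32-bit instruction words (the word used is the `addi x1, x0, 5` example from the encoding card):

```python
import struct

word = 0x00500093                  # addi x1, x0, 5 as a 32-bit word

raw = struct.pack("<I", word)      # "<I" = little-endian unsigned 32-bit
print(raw.hex(" "))                # 93 00 50 00  (byte order as stored in the file)

# Going the other way: four bytes read from a dump, back to the word you encoded.
(decoded,) = struct.unpack("<I", bytes([0x93, 0x00, 0x50, 0x00]))
print(f"0x{decoded:08x}")          # 0x00500093
```

If a dump of your binary shows the word's bytes in the opposite order from your worksheet, check the byte order before you suspect the encoding.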

Architecture comparison sidebar

RV32I-Lite has 11 instructions plus 8 pseudo-instructions. Full RV32I has 47 instructions. x86_64 has thousands (the legacy encoding is variable-length and accumulated over 40 years). ARMv8 sits between (~400 instructions in the base ISA). The RV32I-Lite count is a teaching choice; CSA-201 expands you to full RV32I, where you meet the rest of the ladder.

Reflection prompts

  1. The four instruction formats pack bits differently. Why didn't the RV32I designers pick one format and pad?
  2. Petzold's 8080 has 01dddsss for MOV; RV32I-Lite has different bit positions for register fields. Both work; both ship. What does this say about the freedom designers have when they pick an encoding?
  3. Hand-encoding by reading the encoding card is slow and error-prone. Why did the course just spend two hours making you do it?

What's next

Week 5 combines the hardware from weeks 1-3 with the instruction set from week 4. You wire the instruction decoder; you integrate it with your ALU and register file; you synthesize the full CPU to a Tang Primer 25K bitstream; you flash it; you run the sum-to-N program you wrote this week on silicon you designed yourself.