Classroom Public page

Week 7: How a Program Runs

1,315 words

The machine in motion. Fetch, decode, execute. The program counter ticks forward instruction by instruction. A branch sends it somewhere else. A store writes a value to RAM. Five instructions on paper, traced completely, is worth more than a vague understanding of "running a program."


Theme

Programs are sequences of bytes stored in memory. The CPU fetches those bytes one instruction at a time, decodes what the instruction means, and executes the appropriate operation. This is the fetch-decode-execute (FDE) cycle; it is the foundational loop that every CPU has run since the 1940s.

This week you trace 5-7 simple instructions through the FDE cycle by hand, tracking the state of the PC, IR, registers, and memory at each step. By the end, you will have a mechanical understanding of how a program "runs" that is much more concrete than "the computer does it."

Reading list (~1 hour)

  1. Petzold CODE, Ch 17 ("Automation"): Petzold constructs a simple stored-program computer step by step; read the complete chapter this time (you read the intro in week 6)
  2. Petzold CODE, Ch 21 ("Get on the Bus"): the 8080 instruction set at a high level; a real CPU's FDE cycle
  3. Optional preview: courses/csa-101/ch4-machine-language-outline.md (if available): the CSA-101 Ch 4 voice template covers the RV32I-Lite ISA at this same level

Lecture outline (~2 hours)

Section 1: The fetch-decode-execute cycle

Three steps per instruction, repeating indefinitely:

  1. Fetch: CPU reads the byte(s) at the address in the PC. Loads them into the IR. Increments the PC by the instruction size (4 bytes for a fixed-width ISA; variable for x86).
  2. Decode: Control unit reads the IR. It parses the opcode (which operation?) and operand fields (which registers? which memory address? which immediate value?). It generates control signals for step 3.
  3. Execute: The ALU or memory interface does the work specified by the decoded instruction. The result is written to a destination register or memory address. Optional: PC is modified if the instruction is a branch or jump.

Repeat from step 1.

Section 2: A minimal instruction set for tracing

For this week's lab you will use a simplified 8-instruction set:

Instruction Meaning
LOAD r, addr r = memory[addr]
STORE addr, r memory[addr] = r
ADD r1, r2, r3 r1 = r2 + r3
SUB r1, r2, r3 r1 = r2 - r3
MOV r1, imm r1 = immediate value
CMP r1, r2 set flags based on r1 - r2 (does not store result)
JMP addr PC = addr
JEQ addr if zero flag is set, PC = addr; else continue

These are enough to write a loop, a conditional branch, and a memory store. The CSA-101 RV32I-Lite instruction set is richer but uses the same concepts.

Section 3: Tracing ADD through the FDE cycle

Example program state:

  • Memory location 0x00: ADD r0, r1, r2 (instruction encoding doesn't matter for this trace)
  • r1 = 5, r2 = 3, PC = 0x00

Trace:

  • Fetch: CPU reads instruction at memory[0x00] into IR. PC becomes 0x04 (or 0x01 in a byte-addressed minimal CPU; we'll say word-addressed).
  • Decode: Control unit reads IR. Recognizes ADD; source registers r1 and r2; destination r0.
  • Execute: ALU reads r1 (5) and r2 (3). Computes 5+3=8. Writes 8 to r0. Zero flag is not set (result is nonzero).

After execute: PC=0x04 (next instruction), r0=8, r1=5, r2=3, all unchanged except r0.

Section 4: Tracing a conditional branch

Program fragment:

  • 0x00: MOV r0, 0 (r0 = 0; initialize counter)
  • 0x04: MOV r1, 5 (r1 = 5; loop limit)
  • 0x08: ADD r0, r0, r2 (r0 = r0 + r2; accumulate)
  • 0x0C: SUB r1, r1, r3 (r1 = r1 - 1; decrement counter; assume r3 = 1)
  • 0x10: CMP r1, r4 (compare r1 to 0; assume r4 = 0)
  • 0x14: JEQ 0x18 (if zero, jump to 0x18; else fall through to 0x08; wait, that goes backward)

For a "jump backward" loop: JEQ would jump forward to 0x18 (end), and JMP 0x08 would loop back. This is the structure of a for-loop at the machine level: branch-not-taken continues the loop; branch-taken exits.

Key insight: the only difference between "loop 5 times" and "loop forever" is where the branch condition is set.

Section 5: Stack and call/return

  • A call instruction saves the return address (PC+1 or PC+4) somewhere, then jumps to the called function
  • A return instruction loads the saved address back into the PC
  • The most common place to save the return address is the stack: a region of memory used in LIFO order
  • Stack pointer (SP): a register that tracks the current top of the stack. PUSH decrements SP then writes to memory[SP]. POP reads from memory[SP] then increments SP.
  • Function call: PUSH (PC+4) [saves return address]; JMP function_address
  • Function return: POP (PC) [restores return address]; effectively: PC = memory[SP]; SP++

The RV32I-Lite jalr instruction implements this in one step: it saves PC+4 into a link register and jumps to a target address simultaneously.

Forward pointer to CSA-101 Ch 7 (VM I). The stack-as-LIFO discipline above carries forward into the Virtus VM stack machine, where push and pop are the two primitive operations every other op composes on top of. If you want to see a stack execute one op at a time before CSA-101 lands you in front of it, the academy's VM Stack Machine Simulator (companion worksheet at worksheets/ch7/lab-7.5-vm-stack-machine.md) is the same abstract model rendered as a pure-browser tool. The CSA-101 worksheet is the structured drill; the tool itself can be poked freely at any time after this week's reading.

Labs (~90 minutes)

Lab 7.1: Fetch-Decode-Execute Trace on Paper (labs/lab-7-1-fetch-decode-execute.md)

  • Given a 7-instruction program using the minimal ISA from Section 2, trace each instruction through FDE
  • For each step: write the state of PC, IR, r0-r3, zero flag, and relevant memory locations
  • Identify which instruction changes the PC to a non-sequential value
  • Artifact: completed trace table committed to Git

Independent practice (~4 hours)

  1. Write a 10-instruction program in the minimal ISA that sums the integers 1 through 5 (using a loop). Trace the first 3 iterations of the loop through FDE.
  2. What happens if the JMP instruction at address 0x20 jumps to address 0x20 itself? Trace one cycle. What does the machine do next, and the cycle after that?
  3. Read the Wikipedia article "Von Neumann architecture" (specifically the "Operation" section). How does the FDE description there compare to the one in this week's lecture?
  4. Petzold CODE Ch 21 covers the 8080. Pick one 8080 instruction and trace it through FDE using Petzold's description.
  5. Preview for week 8: open a terminal (any operating system). Type ls. What happened? Write 2 sentences in your Toolchain Diary describing what you know so far about that process.

Reflection prompts (~30 minutes)

  1. After tracing 7 instructions by hand, does "running a program" feel different to you than it did before? What changed?
  2. A CPU running at 3 GHz completes about 3 billion FDE cycles per second. Your hand trace of 7 instructions took ~30 minutes. What does that tell you about the value of hardware vs human tracing?
  3. The JEQ instruction changes the PC conditionally. This is the fundamental mechanism of all branching: if, while, for, function calls. Does it feel like enough? What would be missing without it?
  4. The stack is "just memory with a convention about how to use it." What is the convention, and why does sticking to it matter?
  5. You have now traced a program through hardware. When you next write code in any language, what layer of the machine do you now have a model for?

What comes next

Weeks 8 and 9 shift to practical skills: the Linux shell. You will learn to navigate a Unix filesystem, search text with grep, pipe commands together, and write a simple bash script. These are the tools every software and security professional uses every day.