Control flow: conditional jumps, loops as backward jumps, and switch statements as jump tables. These are the patterns that let you reconstruct C from disassembly.
Reading (~45 min)
From Yurichev RE4B: read the chapters on "If-then-else," "Loops," and "Switch statement." Yurichev walks through each pattern -- the C source, the compiler output at various optimisation levels, and the conceptual mapping. Read at least the x86-64 sections; the 32-bit sections are useful but optional for RE-011.
From OST2 Architecture 1001: complete the control flow and flags modules.
Lecture outline (~1.5 hr)
Part 1: The flags register and conditional jumps (25 min)
x86-64 has an RFLAGS register whose individual bits record the result of the most recent arithmetic, logical, or comparison instruction. The bits relevant to control flow:
- ZF (zero flag): set if the result was zero
- SF (sign flag): set if the result was negative (high bit set)
- CF (carry flag): set if the operation produced a carry or, for subtraction, a borrow (unsigned overflow)
- OF (overflow flag): set if the operation produced signed overflow
The cmp a, b instruction subtracts b from a and sets the flags without storing the result. The test a, b instruction performs a bitwise AND of a and b and sets the flags, also without storing the result.
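The flag definitions above can be made concrete with a small C model of a 32-bit cmp. This is only a sketch: the struct flags type and the cmp32 name are hypothetical helpers invented for illustration, and the model covers just the four flags listed here.

```c
#include <stdint.h>

/* Hypothetical helper type: the four flags discussed above. */
struct flags { int zf, sf, cf, of; };

/* Model the flag results of a 32-bit `cmp a, b`:
   compute a - b, keep the flags, discard the result. */
struct flags cmp32(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;                       /* wraps modulo 2^32, like the CPU */
    struct flags f;
    f.zf = (r == 0);                          /* result was zero */
    f.sf = (int)(r >> 31);                    /* high bit of the result */
    f.cf = (a < b);                           /* unsigned borrow */
    f.of = (int)(((a ^ b) & (a ^ r)) >> 31);  /* signed overflow on subtraction */
    return f;
}
```

Calling cmp32(0xFFFFFFFF, 1) gives SF=1, OF=0, CF=0: read as signed values, -1 is less than 1 (SF != OF), but read as unsigned values, 0xFFFFFFFF is not below 1 (CF=0).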
After cmp or test, conditional jump instructions read the flags:
| Instruction | Condition | Typical C |
|---|---|---|
| je / jz | ZF=1 | if (a == b) |
| jne / jnz | ZF=0 | if (a != b) |
| jl / jnge | SF != OF | if (a < b) (signed) |
| jle / jng | ZF=1 or SF != OF | if (a <= b) (signed) |
| jg / jnle | ZF=0 and SF == OF | if (a > b) (signed) |
| jge / jnl | SF == OF | if (a >= b) (signed) |
| jb / jnae | CF=1 | if (a < b) (unsigned) |
| ja / jnbe | CF=0 and ZF=0 | if (a > b) (unsigned) |
| js | SF=1 | result is negative |
| jns | SF=0 | result is non-negative |
The signed vs. unsigned distinction (e.g., jl vs. jb) matters when reconstructing C: it tells you whether the operands are signed or unsigned values in the original source.
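To see why the distinction matters, here is a minimal C sketch with the same bit pattern compared both ways. The function names are hypothetical. The instruction choices in the comments describe what compilers typically emit, not a guarantee for any particular build.

```c
#include <stdint.h>

/* Signed compare: typically compiled with the signed condition codes (jl, setl, ...). */
int less_signed(int32_t a, int32_t b)
{
    return a < b;
}

/* Unsigned compare: typically compiled with the unsigned condition codes (jb, setb, ...). */
int less_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```

With the bit pattern 0xFFFFFFFF as the first operand and 1 as the second, less_signed sees -1 < 1 and returns 1, while less_unsigned sees 4294967295 < 1 and returns 0. Misreading a jb as a jl in a bounds check would flip your understanding of which inputs pass.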
Part 2: Control flow patterns (30 min)
if/else:
cmp rdi, 0 ; arg1 == 0?
je .else_branch ; if so, skip to else: the C is if (arg1 != 0) { then } else { else }
; ... then branch code ...
jmp .end
.else_branch:
; ... else branch code ...
.end:
Reading this in disassembly: find the cmp or test, find the conditional jump, follow both paths. The fall-through path (no jump) is the "then" branch; the jump target is the "else" branch (or vice versa if the condition is inverted). A jmp at the end of the "then" branch skips over the "else" branch.
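One way to practise this reading is to translate the assembly literally into C with one goto per jump, then restructure it. The sketch below does both for the if/else listing above. The function names and the return values 1 and 2 are placeholders standing in for the elided branch bodies.

```c
/* Literal translation: one goto per jump, labels where the assembly has them. */
long branch_goto(long arg1)
{
    long result;
    if (arg1 == 0) goto else_branch;   /* cmp rdi, 0 / je .else_branch */
    result = 1;                        /* "then" body (placeholder value) */
    goto end;                          /* jmp .end */
else_branch:
    result = 2;                        /* "else" body (placeholder value) */
end:
    return result;
}

/* Structured equivalent: the fall-through path becomes the "then" block,
   so the C condition is the inverse of the jump condition. */
long branch_structured(long arg1)
{
    if (arg1 != 0)
        return 1;
    else
        return 2;
}
```

Both functions return the same value for every input. The je tests arg1 == 0, but the structured source reads arg1 != 0 because the jump is the path that skips the "then" block.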
Loops -- all loops are backward jumps:
A loop in assembly is just a jump that goes backward (to a lower address than the jump instruction itself). The backward jump may be conditional, as in a do-while, or unconditional with a separate conditional exit, as in the while loop below. Any time you see a jump pointing backward in the disassembly, assume a loop until proven otherwise.
; while (counter < limit) { body; counter++; }
mov ecx, 0 ; counter = 0
.loop_top:
cmp ecx, edi ; counter < limit?
jge .loop_exit ; if not, exit
; ... loop body ...
inc ecx ; counter++
jmp .loop_top ; back to condition check
.loop_exit:
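For-loop pattern: same structure; init before the loop, increment at the bottom, condition at the top. Do-while: the body comes before the condition check, so the body always executes at least once; the condition check at the bottom is itself the conditional backward jump. Optimising compilers often rewrite while and for loops into this bottom-tested form, preceded by a single guard check.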
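The difference between the top-tested and bottom-tested shapes is easiest to see in goto-form C. The following is a sketch with hypothetical function names: each function counts how many times its body runs, and the comments map each line to the kind of instruction it stands for.

```c
/* while-style: the condition is tested before the first iteration. */
int count_while(int limit)
{
    int counter = 0, iterations = 0;
loop_top:
    if (counter >= limit) goto loop_exit;  /* cmp ecx, edi / jge .loop_exit */
    iterations++;                          /* loop body */
    counter++;                             /* inc ecx */
    goto loop_top;                         /* jmp .loop_top (backward) */
loop_exit:
    return iterations;
}

/* do-while-style: the body runs first, then a conditional jump goes backward. */
int count_do_while(int limit)
{
    int counter = 0, iterations = 0;
loop_top:
    iterations++;                          /* loop body */
    counter++;
    if (counter < limit) goto loop_top;    /* cmp / jl .loop_top (backward) */
    return iterations;
}
```

With limit set to 0, count_while returns 0 and count_do_while returns 1. That one-iteration difference is what you are checking for when deciding which form to write in a reconstruction.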
test rax, rax / test rdi, rdi: This is how compilers check for zero or null pointer. test rax, rax ANDs rax with itself (result = rax); if zero, ZF=1. You will see this constantly in place of cmp rax, 0. Recognize it immediately as a null check or zero check.
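As a small illustration, a C function that begins with a null check is the kind of source that usually produces this pattern. The function below is a hypothetical example, and the instruction named in the comment is what compilers commonly emit rather than a guarantee.

```c
#include <stddef.h>

/* Returns the length of s, or 0 when s is NULL. */
size_t length_or_zero(const char *s)
{
    if (s == NULL)          /* commonly compiled as: test rdi, rdi / je ... */
        return 0;
    size_t n = 0;
    while (s[n] != '\0')    /* scan forward to the terminating zero byte */
        n++;
    return n;
}
```

When you meet test rdi, rdi followed by je at the top of a function, an early-return guard like the one above is a reasonable first hypothesis for the source.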
sete, setne, setl, etc.: These set a byte register to 0 or 1 based on a flag condition. sete al is equivalent to al = (ZF == 1). Common when the comparison result is stored in a variable rather than immediately branched on.
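Functions that return a comparison result as a value are the usual source of these instructions. The sketch below uses hypothetical function names. The setcc named in each comment is the instruction a compiler would typically choose for that comparison.

```c
/* The comparison result is a value, not a branch. */
int is_equal(int a, int b)                  /* typically cmp / sete */
{
    return a == b;
}

int is_less(int a, int b)                   /* typically cmp / setl (signed) */
{
    return a < b;
}

int is_below(unsigned a, unsigned b)        /* typically cmp / setb (unsigned) */
{
    return a < b;
}
```

The same signed versus unsigned reading applies here as for conditional jumps: setl points to signed operands, setb to unsigned ones.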
Part 3: Switch statements as jump tables (20 min)
A C switch statement with many cases is often compiled to a jump table: an array of addresses where each entry corresponds to one case value. The compiled pattern:
; switch (n) { case 0: ...; case 1: ...; case 2: ...; }
cmp rdi, 2 ; range check: is n > 2 (unsigned)?
ja .default ; if so, jump to default
lea rdx, [rip + .table] ; rdx = address of the table
movsxd rax, DWORD PTR [rdx + rdi*4] ; load the signed 32-bit offset for case n
add rax, rdx ; offset + table base = handler address
jmp rax ; jump to the case handler
.table:
.long case_0 - .table
.long case_1 - .table
.long case_2 - .table
In Ghidra, jump tables are usually recognized automatically and shown in the listing view as a computed jump with a reference to each possible target. The decompiler shows them as switch statements. In raw objdump output, the jmp rax is just an indirect jump and looks like a dynamic dispatch -- you need to find the table reference to understand all possible targets.
The presence of a jump table in a binary tells you: this function has a multiway branch with several cases (the exact threshold depends on the compiler and optimisation level), and the case values were dense enough that the compiler chose table lookup over a chain of comparisons.
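The range check plus indexed jump can be mimicked in C to make the mechanism explicit. This is a sketch under stated assumptions: the handler functions and their return values are hypothetical placeholders, and the table holds absolute function pointers rather than the table-relative 32-bit offsets a compiler would typically emit for position-independent code.

```c
/* Hypothetical case handlers; the return values are placeholders. */
static int case_0(void) { return 10; }
static int case_1(void) { return 20; }
static int case_2(void) { return 30; }
static int case_default(void) { return -1; }

/* What the source looks like. */
int dispatch_switch(unsigned long n)
{
    switch (n) {
    case 0: return case_0();
    case 1: return case_1();
    case 2: return case_2();
    default: return case_default();
    }
}

/* What the compiled pattern does: range check, then indexed jump. */
int dispatch_table(unsigned long n)
{
    static int (*const table[])(void) = { case_0, case_1, case_2 };
    if (n > 2)                 /* cmp rdi, 2 / ja .default */
        return case_default();
    return table[n]();         /* load entry n, transfer control to it */
}
```

The two functions return the same value for every n. The unsigned comparison matters: a single ja rejects both values above the last case and any value that would be negative if read as signed.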
Lab exercises (~1.5 hr)
Lab 5: Assembly-to-C reconstruction
See labs/lab-5-assembly-to-c.md for the full specification.
You are given a stripped binary containing a 50-instruction function with no source code. Using objdump -d and the control-flow patterns from Weeks 4-5, you reconstruct a plausible C source for the function. You label each pattern you identify (if/else, loop, comparison type) and explain your reasoning. Ghidra's decompiler is available as a cross-check; you produce your own reconstruction first, then compare.
CrackMe ladder
Solve at least one more CrackMe beyond your Week 4 attempt. Document in your Tool Journal: what the check function does (in control-flow terms), where the key comparison happens, and what the correct input is. You are now reading disassembly to find the check; that is the core RE-011 skill.
Independent practice (~3 hr)
- Yurichev RE4B: Read the "Arrays," "Structures," and "Working with strings" chapters. These come up in the Ghidra weeks.
- Tool Journal: Add a control-flow pattern reference. Four entries: if/else pattern, while loop pattern, do-while pattern, jump table indicator. For each: what the assembly looks like, what the C looks like, how you tell the difference.
- CrackMe ladder: Attempt a second CrackMe or continue with the Week 4 challenge. Document your progress regardless of whether you crack it.
Reflection prompts
- The jl instruction (jump-if-less) uses signed comparison, while jb uses unsigned comparison. If you see jb in a disassembly, what does that tell you about how the original C source treated the operands? Give an example where getting the signed/unsigned distinction wrong would cause you to misread the function's behavior.
- Loops in assembly are backward jumps. A disassembler does not know the difference between a loop and a goto. In C, goto is considered bad practice; in assembly, all backward jumps look the same. What does this mean for the reliability of the C reconstruction you produce in Lab 5?
- Compiler optimisation at -O2 often eliminates the frame pointer (rbp), using rsp-relative addressing instead. What is the consequence for a reverse engineer who is trying to identify local variables? What information did they have with the frame pointer that they no longer have without it?
Week 5 of 14. Next: Ghidra I -- project setup, the auto-analyser, navigation, and the decompiler view.