RE-011 Week 9: Dynamic Analysis

gdb, strace, ltrace, and valgrind. When static analysis hits a wall, you run the binary -- carefully, in a controlled environment, with a specific hypothesis to test.


Reading (~45 min)

Read the gdb documentation introduction (sourceware.org/gdb/documentation). Focus on: setting breakpoints, running the program, examining registers and memory, stepping through instructions. You do not need to read the full manual -- the "Getting Started" and "Breakpoints" sections are sufficient.

From Yurichev RE4B: read the "GDB" chapter. Yurichev's examples show what gdb output looks like for the same binaries he analyzed statically in earlier chapters. This before/after structure is valuable: you see what static analysis missed and what dynamic analysis revealed.


Lecture outline (~1.5 hr)

Part 1: When to use dynamic analysis (15 min)

Static analysis fails or becomes very slow in three situations:

  1. Self-modifying code or runtime decryption: The binary decrypts its own code at startup. The static disassembly shows encrypted bytes that are meaningless. You have to observe the decrypted code after it runs. (This is common in packed malware; RE-011 will see simpler versions in CrackMe challenges.)

  2. Complex algorithmic logic: A hash function, a serial number generator, a checksum algorithm. It is theoretically possible to reverse-engineer any algorithm from its assembly, but sometimes it is faster to put the function under gdb, feed it test inputs, and observe the output to understand the transformation.

  3. Unknown data dependencies: A function's behavior depends on external state (a file, a registry key, an environment variable, a network response) that you cannot observe statically.

In all three cases, dynamic analysis is a tool you reach for when static analysis has given you all it can. You form a specific hypothesis from static analysis, then run the binary to test it. You do not just "run the binary and see what happens" -- that is not analysis.
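The hypothesis-then-test loop can be sketched in a few lines of Python. This is only a model: mystery() is a hypothetical stand-in for a function inside the target binary (in practice you would call it under gdb), and hypothesis() is a guess you might form from a quick static read. None of these names come from a course binary.

```python
def mystery(data: bytes) -> int:
    """Hypothetical stand-in for the function inside the target binary."""
    acc = 0
    for b in data:
        acc = (acc * 31 + b) & 0xFFFF
    return acc


def hypothesis(data: bytes) -> int:
    """Guess from static analysis: a plain byte sum, truncated to 16 bits."""
    return sum(data) & 0xFFFF


def refuting_probes(probes):
    """Return the probes on which observed behavior contradicts the guess."""
    return [p for p in probes if mystery(p) != hypothesis(p)]


# Probes chosen to test one property each: empty input, a single byte,
# and two inputs that differ only in byte order.
probes = [b"", b"A", b"AB", b"BA"]
print(refuting_probes(probes))
```

The single-byte probe agrees with the guess; only the two-byte probes refute it, and they also show that byte order matters. Probes are chosen to test a specific property, not at random.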

Safety rule: In RE-011, dynamic analysis is done against:

  • CrackMe binaries from crackmes.one (designed to be safely run)
  • Course-provided lab binaries
  • Your own compiled code

Never run an unknown binary from an untrusted source on a machine you care about. For malware analysis (ADV-101), you use isolated VMs with no network access. That is not RE-011's scope.

Part 2: gdb fundamentals (35 min)

gdb (GNU Debugger) lets you run a binary under controlled conditions: set breakpoints, step through instructions, inspect registers and memory.

Basic workflow:

gdb ./binary          # launch
(gdb) break main      # set breakpoint at function 'main'
(gdb) run             # run the program; stops at breakpoint
(gdb) info registers  # show all registers
(gdb) x/10xw $rsp     # examine 10 four-byte words (hex) at the stack pointer
(gdb) next            # execute one source line, stepping over calls
(gdb) nexti           # execute one instruction, stepping over calls
(gdb) step            # execute one source line, stepping into calls
(gdb) stepi           # execute one instruction, stepping into calls
(gdb) continue        # continue running until next breakpoint
(gdb) disassemble main # show disassembly of main

Setting breakpoints:

(gdb) break *0x401136     # breakpoint at specific address
(gdb) break strcmp        # breakpoint at libc strcmp
(gdb) info breakpoints    # list all breakpoints
(gdb) delete 1            # delete breakpoint #1
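The strcmp breakpoint above can also be automated with gdb's Python API. The following is a minimal sketch, not a course-provided script: it assumes an x86-64 Linux target (System V calling convention, so the first two arguments are in rdi and rsi) and a binary that really calls strcmp rather than an inlined or custom comparison. The names StrcmpLogger and format_call are made up for illustration. The import is guarded so the file also loads in a normal Python interpreter, where the gdb module does not exist.

```python
try:
    import gdb  # provided by gdb itself; not importable in a normal Python
except ImportError:
    gdb = None


def format_call(lhs: str, rhs: str) -> str:
    """Render the two arguments in an ltrace-like style."""
    return f'strcmp("{lhs}", "{rhs}")'


if gdb is not None:

    class StrcmpLogger(gdb.Breakpoint):
        """Breakpoint that logs strcmp's arguments and keeps running."""

        def stop(self):
            # Assumption: x86-64 System V, so arg 1 is in rdi and arg 2 in rsi.
            lhs = gdb.parse_and_eval("(char *)$rdi").string(errors="replace")
            rhs = gdb.parse_and_eval("(char *)$rsi").string(errors="replace")
            print(format_call(lhs, rhs))
            return False  # returning False tells gdb not to halt here

    StrcmpLogger("strcmp")
```

To try it, save the sketch to a file (any name ending in .py), load it inside gdb with the source command, then run the program. Each strcmp call prints one line and execution continues.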

Examining memory:

(gdb) x/s 0x402008      # examine as string (x=examine, /s=string format)
(gdb) x/16xb $rdi       # examine 16 bytes at rdi as hex bytes
(gdb) x/4gx $rsp        # examine 4 giant-words (8 bytes each) at rsp
(gdb) print $rax        # print rax value
(gdb) print (char*)$rdi # print rdi as a C string
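The size letters in the x command (b, w, g) change how the same bytes are grouped, and on x86-64 multi-byte units are read little-endian. The sketch below models that grouping in plain Python so you can predict what gdb will print. The 16 bytes in raw are made-up sample data, and the helper names are illustrative.

```python
import struct

# 16 hypothetical bytes as they sit in memory, lowest address first.
raw = bytes(range(0x10, 0x20))


def as_bytes(buf: bytes) -> list:
    """Model of x/Nxb: one hex value per byte, in memory order."""
    return [f"0x{b:02x}" for b in buf]


def as_words(buf: bytes) -> list:
    """Model of x/Nxw: 4-byte units, each read little-endian."""
    return [f"0x{w:08x}" for w in struct.unpack(f"<{len(buf) // 4}I", buf)]


def as_giants(buf: bytes) -> list:
    """Model of x/Ngx: 8-byte units, each read little-endian."""
    return [f"0x{g:016x}" for g in struct.unpack(f"<{len(buf) // 8}Q", buf)]


print(as_bytes(raw)[:4])   # first four bytes, in memory order
print(as_words(raw))       # four 4-byte words
print(as_giants(raw))      # two 8-byte giant words
```

The first word prints as 0x13121110 even though the byte at the lowest address is 0x10. Keep this in mind when a string or key appears "reversed" in a word-sized dump.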

gdb enhancement plugins (optional but recommended):

  • pwndbg: Adds a persistent register/stack/code display on every stop. Most widely used in CTF work. Install: follow the instructions at pwndbg.re (cloning the repository and running its setup.sh script is the long-standing method).
  • peda: Similar persistent display. Older but still common.
  • gef: Another option. Distributed as a single Python script with few dependencies, which makes it easy to install on a fresh machine.

These plugins do not replace gdb's built-in commands; they add a persistent context display (plus some extra commands of their own) that makes the default gdb experience much less painful.

Part 3: strace and ltrace (20 min)

strace traces every system call the binary makes. A system call is a request to the kernel: open a file, read bytes, write bytes, create a socket, etc. strace intercepts these at the kernel boundary.

strace ./binary           # trace all syscalls
strace -e read,write ./binary  # trace only read/write
strace -o trace.txt ./binary   # write output to file

Reading strace output:

openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
read(3, "root:x:0:0:", 4096) = 1895
close(3) = 0

This tells you the binary opened /etc/passwd, read 1895 bytes of it, and closed it -- without your having to read a single line of assembly. strace is often the fastest way to understand what a binary is doing at the OS level.
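For longer traces, such as a file written with strace -o, it can help to filter the output with a short script. The sketch below is a rough parser for lines in the format shown above, not a complete strace grammar: it ignores unfinished or resumed calls and signal lines, and the function names are illustrative.

```python
import re

# name(args) = return-value, where the value may be decimal, hex, or "?".
LINE = re.compile(r'^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>0x[0-9a-f]+|-?\d+|\?)')


def parse(line: str):
    """Split one strace line into (syscall, args, return value), or None."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    return m.group("name"), m.group("args"), m.group("ret")


def opened_paths(lines):
    """Paths passed to open/openat calls that did not return an error."""
    paths = []
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        name, args, ret = parsed
        if name in ("open", "openat") and not ret.startswith("-"):
            quoted = re.search(r'"([^"]*)"', args)
            if quoted:
                paths.append(quoted.group(1))
    return paths


sample = [
    'openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3',
    'read(3, "root:x:0:0:", 4096) = 1895',
    'close(3) = 0',
]
print(opened_paths(sample))
```

Failed calls (a return value of -1 followed by an errno name) are skipped by opened_paths, but they are often worth a separate look: a binary probing for a file that does not exist is a data dependency you may need to satisfy.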

ltrace traces library function calls (as opposed to system calls). It intercepts calls to shared libraries:

ltrace ./binary

Sample output:

strcmp("letmein", "hunter2") = 1
strcmp("letmein", "s3cr3t")  = -1

The ltrace output from a CrackMe often directly shows you the string being compared -- revealing the key in plaintext. This is why CrackMe anti-RE techniques (Week 10) often include obfuscating string comparisons.

Note: this approach only works when (a) the binary calls a libc string-comparison function (strcmp, strncmp, memcmp) and (b) the expected key is stored in memory in plaintext at the time of the call. Lab 7's obfuscated target violates one of these two conditions -- determining which one is part of the exercise.
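To make the two conditions concrete, here is a small Python model of a check that defeats the ltrace shortcut. It is a generic illustration and says nothing about how Lab 7's target works. The single-byte XOR key 0x5A, the encoded bytes, and the function name check are all made up. The check never calls a library comparison function, and the key is stored only in encoded form.

```python
XOR_KEY = 0x5A  # hypothetical single-byte key

# The stored constant: "s3cr3t" with every byte XORed with XOR_KEY.
# A strings pass over the binary would find these bytes, not the key.
ENCODED = bytes.fromhex("29693928692e")


def check(user_input: bytes) -> bool:
    """Custom comparison: encode the input, then compare byte by byte."""
    if len(user_input) != len(ENCODED):
        return False
    diff = 0
    for got, want in zip(user_input, ENCODED):
        diff |= (got ^ XOR_KEY) ^ want  # stays 0 only if every byte matches
    return diff == 0


print(ENCODED)            # what sits in the data section
print(check(b"letmein"))  # wrong length, rejected early
print(check(b"s3cr3t"))   # accepted
```

Under gdb, a check like this is still recoverable: break inside the loop and watch the two operands, or apply the same XOR to the stored bytes yourself.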

Part 4: valgrind (10 min)

valgrind is a memory analysis tool: it instruments the binary to track every memory allocation and access, reporting heap overflows, use-after-free, memory leaks, and uninitialized reads.

valgrind ./binary
valgrind --leak-check=full ./binary

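In the RE-011 context, valgrind is useful for understanding memory behavior: is this buffer heap-allocated? How large is it? Is there an obvious heap overflow in the check function? valgrind reports these without your having to instrument the source. (Its default tool, Memcheck, tracks heap blocks closely but generally does not catch overruns of stack or global arrays.)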

Note: valgrind significantly slows execution (10-50x). It is a diagnostic tool, not a performance tool.


Lab exercises (~1.5 hr)

Lab 7: Dynamic vs. static

See labs/lab-7-dynamic-vs-static.md for the full specification.

You are given a lightly-packed or obfuscated CrackMe where static analysis alone cannot determine the key. You first perform static analysis (Ghidra + objdump), document what you can determine statically, then apply dynamic analysis (gdb, strace, ltrace) to resolve what static analysis could not. You write a structured comparison: what each approach revealed, what each approach missed, and why the static-first posture still holds (static analysis first gave you the hypothesis; dynamic analysis confirmed it).

CrackMe ladder

Attempt a CrackMe using strace or gdb as part of your approach. Document in your Tool Journal: what hypothesis you formed statically, what command you used dynamically, and what the dynamic output told you that static analysis could not.


Independent practice (~3 hr)

  • Tool Journal: Document strace, ltrace, and the gdb commands from this week. For strace: what system call categories you can filter for and why. For ltrace: why it is a shortcut and why it does not always work (dynamically-linked functions only; static binaries will show nothing).
  • gdb breakpoint exercise: Open a binary you have already analyzed statically in gdb. Set a breakpoint at the check function you identified. Run the binary, reach the breakpoint, and inspect rdi and rsi at the breakpoint. What are they? Do they match what you expected from static analysis?
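  • strace survey: Run strace /bin/ls 2>&1 | head -30 and read the first 30 syscalls. What is the first file the ls binary opens? Then run strace ls -l 2>&1 | grep -E 'passwd|nsswitch'. Why does a program that lists a directory need to open /etc/passwd or /etc/nsswitch.conf? (A long listing often does, depending on the system's name-service configuration -- look for it.)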

Reflection prompts

  1. ltrace output from a CrackMe shows: strcmp("user_input", "s3cr3t") = 1. This reveals the key directly. A developer who wants to prevent this would avoid strcmp and implement their own string comparison. How would you detect a custom string comparison in a disassembly? What patterns would you look for?

  2. Static analysis is a hypothesis generator; dynamic analysis is a hypothesis tester. Give a concrete example from your CrackMe work where static analysis gave you a wrong hypothesis, and dynamic analysis corrected it. If you have not yet had a case like this, describe one that could plausibly occur.

  3. Running an unknown binary from an untrusted source is dangerous. But professional malware analysts do it every day. What infrastructure makes this safe for a professional that you do not have in RE-011? What additional steps would you take before running a truly unknown binary?


Week 9 of 14. Next: Anti-RE tricks -- packing, obfuscation, anti-debug, and how to recognize each one.