RE-011 Week 1: What Reverse Engineering Is

Scope, analytical posture, and legal framing. What RE means, what it does not mean, and why the read-first habit matters from day one.

Reading (~30 min)

Read the Wikipedia article on reverse engineering (the software section), then read the EFF's summary of DMCA Section 1201 security research exemptions (search "EFF DMCA 1201 security research"). The goal is a working mental model of what the activity is and where its legal boundaries sit.

Then browse the crackmes.one "Getting started" FAQ to understand what the platform is: a community of people who intentionally design small binaries for other people to analyze. These are practice targets by design.

Lecture outline (~1.5 hr)

Part 1: What reverse engineering is (20 min)

Reverse engineering is the process of understanding how something works by examining it rather than by reading documentation or source code. In the software context: you have a compiled binary. You do not have the source. You want to understand what the binary does.

This is a normal, necessary activity:

Security researchers analyze malware to write signatures and understand attacker behavior. The malware author did not supply source code.
Vulnerability researchers look for bugs in commercial products. The vendor did not supply source code.
Interoperability engineers figure out how a proprietary protocol works so they can build a compatible client. The protocol author did not publish a spec.
Students and practitioners learn how compilers work, how calling conventions work, how memory is managed -- by reading the output of a compiler directly.

RE is not hacking in the pejorative sense. It is reading. The skill is the ability to look at a sequence of bytes and understand what computation they encode.

Part 2: The analytical posture (25 min)

The most important thing to establish in Week 1 is posture. The posture this course teaches is: read first, run second, never guess.

Read first means static analysis is the default mode. You look at the binary with tools that do not execute it: file, xxd, strings, readelf, objdump, Ghidra. You build a picture of what the binary is before you consider running it.
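To make "tools that do not execute it" concrete, here is a minimal sketch of what a strings-style tool does: it treats the binary as plain bytes and pulls out runs of printable characters. The function name printable_strings and the sample bytes are illustrative assumptions, not part of the course tooling, and the real strings utility has more options (encodings, offsets, section selection).

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long.

    The input is only ever read as data; nothing here loads or executes it.
    """
    # 0x20-0x7e is the printable ASCII range; tab is accepted as well.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical bytes standing in for the start of a small ELF binary.
sample = (
    b"\x7fELF\x02\x01\x01\x00" + b"\x00" * 8
    + b"/lib64/ld-linux-x86-64.so.2\x00\x01\x02"
    + b"Enter password: \x00ab\x00"
)

for text in printable_strings(sample):
    print(repr(text))
```

On the sample above this reports the loader path and the prompt string, and skips "ELF" and "ab" because they are shorter than the four-byte minimum. For a real file you would pass in the bytes obtained by opening it in binary mode ("rb"), which reads the file without running it.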

Run second means dynamic analysis is a tool you reach for when static analysis hits a wall -- when an algorithm is too complex to follow by eye, when data is encrypted and must be observed at runtime, when a function's behavior depends on external state. Running is not the first move.

Never guess means you form hypotheses and test them with evidence. "I think this function validates user input" is a hypothesis. You look for the evidence: what are its arguments, what does it compare, what does it return, what calls it? You do not commit the hypothesis to your notes until you have evidence.

This posture is what separates a practitioner from a script kiddie. A practitioner understands. A script kiddie runs tools and hopes.

The read-first habit is especially important for unknown binaries. Running code you do not understand, from a source you do not trust, on a machine you care about, is how you get infected. The discipline of reading before running is practical safety, not caution theater.

Part 3: Legal framing (20 min)

Reverse engineering sits in a legal gray area in many jurisdictions. In the United States, two bodies of law matter most:

DMCA Section 1201 (Digital Millennium Copyright Act): Section 1201 generally prohibits circumventing technological protection measures that control access to copyrighted works. A temporary security research exemption, granted through the Copyright Office's triennial rulemaking and renewed in 2021 and again in 2024, permits circumvention for good-faith security research on lawfully acquired devices. The exemption is not unlimited: it covers research conducted in a controlled setting, on lawfully obtained devices, with findings used to advance security (not to enable unauthorized access). The EFF summary is the readable version; the Federal Register text is the authoritative version.

CFAA (Computer Fraud and Abuse Act): The CFAA prohibits accessing computer systems without authorization. For RE specifically: analyzing a binary you lawfully possess on a machine you own does not implicate the CFAA. Analyzing a binary to find a bug and then testing that bug against a live production system you do not own does implicate the CFAA. The line is: your machine, your binary, your analysis. RE-011 stays on the right side of this line. Everything in this course is done against intentionally designed training targets (CrackMe binaries) or against locally installed binaries on hardware you control.

The practical rule: In RE-011 you analyze binaries you have been assigned or binaries you have legally obtained (crackmes.one targets, your own compiled code, the course lab binaries). You do not apply RE techniques to systems or binaries that you neither own nor have explicit authorization to test. That authorization requirement is not bureaucratic -- it is the legal and ethical line between security research and unauthorized access.

Part 4: What this course covers -- and what it does not (15 min)

RE-011 covers:

File format identification (ELF, PE, Mach-O basics)
x86-64 assembly reading (not writing; reading)
Static analysis with Ghidra and radare2
Limited dynamic analysis (gdb, strace, ltrace) for cases where static analysis hits a wall
Binary patching at the smallest-necessary scale
Firmware analysis structure (extracting and identifying components from a firmware image)

RE-011 does not cover:

Exploit development (RE-101, ADV-101)
Shellcode or ROP (RE-101, ADV-102)
Malware analysis pipelines (ADV-101)
Firmware extraction from live hardware via JTAG or serial (RE-201)
Kernel-mode analysis (RE-201)

The scope is narrow by design. RE-011 is the scaffolding course. It gives you the vocabulary and tooling fluency that RE-101 assumes you already have.

Lab exercises (~1.5 hr)

Lab 1: File identification

See labs/lab-1-file-identification.md for the full specification.

You are given 10 mystery files with generic names (file-01 through file-10). No extensions. Your task is to identify each one using only file, xxd, and strings -- no opening in an application, no running. For each file, record what it is, how you know (magic bytes, strings, structural evidence), and what the next analysis step would be if you were actually investigating it.

This lab establishes the read-before-run habit on day one. Every file in this course gets identified before it gets run.
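The idea behind magic-byte identification can be sketched in a few lines. This is a simplified illustration, not how the file command is implemented: the signature table below is a small hand-picked assumption, while libmagic consults a much larger rule database. The names SIGNATURES, identify, and identify_file are hypothetical. For the lab itself you still use file, xxd, and strings.

```python
# Hypothetical, deliberately tiny signature table: (offset, magic bytes, label).
SIGNATURES = [
    (0, b"\x7fELF", "ELF executable or object"),
    (0, b"MZ", "DOS/PE executable"),
    (0, b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (little-endian)"),
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"%PDF-", "PDF document"),
    (0, b"PK\x03\x04", "ZIP archive (also JAR, DOCX, APK)"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (257, b"ustar", "tar archive"),
]


def identify(data: bytes) -> str:
    """Return the label of the first signature that matches, else 'unknown'."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first 512 bytes of the file; the file is never executed."""
    with open(path, "rb") as handle:
        return identify(handle.read(512))


print(identify(b"\x7fELF\x02\x01\x01\x00"))  # ELF executable or object
print(identify(b"just some text"))           # unknown
```

Note that most signatures sit at offset 0, but not all: the tar entry is matched at offset 257. That is one reason a glance at the first few bytes in xxd sometimes is not enough and you need to look further into the file.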

Independent practice (~3 hr)

Tool Journal entry: Write your first Tool Journal entry. Document the file command: what it does, what it uses internally (libmagic and magic bytes), one example from Lab 1 where it gave you the answer immediately, one example where you had to look at xxd output to confirm. The Tool Journal is a running document you maintain throughout the course; keep it in a plain text or Markdown file.
Reading: Read the Wikipedia article on the ELF format (the overview section, not the detailed field tables -- those come in Week 3). Build a rough mental model: what does a compiled binary contain? The detailed structure comes later; for now you want the concept.
crackmes.one: Create a free account at crackmes.one. Browse the "Easy" filter. You are not solving anything yet -- you are familiarizing yourself with how the platform works, what difficulty ratings mean, and what languages the challenges are typically compiled from. Download one "Easy" challenge and run file and strings on it. Do not run the binary. Write what you observe.
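As a companion to the ELF reading, the following sketch decodes the first 20 bytes of an ELF header: the identification bytes, then the file type and target machine. It is a rough illustration under stated assumptions: the lookup tables cover only a few common values, the function name describe_elf_header is hypothetical, and the sample header is hand-built rather than taken from a real challenge. Compare what it reports against the output of file and readelf -h on the binary you downloaded.

```python
import struct

# Partial lookup tables; real ELF files can use many more values than these.
ELF_CLASS = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "little-endian", 2: "big-endian"}
ELF_TYPE = {1: "relocatable", 2: "executable", 3: "shared object / PIE", 4: "core dump"}
ELF_MACHINE = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf_header(data: bytes) -> dict[str, str]:
    """Decode class, byte order, type, and machine from the start of an ELF file."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: missing \\x7fELF magic or too short")
    ei_class, ei_data = data[4], data[5]
    if ei_data not in ELF_DATA:
        raise ValueError(f"unrecognized byte-order field: {ei_data}")
    # The byte-order field tells us how to read every multi-byte field after it.
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields starting at offset 16.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", data, 16)
    return {
        "class": ELF_CLASS.get(ei_class, f"unknown ({ei_class})"),
        "data": ELF_DATA[ei_data],
        "type": ELF_TYPE.get(e_type, f"unknown ({e_type})"),
        "machine": ELF_MACHINE.get(e_machine, f"unknown ({e_machine})"),
    }


# Hand-built 20-byte header: 64-bit, little-endian, type 3, machine 62.
sample_header = b"\x7fELF" + bytes([2, 1, 1, 0]) + b"\x00" * 8 + struct.pack("<HH", 3, 62)

for field, value in describe_elf_header(sample_header).items():
    print(f"{field}: {value}")
```

The point of the exercise is the mental model, not the parser: a compiled binary starts with a small, fixed-layout header that says how to interpret everything after it.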

Reflection prompts

"Reverse engineering is just reading." Do you agree or disagree with this framing? What does it leave out? What does it emphasize?
The legal exemptions for security research under DMCA Section 1201 require that the research be conducted in a "controlled environment" and that the researcher's purpose be advancing security. Who decides whether a given research project meets these criteria? What ambiguity does this create for a working security researcher?
The posture this course teaches is "read first, run second, never guess." Think of a context outside software security where this same posture would be valuable. What does the analogy reveal about why the posture matters?

Week 1 of 14. Next: Byte-level view -- hex editors, magic numbers, endianness.