RE-011 Week 6: Ghidra I · RE-011 · Virtus Cyber Academy Classroom

Project setup, the auto-analyser, the symbol tree, the listing view, the decompiler. Navigation discipline and naming conventions that make a project readable months later.

Reading (~30 min)

Read the introductory course material that ships with Ghidra (available in the Ghidra installation at docs/GhidraClass/Beginner/; the document is titled "Introduction to Ghidra Student Guide"). Read through the "Ghidra Overview" and "Working with Programs" sections (approximately the first 40 pages). This is the official reference; the lecture builds on it.

If you prefer a community resource: the OpenSecurityTraining2 "Ghidra 101" module covers the same material in video form and is free at ost2.fyi.

Lecture outline (~1.5 hr)

Part 1: What Ghidra is and why it matters (10 min)

Ghidra is an open-source reverse engineering framework released by the NSA in 2019. It is widely regarded as the most capable freely available disassembler and decompiler. Before Ghidra, the de facto standard was IDA Pro, which is proprietary and expensive. RE-011 uses Ghidra as its primary static analysis tool; radare2 is the Week 8 alternative.

Ghidra's key capabilities:

Disassembler: converts binary code to assembly listing (like objdump, but interactive and persistent)
Decompiler: converts assembly to pseudo-C (heuristic, not perfect, but often close enough to follow logic quickly)
Symbol tree: navigates functions, imports, exports, strings, data
Cross-references: shows every call site of a function, every reference to a data item
Scripting: Java and Python APIs for automating analysis (Python runs through the bundled Jython interpreter or, in recent releases, PyGhidra)
Collaboration: multi-user analysis projects (outside RE-011 scope)

System requirement: 8 GB RAM minimum; 16 GB recommended for large binaries. Ghidra is a Java application. Install the JDK before Ghidra (see SETUP.md).

Part 2: Project setup and importing a binary (20 min)

First run:

Launch Ghidra (ghidraRun or the desktop shortcut).
Create a new project: File > New Project. Choose "Non-Shared Project." Name it after the binary or the analysis session.
Import a binary: File > Import File. Ghidra detects the format (ELF, PE, Mach-O) and suggests the processor. For an x86-64 ELF, it should detect x86:LE:64:default:gcc. Accept.
Open the binary in the CodeBrowser (double-click the imported file).
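The format detection in the import step can be approximated by reading the first few bytes of the file. The following is a minimal sketch of that idea, not Ghidra's actual loader logic: the function name describe_header and the small machine table are assumptions made for illustration, and only a handful of architectures are mapped.

```python
import struct

# Magic numbers for the three container formats named above.
ELF_MAGIC = b"\x7fELF"
PE_MAGIC = b"MZ"
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # 32-bit, either byte order
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # 64-bit, either byte order
}

# A few ELF e_machine values; this table is deliberately incomplete.
ELF_MACHINES = {0x03: "x86", 0x28: "ARM", 0x3E: "x86-64", 0xB7: "AArch64"}


def describe_header(data):
    """Return a short description of the container format found in data."""
    if data[:4] == ELF_MAGIC:
        if len(data) < 20:
            return "ELF (truncated header)"
        bits = {1: 32, 2: 64}.get(data[4], "?")        # EI_CLASS
        endian = {1: "LE", 2: "BE"}.get(data[5], "?")  # EI_DATA
        fmt = "<H" if data[5] == 1 else ">H"
        (machine,) = struct.unpack_from(fmt, data, 18)  # e_machine field
        name = ELF_MACHINES.get(machine, "machine 0x%x" % machine)
        return "ELF %s %s %s" % (name, endian, bits)
    if data[:4] in MACHO_MAGICS:
        return "Mach-O"
    if data[:2] == PE_MAGIC:
        return "PE (MZ stub)"
    return "unknown"
```

To try it on a real file, read the first 64 bytes in binary mode and pass them to describe_header. If the result disagrees with what Ghidra suggests in the import dialog, treat that as a reason to look at the file more closely before accepting the defaults.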

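Auto-analyser: The first time you open a program, Ghidra reports that it has not been analyzed and asks whether to analyze it now. Accept (click "Yes," then "Analyze" in the Analysis Options dialog, leaving the default analysers selected). The analyser runs for seconds to minutes depending on binary size. It: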

Identifies function boundaries
Disassembles code
Identifies data and strings
Runs decompiler-assisted passes (for example, switch-table and parameter recovery)
Applies known library function signatures from Function ID (FID) databases, if available (FLIRT is the equivalent feature in IDA Pro, not Ghidra)

Never skip the auto-analyser in RE-011. If it misidentifies something, you can correct it; starting without it means you do the analyser's work manually.
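To get a feel for what one analyser pass does, the string-identification step can be imitated in a few lines. This is a simplified sketch in the spirit of the Unix strings utility, not Ghidra's own string analyser, which is considerably more selective. The function name find_ascii_strings and the default minimum length of 5 are assumptions.

```python
import re


def find_ascii_strings(data, min_len=5):
    """Yield (offset, text) for each run of printable ASCII bytes
    that is at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde, the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")
```

Running this over a binary's raw bytes and comparing the result with Ghidra's Defined Strings window shows how much noise a naive scan produces and why corrections to the analyser's output are sometimes needed.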

Part 3: The Ghidra UI -- four key panels (25 min)

Program Tree (top left): Shows the binary's sections (.text, .data, .rodata, etc.). Double-click a section to navigate to it.

Symbol Tree (bottom left): The most useful navigation panel. Contains:

Imports: library functions the binary calls (from .dynsym). Browsing imports gives you a quick behavioral profile of the binary.
Exports: functions the binary exports (relevant for libraries).
Functions: every function Ghidra identified, named with original symbols if unstripped, FUN_xxxxxxxx if stripped.
Labels: jump targets, data labels, string references.
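The idea of reading imports as a behavioral profile can be sketched as a simple lookup. The category table below is hypothetical and far from complete; it is not a Ghidra feature, and the name profile_imports is invented for this example. The input is assumed to be a list of import names copied out of Symbol Tree > Imports.

```python
# Hypothetical, illustrative grouping of common libc imports.
CATEGORIES = {
    "network": {"socket", "connect", "bind", "listen", "accept",
                "send", "recv", "getaddrinfo"},
    "file": {"fopen", "open", "read", "write", "fclose", "unlink", "stat"},
    "process": {"fork", "execve", "system", "popen", "kill", "ptrace"},
    "string handling": {"strcmp", "strncmp", "memcmp", "strlen"},
}


def profile_imports(names):
    """Group import names by category; names not in the table are ignored."""
    profile = {}
    for name in names:
        for category, members in CATEGORIES.items():
            if name in members:
                profile.setdefault(category, []).append(name)
    return profile
```

A binary whose profile shows strcmp and little else is probably a simple checker; one that also imports socket and connect deserves a closer look at its network behavior. The profile is a starting hypothesis, not a conclusion.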

Listing View (center): The disassembly. Shows address, bytes, mnemonic, operands, and optionally cross-reference comments. Click any instruction to select it. G or Ctrl-G to navigate to a specific address.

Decompiler View (right): Shows the pseudo-C for the currently selected function. Updates as you navigate. Double-click a function call in the decompiler to jump to that function. The decompiler is your fastest way to understand what a function does; the listing view is your ground truth.

Navigation: Three ways to navigate to a function:

Symbol Tree > Functions > double-click the function name.
In the decompiler, double-click a called function's name.
G in the listing to jump to a specific address.

Part 4: Naming discipline (15 min)

When you open a stripped binary, every function is FUN_xxxxxxxx. Your job is to rename them as you understand them. This is not optional -- it is how you make a multi-hour analysis session accumulate value instead of dissipating it.

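Renaming in Ghidra: right-click a function name in the decompiler and select "Rename Function" (or "Edit Function Signature" to change the return type and parameters as well), or right-click the function in the symbol tree and select "Rename." Or press L when a function name is selected.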

Naming conventions that work:

Be specific, not vague. check_password is better than verification. parse_json_field is better than string_handler.
Include confidence. maybe_decrypt means you are not sure. definitely_encrypt is a stronger claim. Use prefixes like maybe_ and looks_like_ when uncertain.
Preserve hierarchy. If a function is a helper for another, give it a _helper suffix or build its name from the parent's name: validate_input_length is clearly a helper for validate_input.
Never delete. If you realize a name is wrong, rename it; do not just remove it. WRONG_validate_input is more useful than FUN_00401230 because it records that you have already looked at the function.

Every function you visit in this course should leave with a name other than FUN_xxxxxxxx. If you cannot name it, name it analyzed_unknown_FUN_xxxxxxxx and move on.
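The naming rules above are mechanical enough to express as small helpers. The sketch below works on plain strings and runs outside Ghidra; all function names are invented for illustration, and the pattern assumes default names have the form FUN_ followed by 8 to 16 hexadecimal digits.

```python
import re

# Assumed shape of Ghidra's default function names: FUN_ plus a hex address.
DEFAULT_NAME = re.compile(r"^FUN_[0-9a-fA-F]{8,16}$")


def is_default_name(name):
    """True if the name still looks like an untouched default name."""
    return DEFAULT_NAME.match(name) is not None


def mark_analyzed_unknown(name):
    """Apply the 'visited but not understood' rule to a default name."""
    if is_default_name(name):
        return "analyzed_unknown_" + name
    return name


def mark_wrong(name):
    """Apply the 'never delete' rule: flag a bad name instead of removing it."""
    if name.startswith("WRONG_"):
        return name
    return "WRONG_" + name


def still_unnamed(names):
    """Return the names that have not been touched yet."""
    return [n for n in names if is_default_name(n)]
```

Pasting an exported function list into still_unnamed gives a quick measure of how much of a binary remains unvisited at the end of a session.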

Lab exercises (~1.5 hr)

Lab 4: Ghidra navigation

See labs/lab-4-ghidra-navigation.md for the full specification.

You import a provided binary (approximately 200 lines of C compiled and stripped) into Ghidra, run the auto-analyser, navigate to main using all three navigation methods, and rename every non-trivial function (anything that does visible work, not just wrapper stubs) using the naming discipline from this week. You submit a screenshot of your symbol tree and a written explanation of what the binary does based on function names and the decompiler view.

CrackMe ladder

Attempt a CrackMe using Ghidra as the primary tool this week. Document in your Tool Journal: which Ghidra view gave you the key insight, whether the decompiler matched what you read in the listing, and how long navigation took before you oriented yourself.

Independent practice (~3 hr)

Tool Journal: Document Ghidra. Four entries: how to create a project and import a binary, the four UI panels and what each shows, three navigation keyboard shortcuts, and your personal naming convention rules.
Apply to Week 5 CrackMe: Open your Week 5 CrackMe in Ghidra and rename all functions. Compare the decompiler's output to your manual reconstruction from Lab 5. Where did the decompiler help? Where did it mislead?
Explore imports: For any binary you have imported, open Symbol Tree > Imports. Read what library functions it calls. What does the import list tell you about the binary's purpose?

Reflection prompts

Ghidra's decompiler produces pseudo-C, not actual compilable C. What does "pseudo-C" mean in practice? Give two examples of things the decompiler shows that would not appear in real C source code, and explain why the decompiler cannot recover them.
The naming discipline rule says "never delete; if wrong, rename." Why is a wrong name preferable to no name for a function you have already analyzed?
Ghidra's auto-analyser sometimes misidentifies data as code and tries to disassemble it, producing garbage instructions. How would you recognize this situation in the listing view? What would you do to correct it?

Week 6 of 14. Next: Ghidra II -- cross-references, data-type inference, struct recovery, and the decompiler as a conversation.