~60 minutes. Use xxd to look inside a PNG and a ZIP file. Identify the magic bytes that mark each format.
Goal: run xxd (or hexdump -C) on a real file; read the output; identify the file-format magic bytes at the start.
Estimated time: 60 minutes
Prerequisites: Week 1 lecture (hex notation); Week 2 reading is helpful but not required
Setup
mkdir -p ~/fnd-101/lab-1-2
cd ~/fnd-101/lab-1-2
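Verify xxd is available: xxd --version. If it is not installed, use hexdump -C instead. It shows the same bytes in a slightly different layout: one byte per column, with the ASCII column between | characters.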
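If you are not sure which tool your system has, a small wrapper can pick one for you. This is only a sketch: dump_hex is a made-up helper name, not a standard command, and the od fallback is an assumption for systems that have neither xxd nor hexdump.

```shell
# dump_hex FILE: hex-dump FILE with whichever tool is installed.
# (dump_hex is a hypothetical helper name used only in this lab.)
dump_hex() {
    if command -v xxd >/dev/null 2>&1; then
        xxd "$1"
    elif command -v hexdump >/dev/null 2>&1; then
        hexdump -C "$1"
    else
        # POSIX od: hex offsets, one byte per column, no ASCII column
        od -A x -t x1 "$1"
    fi
}
```

Usage: dump_hex test.png | head -4. The layout differs from tool to tool, but the byte values are the same.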
Part A: Download test files
You need one small PNG and one small ZIP. Use files you already have on your system, or download a PNG and create a ZIP as follows:
# A small PNG: any image file will do; use one from your system
# Or download a small sample:
curl -o test.png "https://www.gstatic.com/webp/gallery/1.png"
# Create a small ZIP containing a text file:
echo "hello lab 1.2" > hello.txt
zip test.zip hello.txt
Part B: Hex dump the PNG
xxd test.png | head -4
You should see output like:
00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .PNG........IHDR
00000010: 0000 0100 0000 0100 0802 0000 0090 9000  ................
In your worksheet file, answer these questions:
- What are the first 8 bytes (in hex)?
- Which bytes spell out "PNG" in ASCII? (Hint: ASCII 'P' = 0x50, 'N' = 0x4E, 'G' = 0x47)
- What is the first byte (0x89) in decimal? Is it in the printable ASCII range (0x20-0x7E)?
- What do bytes 4-7 (0x0D 0x0A 0x1A 0x0A) represent? (Hint: 0x0D = carriage return, 0x0A = newline; look up why these specific control characters are in the PNG signature)
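For the hex-to-decimal questions you do not need a calculator: the shell can convert for you. The sketch below uses 0x50 (the 'P' from the hint) as the example value, so the worksheet answers are still yours to work out. is_printable is a hypothetical helper name introduced here for illustration.

```shell
# printf accepts 0x-prefixed numbers and prints them in decimal.
printf '%d\n' 0x50        # prints 80

# is_printable VALUE: succeeds if VALUE lies in the printable ASCII
# range 0x20-0x7E (32-126 decimal). Hypothetical helper for this lab.
is_printable() {
    [ "$(( $1 ))" -ge 32 ] && [ "$(( $1 ))" -le 126 ]
}

is_printable 0x50 && echo "0x50 is printable"
is_printable 0x0A || echo "0x0A is not printable"
```

Swap in any byte value from your hex dump to check it the same way.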
Part C: Hex dump the ZIP
xxd test.zip | head -4
In your worksheet:
- What are the first 4 bytes (in hex)?
- Two of them spell a two-letter abbreviation in ASCII. What is the abbreviation, and what does it stand for? (Research: who invented the ZIP format?)
- The third and fourth bytes (0x03 and 0x04) are not printable ASCII. What are they in decimal?
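When you only care about the first few bytes, xxd can limit and regroup its output, which is easier to read than four full lines. The sketch below builds a tiny stand-in file so it runs anywhere; sample.bin is a made-up name, and in the lab you would pass test.zip instead.

```shell
# Build a small stand-in file that starts with the ZIP signature bytes.
printf 'PK\003\004hello' > sample.bin

if command -v xxd >/dev/null 2>&1; then
    xxd -l 4 sample.bin         # -l 4: stop after 4 bytes
    xxd -l 4 -g 1 sample.bin    # -g 1: one byte per column
    xxd -l 4 -p sample.bin      # -p: plain hex, no offsets or ASCII column
fi
```

The -g 1 form is handy for counting bytes, since each column is exactly one byte.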
Part D: Find another file to dump
Pick any file on your system (a .docx, a .mp3, a .pdf, a .jpg). Run xxd on it and look at the first line.
- What format is the file?
- What are the first 4-8 bytes (in hex)?
- Can you find a reference that confirms those bytes are the magic bytes for this format?
Part E: The file command
file test.png test.zip
The file command reads the magic bytes and identifies the format without looking at the extension. Rename test.png to disguised.txt and run file disguised.txt. Does file still identify it correctly? Write one sentence explaining why.
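To see the idea behind file in miniature, the sketch below compares a file's leading bytes against two known signatures and never looks at the name. It is a toy illustration, not how file is actually implemented (file consults a much larger database of patterns). first_bytes_hex and guess_format are hypothetical helper names, and od is used instead of xxd so the sketch runs even where xxd is missing.

```shell
# first_bytes_hex FILE N: print the first N bytes of FILE as lowercase
# hex with no spaces. (Hypothetical helper; relies on head -c and od.)
first_bytes_hex() {
    head -c "$2" "$1" | od -An -v -tx1 | tr -d ' \n'
}

# guess_format FILE: match the leading bytes against two signatures.
# The file name and extension are never inspected.
guess_format() {
    case "$(first_bytes_hex "$1" 8)" in
        89504e470d0a1a0a) echo "PNG" ;;
        504b0304*)        echo "ZIP" ;;
        *)                echo "unknown" ;;
    esac
}
```

Usage: guess_format disguised.txt. Because only the content is read, renaming the file does not change the result.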
Expected output / artifact
Create lab-1-2-notes.txt with:
- Your answers to Parts B-E (Part A has no questions)
- The raw xxd output for all three files (copy-paste the first 4 lines of each)
git add lab-1-2-notes.txt
git commit -m "lab-1-2: hex dump worksheet"
Common pitfalls
- xxd not found: use hexdump -C instead; or on Windows in WSL, xxd is in the xxd or vim-common package
- The PNG I downloaded is too big: use the first 4 lines only (| head -4); you do not need the whole file
- The ZIP header looks different: a ZIP created with different tools may have different options set; the first two bytes (0x50 0x4B, "PK") are always the same regardless
Stretch (optional)
Find the JPEG magic bytes (0xFF 0xD8 0xFF) on a .jpg file. Find the ELF magic bytes (0x7F 0x45 0x4C 0x46, which is 0x7F followed by "ELF") on a Linux executable (try /bin/ls or /usr/bin/python3).
Lab 1.2 v0.1.