Prerequisite: AI-101 setup complete (Python 3.11+, OpenAI/Anthropic SDK, Burp Suite Community, Ollama, garak, PyRIT all installed and verified)
New installs for AI-201
Go 1.21+ (required for Lab 4: CVE-2025-9556)
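# Linux (amd64). On macOS, use the darwin package from https://go.dev/dl/ instead.
wget https://go.dev/dl/go1.21.13.linux-amd64.tar.gz
# Remove any previous Go tree first so old and new files do not mix
sudo rm -rf /usr/local/go
sudo tar -C /usr/local -xzf go1.21.13.linux-amd64.tar.gz
# Use ~/.zshrc instead if your shell is zsh
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc
source ~/.bashrc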
# Verify
go version
The download above pins Go 1.21.13; if that release is unavailable, any release from 1.21 onward is acceptable. The CVE-2025-9556 reproduction additionally requires the vulnerable LangChainGo release (see Lab 4 for the exact commit hash).
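The version requirement can also be checked programmatically. The following is a minimal Python sketch, not part of the course tooling: the helper names `parse_go_version` and `go_is_new_enough` are hypothetical, and the parser assumes the usual `go version goX.Y.Z os/arch` output shape.

```python
import re
import shutil
import subprocess

MIN_GO = (1, 21)  # minimum stated in this setup guide


def parse_go_version(output):
    """Return (major, minor) parsed from `go version` output, or None if absent."""
    # Matches the 'go1.21' part of e.g. 'go version go1.21.13 linux/amd64'.
    match = re.search(r"\bgo(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def go_is_new_enough(output, minimum=MIN_GO):
    """True when the parsed version is at least `minimum` (tuple comparison)."""
    version = parse_go_version(output)
    return version is not None and version >= minimum


if __name__ == "__main__":
    go_path = shutil.which("go")  # None when go is not on PATH
    if go_path is None:
        print("go not found on PATH")
    else:
        result = subprocess.run([go_path, "version"], capture_output=True, text=True)
        verdict = "OK" if go_is_new_enough(result.stdout) else "older than 1.21"
        print(result.stdout.strip(), "->", verdict)
```

Comparing `(major, minor)` tuples avoids the string-comparison trap where "1.9" would sort above "1.21".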
LangChainGo (vulnerable version for Lab 4)
# -p creates both directories and does not fail if they already exist
mkdir -p ~/ai201-labs/lab-4 && cd ~/ai201-labs/lab-4
go mod init lab4
go get github.com/tmc/langchaingo@v0.1.12 # pinned to vulnerable version
faster-whisper (Module 7.5 local ASR)
pip install faster-whisper
# Test the install
python3 -c "from faster_whisper import WhisperModel; print('faster-whisper OK')"
Ollama: LLaVA and Whisper models (Module 7.5)
# LLaVA for visual prompt injection
ollama pull llava:7b
# Small Whisper-compatible model via Ollama
ollama pull whisper:base
# Verify
ollama list
Note: llava:7b is ~4.7 GB. Pull ahead of Module 7.5 to avoid waiting during lab time.
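To confirm that both model families are present rather than just one, the `ollama list` output can be parsed. This is a rough Python sketch under stated assumptions: the helper names are hypothetical, the parser assumes the model name is the first whitespace-separated column under a `NAME` header, and the sample text is illustrative only (the ID, size, and date are made up).

```python
def installed_models(ollama_list_output):
    """Return model names (first column) from `ollama list` text, skipping the header."""
    names = []
    for line in ollama_list_output.splitlines():
        fields = line.split()
        if not fields or fields[0].upper() == "NAME":
            continue  # skip blank lines and the header row
        names.append(fields[0])
    return names


def missing_families(names, required=("llava", "whisper")):
    """Return required families with no installed tag; 'llava' matches 'llava:7b'."""
    families = {name.split(":", 1)[0] for name in names}
    return [family for family in required if family not in families]


# Illustrative sample only; not real command output.
sample = """NAME            ID              SIZE      MODIFIED
llava:7b        0123456789ab    4.7 GB    2 days ago
"""

print(missing_families(installed_models(sample)))  # prints ['whisper']
```

In practice the text would come from `subprocess.run(["ollama", "list"], capture_output=True, text=True).stdout`. The family match is exact on the part before the colon, so a namespaced tag such as `someuser/whisper:base` would not count as `whisper`.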
fickling + pickletools (Module 3)
pip install fickling
# pickletools is in Python stdlib; no separate install needed
python3 -c "import pickletools; import fickling; print('pickle tools OK')"
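As a preview of the Module 3 workflow, here is a small sketch using only the standard library's `pickletools`. It disassembles a harmless pickle and, for contrast, a pickle whose `__reduce__` names a callable. The `SUSPICIOUS` set and the `Demo` class are illustrative assumptions, not a complete detection rule, and nothing here calls `pickle.loads`, so no payload is executed.

```python
import io
import pickle
import pickletools

# A harmless payload: plain data only, no custom classes or callables.
safe_bytes = pickle.dumps({"model": "demo", "layers": [1, 2, 3]}, protocol=4)

# pickletools.dis writes one line per opcode: byte offset, opcode, argument.
buffer = io.StringIO()
pickletools.dis(safe_bytes, out=buffer)
listing = buffer.getvalue()
print(listing)

# Illustrative (not exhaustive) set of opcodes that import or call objects.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def flagged_opcodes(data):
    """List suspicious opcode names found in `data` without unpickling it."""
    # genops yields (opcode_info, argument, position) and never executes the pickle.
    names = {opcode.name for opcode, _arg, _pos in pickletools.genops(data)}
    return sorted(SUSPICIOUS & names)


class Demo:
    """Hypothetical class whose pickle asks the loader to call print('hi')."""

    def __reduce__(self):
        return (print, ("hi",))


# Serialising is safe; only loading would trigger the call, and we never load.
risky_bytes = pickle.dumps(Demo(), protocol=4)

print("safe pickle :", flagged_opcodes(safe_bytes))
print("risky pickle:", flagged_opcodes(risky_bytes))
```

The safe pickle should produce an empty list, while the second should show the opcodes that resolve and invoke a callable. fickling performs a deeper analysis of the same opcode stream.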
HarmBench evaluation framework (Module 4.5)
HarmBench runs on a cloud GPU. Setup is done in the Colab/Kaggle environment at lab time, so no local install is required. Before Module 4.5, verify that you can open Google Colab and select a GPU runtime.
Promptfoo (Module 2 regression testing)
npm install -g promptfoo
# Verify
promptfoo --version
MITRE ATLAS Navigator (Module 1)
Browser-based tool; no install. URL: https://mitre-atlas.github.io/atlas-navigator/
Alternatively, the offline JSON data files are at https://github.com/mitre-atlas/atlas-data.
virtus-llm-owasp DVLA testbed (Module 2)
git clone https://github.com/virtus-cybersecurity/virtus-llm-owasp ~/virtus-dvla
cd ~/virtus-dvla
pip install -r requirements.txt
Verify you can start the DVLA locally: follow the README in the repo for the startup command. The academy's published L3-regression baseline is in baselines/l3-regression-v2.json.
Verify the full AI-201 environment
Run this before Module 1:
# Python ecosystem
python3 -c "import openai, anthropic, langchain, fickling, faster_whisper; print('Python AI-201 stack OK')"
# Go
go version | grep -E "go1\.(2[1-9]|[3-9][0-9])"
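# Ollama models (both must be present; a single grep -E "llava|whisper" would pass with only one)
ollama list | grep -q llava && ollama list | grep -q whisper && echo "Ollama models OK"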
# DVLA
test -f ~/virtus-dvla/requirements.txt && echo "DVLA repo cloned OK"
# Promptfoo
promptfoo --version
All checks should pass. Bring blockers to the first lab session.
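If you prefer a single report over separate commands, the presence checks can be gathered in one script. This is a hedged Python sketch, not an official course script: the function names are hypothetical, the module and tool lists are taken from the commands in this guide, and it only tests that each item can be found, not that it works.

```python
import importlib.util
import shutil

# Names taken from the checks in this guide; adjust if your install differs.
PYTHON_MODULES = ["openai", "anthropic", "langchain", "fickling", "faster_whisper"]
CLI_TOOLS = ["go", "ollama", "promptfoo"]


def missing_modules(names):
    """Return top-level module names that cannot be located (nothing is imported)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names):
    """Return command names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def preflight_report(modules=PYTHON_MODULES, tools=CLI_TOOLS):
    """Build a human-readable summary of anything that is missing."""
    lines = []
    for label, missing in (
        ("Python modules", missing_modules(modules)),
        ("CLI tools", missing_tools(tools)),
    ):
        status = "OK" if not missing else "MISSING: " + ", ".join(missing)
        lines.append(f"{label}: {status}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(preflight_report())
```

`find_spec` locates a module without importing it, so the report stays fast and does not trigger side effects from heavy packages. It does not replace the version and model checks.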
Tool Journal: AI-201 originating entries
Record the first entry for each tool below when you complete the corresponding module. See AI-101 §SETUP for the Toolchain Diary format.
| Tool | Module first used | Entry prompt |
|---|---|---|
| MITRE ATLAS Navigator | 1 | Record the 3 ATLAS tactics you mapped your AI-101 capstone findings to |
| LangChainGo | 4 | Record the LangChainGo version, vulnerable function, and patch commit |
| Gonja | 4 | Record the template engine, SSTI injection pattern, and affected version |
| fickling | 3 | Record the first pickle opcode you inspected and what it does |
| pickletools | 3 | Record the output structure of pickletools.dis() on a safe pickle |
| Virtus DVLA | 2 | Record the DVLA version, the finding you reproduced, and which of the 9 models it affects |
| Promptfoo | 2, 11 | Record the provider config used and the number of test cases you ran |
| HarmBench | 4.5 | Record the attack method, behavior category, and ASR (attack success rate) measured |
| JailbreakBench | 4.5 | Record the jailbreak you evaluated and the JBB score it received |
| Microsoft PyRIT (multi-turn) | 5 | Record the Crescendo / TAP / Skeleton-Key strategy used and the number of turns to jailbreak |
| MITRE ATLAS Navigator (advanced) | 1, capstone | Record the tactic navigator layer you used for the capstone engagement map |
| Microsoft AI Red Teaming Playground Labs | 9 | Record the scenario you ran and what defensive control it tested |
| LLaVA (local) | 7.5 | Record the model version, the image prompt you used, and the output |
| faster-whisper | 7.5 | Record the model variant (base/small/medium) and the attack payload injected |
| multi-model regression runner | 11 | Record the 9 models tested and each model's result on your finding |
| OpenSSF Scorecard | 3, 4 | Record the supply-chain score for LangChain and LangChainGo |