AI-201: AI & Agentic Security II -- Course Outline

Course Code: VCA-AI-201
Track position: Part-III AI & Agentic Security Track, Module 2 of 3
Prerequisites: VCA-AI-101 (OWASP LLM Top 10 fluency; CVE-2025-65106 reproduction; garak + PyRIT first contact) + VCA-PEN-101 (recommended)
Belt: 4/5 Deep Technical
Duration: ~12 weeks (~140 hr: ~26 lec / ~50 lab / ~64 indep)
Credential: VCA-AI-201 Certificate of Completion


Mission

AI-201 moves from individual-vulnerability awareness (AI-101's OWASP LLM Top 10 frame) to production-pentest discipline. Students scope agentic-system engagements, identify trust boundaries that production deployments consistently get wrong, build defensible reproduction tools, write coordinated-disclosure-quality reports, and reason about cross-language bug-class generalisation. The two signature CVEs (CVE-2025-68664 LangGrinch CVSS 9.3 + CVE-2025-9556 LangChainGo Gonja SSTI) demonstrate that canonical agentic-system bugs generalise: a Python deserialization pattern reappears in Go; a Jinja2 SSTI pattern reappears in Gonja.

The structuring frame is MITRE ATLAS (Adversarial Threat Landscape for AI Systems; v5.1.0, November 2025; 16 tactics + 84 techniques + 56 sub-techniques + 32 mitigations + 42 case studies). Where AI-101 anchors to OWASP LLM Top 10 (a 10-item introductory list), AI-201 anchors to MITRE ATLAS (a 100+ technique knowledge base that gives Belt-4 adversarial-AI work the same shared vocabulary that ATT&CK gives classical-pentest work).


Foundational Anchors

Primary continuation pair (from AI-101, advanced chapters):

| Anchor | Track role | Chapters |
| --- | --- | --- |
| Melanie Mitchell, Artificial Intelligence: A Guide for Thinking Humans (FSG, 2019) | Narrative anchor at intermediate depth; calibrated-skepticism + fairness + trustworthy-AI themes | Ch 7-13 (NLP / reasoning / game-playing / transfer learning / ethics) |
| Andrej Karpathy, Neural Networks: Zero to Hero (YouTube + GitHub) | Build-it-yourself substrate companion; proves transformer internals | makemore Videos 1-2 + nanoGPT video |

New addition at AI-201:

| Anchor | Track role | Chapters |
| --- | --- | --- |
| Brian Christian, The Alignment Problem (Norton, 2020) | Forward-pointer into AI-301; Prophecy section introduces value-alignment substrate | Ch 1-4 (Prophecy section) |

Primary papers (required reading at Belt-4; not optional):

| Paper | Authors | Why required |
| --- | --- | --- |
| "Universal and Transferable Adversarial Attacks on Aligned Language Models" (arxiv 2307.15043, 2023) | Zou, Wang, Carlini, Nasr, Kolter, Fredrikson | The GCG foundational paper; universal adversarial suffixes; transferability result |
| "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned LLMs" (ICLR 2024; arxiv 2310.04451) | Liu et al. | Hierarchical genetic algorithm; semantically meaningful (vs GCG gibberish); PPL-bypass |
| "Jailbreaking Black Box LLMs in Twenty Queries" (arxiv 2310.08419, 2023) | Chao, Robey, Dobriban, Hassani, Pappas, Wong | PAIR; black-box LLM-as-attacker; <20 queries median; no gradient access |
| "HarmBench: A Standardized Evaluation Framework for Automated Red Teaming" (ICML 2024) | Mazeika et al., UC Berkeley + Google DeepMind + Center for AI Safety | THE standardized eval framework; 400 behaviors across 7 risk categories; de facto standard |

Petzold note: Petzold CODE is the CSA-track anchor; it does not appear in AI-201. No Petzold weaves are used here.


Module Map

| Module | Topic | MITRE ATLAS tactic | Lecture | Lab | Indep |
| --- | --- | --- | --- | --- | --- |
| 1 | From OWASP to ATLAS. Production-pentest scoping | Reconnaissance + Resource Development | 2 hr | 3 hr | 5 hr |
| 2 | The Virtus DVLA testbed | ML Model Access + ML Attack Staging | 2 hr | 4 hr | 5 hr |
| 3 | Pickle / cloudpickle / dill deserialization in agentic systems | Initial Access (ML Supply Chain Compromise) | 2 hr | 4 hr | 5 hr |
| 4 | Cross-language SSTI: the bug-class generalisation | Execution (Tool-Chain Compromise) | 2 hr | 4 hr | 5 hr |
| 4.5 | The 2023-2026 academic jailbreak corpus | Defense Evasion (Adversarial Examples + Model Bypass) | 3 hr | 4 hr | 7 hr |
| 5 | Tool-calling exploit patterns | Discovery + Lateral Movement | 2 hr | 4 hr | 5 hr |
| 6 | RAG-poisoning + indirect prompt injection at scale | Persistence + Collection | 2 hr | 4 hr | 5 hr |
| 7 | Agentic web-scraping + SSRF in LLM-rendered URLs | Command and Control (LLM-Mediated C2) | 2 hr | 3 hr | 5 hr |
| 7.5 | Multi-modal adversarial attacks | Initial Access (Multi-Modal Adversarial Inputs) | 2 hr | 4 hr | 6 hr |
| 8 | Adversarial robustness testing with HarmBench at scale | Defense Evasion (Evade ML Model, AML.T0015) | 2 hr | 3 hr | 5 hr |
| 9 | Agentic memory and persistent instruction injection | Persistence (AML.TA0008) + Defense Evasion | 2 hr | 3 hr | 5 hr |
| 10 | LLM-powered threat intelligence automation | Reconnaissance (adversarial misuse) | 2 hr | 3 hr | 5 hr |
| 11 | The D8 methodology in depth: full comparative evaluation | ATLAS evaluation methodology | 3 hr | 4 hr | 6 hr |
| 12 | Capstone: coordinated disclosure or defensive pipeline build | Full ATLAS spectrum | 2 hr | 6 hr | 4 hr |
| Total | | | 30 hr | 53 hr | 73 hr (= ~156 hr) |

Lab Index

| Lab | Module | Title | Substrate | Points |
| --- | --- | --- | --- | --- |
| 1 | 1 | ATLAS Navigator: map AI-101 findings to ATLAS tactics | Pyodide + web | 8 |
| 2 | 2 | DVLA clone + L3-regression reproduction | Local Python + Ollama | 10 |
| 3 | 3 | CVE-2025-68664 LangGrinch end-to-end reproduction | Local Python | 15 |
| 4 | 4 | CVE-2025-9556 LangChainGo Gonja SSTI reproduction | Go 1.21+ | 15 |
| 4.5 | 4.5 | GCG / AutoDAN / PAIR adversarial-suffix lab | Cloud GPU (Colab/Kaggle) | 12 |
| 5 | 5 | Permissive tool agent + agency-confusion exploit | Local Python | 10 |
| 6 | 6 | Poisoned vector store + document-loader exfiltration chain | Local Python | 12 |
| 7 | 7 | SSRF via LLM-generated URLs | Local Python | 10 |
| 7.5 | 7.5 | Visual prompt injection + Whisper transcription-chain attack | Local + Ollama | 12 |
| 8 | 8 | HarmBench 50-behavior evaluation: ASR by category + regression suite | Local + Ollama | 25 |
| 9 | 9 | Memory-persistence injection via conversation history | Local Python | 20 |
| 10 | 10 | Automated CVE triage and ATLAS enrichment pipeline | Local Python + NVD API | 20 |
| 11 | 11 | Multi-model D8 evaluation: 3 models, full OL/PR/W scorecard | Local + Ollama | 25 |
| 12 | 12 | Capstone: coordinated disclosure report OR defensive pipeline build | Written + Local | 50 |
| Total | | | | 244 pts |

MITRE ATLAS v5.1.0 Reference Map

The 16 ATLAS tactics mapped to AI-201 modules:

| ATLAS Tactic | AI-201 Module(s) |
| --- | --- |
| Reconnaissance | 1 |
| Resource Development | 1 |
| Initial Access | 3, 7.5 |
| ML Attack Staging | 2, 4.5 |
| ML Model Access | 2 |
| Execution | 4, 5 |
| Defense Evasion | 4.5 |
| Discovery | 5 |
| Lateral Movement | 5 |
| Collection | 6 |
| Persistence | 6 |
| Command and Control | 7 |
| Exfiltration | 6 |
| Impact | 7, 7.5 |
| Mitigations | 8, 9 |
| Evaluation Methodology | 10, 11 |

The MITRE ATLAS Navigator (atlas.mitre.org/navigator) is the canonical exploration tool. Lab 1 uses it directly. The capstone requires an ATLAS-mapped pentest report.
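
For an ATLAS-mapped report it can help to hold the reference map as data and invert it, so each module (or finding) lists the tactics it touches. The sketch below copies the rows of the reference map above into a dictionary; the function names `tactics_by_module` and `uncovered_modules` are illustrative and not part of the ATLAS Navigator or any course tooling.

```python
from collections import defaultdict

# Tactic -> modules, transcribed from the reference map above.
# Module numbers are strings so that "4.5" and "7.5" stay exact labels.
TACTIC_TO_MODULES = {
    "Reconnaissance": ["1"],
    "Resource Development": ["1"],
    "Initial Access": ["3", "7.5"],
    "ML Attack Staging": ["2", "4.5"],
    "ML Model Access": ["2"],
    "Execution": ["4", "5"],
    "Defense Evasion": ["4.5"],
    "Discovery": ["5"],
    "Lateral Movement": ["5"],
    "Collection": ["6"],
    "Persistence": ["6"],
    "Command and Control": ["7"],
    "Exfiltration": ["6"],
    "Impact": ["7", "7.5"],
    "Mitigations": ["8", "9"],
    "Evaluation Methodology": ["10", "11"],
}

ALL_MODULES = ["1", "2", "3", "4", "4.5", "5", "6", "7", "7.5",
               "8", "9", "10", "11", "12"]


def tactics_by_module(tactic_to_modules):
    """Invert the map: module label -> list of tactics, in table order."""
    inverted = defaultdict(list)
    for tactic, modules in tactic_to_modules.items():
        for module in modules:
            inverted[module].append(tactic)
    return dict(inverted)


def uncovered_modules(tactic_to_modules, all_modules):
    """Modules that no row of the map mentions."""
    covered = tactics_by_module(tactic_to_modules)
    return [m for m in all_modules if m not in covered]


MODULE_TACTICS = tactics_by_module(TACTIC_TO_MODULES)
MISSING = uncovered_modules(TACTIC_TO_MODULES, ALL_MODULES)
```

With this data `MODULE_TACTICS["6"]` lists Collection, Persistence and Exfiltration, and `MISSING` contains only the capstone module, which the Module Map describes as spanning the full ATLAS spectrum rather than a single tactic.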


Three-Layer Classification: What AI-201 Adds Over AI-101

AI-101 provided Belt-3 awareness: identify the OWASP entry, reproduce one CVE, run one automated tool. AI-201 adds Belt-4 depth across three layers:

Layer 1: Methodology. Where AI-101 teaches individual attack classes, AI-201 teaches the engagement discipline: scoping, trust-boundary mapping, reproduction documentation, coordinated-disclosure writing, multi-model regression testing.

Layer 2: Academic vocabulary. Where AI-101 names OWASP entries, AI-201 requires fluency in the primary research literature: GCG / AutoDAN / PAIR / HarmBench / MITRE ATLAS. A Belt-4 graduate can read a 2024 jailbreak paper and map its contribution to their engagement methodology.

Layer 3: Cross-language generalisation. Where AI-101 covers one SSTI CVE (CVE-2025-65106 in Python LangChain), AI-201 maps the bug class across Go (LangChainGo Gonja), JavaScript (LangChainJS Eta), and Java (LangChain4J FreeMarker). Cross-language analysis is a Belt-4 skill: "this bug class exists in N implementations; which one does my target use?"
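
The shape of the bug class in Layer 3 can be shown without any template library. The sketch below is a standard-library analogy only: it uses Python's `str.format` in place of Jinja2 or Gonja, and it does not reproduce either CVE. All class and function names (`Secrets`, `RenderContext`, `render_unsafe`, `render_safer`) are hypothetical. The point is the pattern that recurs across languages: attacker-controlled text is treated as the template itself, so the engine's attribute-traversal syntax becomes attacker-reachable.

```python
import string


class Secrets:
    """Hypothetical server-side object that should never reach the output."""

    def __init__(self):
        self.api_key = "demo-key-not-real"


class RenderContext:
    """Hypothetical context object handed to the rendering step."""

    def __init__(self, user):
        self.user = user
        self.secrets = Secrets()


def render_unsafe(user_template, ctx):
    # Bug class: user input IS the template, so "{ctx.a.b}" traversal works.
    return user_template.format(ctx=ctx)


def render_safer(user_text, ctx):
    # The developer fixes the template; user input is only ever a value.
    fixed = string.Template("Hello $user, you wrote: $text")
    return fixed.substitute(user=ctx.user, text=user_text)


ctx = RenderContext("alice")
payload = "{ctx.secrets.api_key}"

leaked = render_unsafe(payload, ctx)      # payload is evaluated
contained = render_safer(payload, ctx)    # payload stays inert text
```

Here `leaked` holds the demo key, while `contained` echoes the payload back as literal text. When comparing implementations across languages, the question to ask of each is the same: which code path lets untrusted input reach the template-compilation step rather than the data slot?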


Hardware and Equipment

No additional hardware beyond AI-101 setup. Additional software:

  • Go 1.21+ (for CVE-2025-9556 reproduction)
  • Ollama with LLaVA and Whisper models (for Module 7.5)
  • faster-whisper (pip install faster-whisper) for local Whisper inference
  • Free-tier cloud GPU: Google Colab / Kaggle Kernels / HuggingFace Spaces / Lightning AI (for Module 4.5 GCG lab)
  • Access to virtus-llm-owasp public repo (DVLA testbed)

Total additional cost estimate: ~$5-15 in cloud API usage for adversarial-suffix exploration. All other tools are free or already installed from AI-101.


Assessment Summary

Two signature CVE reproductions (gates): CVE-2025-68664 (Lab 3) + CVE-2025-9556 (Lab 4) must both produce working exploits before Tier-2 scoring applies.

Tier 2 scoring (40/30/30):

  • 40%: Reproduction depth (CVEs land; GCG lab produces valid adversarial suffix; DVLA findings reproduced)
  • 30%: Coordinated-disclosure report quality (Lab 8 walkthrough + capstone report structure)
  • 30%: Cross-language bug-class generalisation articulation (written analysis in Lab 4 + Module 10)

B- minimum on Tier 2 (>= 70%) required for the VCA-AI-201 Certificate of Completion.
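
The gate-then-weight arithmetic above can be sketched as a small function. This is an illustration under stated assumptions, not the course's grading tool: it assumes each of the three components is graded on a 0-100 scale, and the names `tier2_score`, `WEIGHTS`, `PASS_MARK` and the component keys are hypothetical.

```python
# Weights from the Tier 2 scheme above, kept as integer percentages
# so the weighted sum avoids floating-point rounding surprises.
WEIGHTS = {"reproduction": 40, "disclosure_report": 30, "generalisation": 30}
PASS_MARK = 70.0  # B- minimum for the certificate


def tier2_score(gates, components):
    """Return (score, certified).

    gates: dict of gate name -> bool (both CVE reproductions must be True).
    components: dict of component name -> grade on an assumed 0-100 scale.
    score is None while any gate is still closed.
    """
    if not all(gates.values()):
        return None, False
    score = sum(WEIGHTS[name] * components[name] for name in WEIGHTS) / 100
    return score, score >= PASS_MARK


gates = {"CVE-2025-68664": True, "CVE-2025-9556": True}
components = {"reproduction": 80, "disclosure_report": 65, "generalisation": 70}
score, certified = tier2_score(gates, components)
```

For the sample grades, the weighted sum is (40 x 80 + 30 x 65 + 30 x 70) / 100 = 72.5, which clears the 70% threshold. If either gate is closed, no Tier 2 score is computed at all.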


Version History

v0.1 (initial): Modules 1-7.5 (9 modules) with full lecture notes and labs.

v0.2: Modules 8-12 complete. Full 12-module course with all lectures, labs, instructor guide entries, and portal deployment. D8 ollama-cloud-operator-trial methodology (9-model / 47-session / OL/PR/W evaluation) is the Module 11 anchor; primary source at sandhillscto/research/ollama-cloud-operator-trial.md.