Module 1: From OWASP to ATLAS -- Production-Pentest Scoping · AI-201

Duration: 2 hr lecture + 3 hr lab + 5 hr independent
Lab: Lab 1 (ATLAS Navigator: map AI-101 findings to ATLAS tactics)
MITRE ATLAS tactics: Reconnaissance + Resource Development
Foundational weave: Mitchell Ch 7 (On Trustworthy AI); Karpathy makemore Video 1

1.1 What Changed Between AI-101 and AI-201

AI-101 gave you a map of the attack surface. You learned ten categories, reproduced one CVE, and built a threat model. That is Belt-3 work: awareness and initial contact.

AI-201 starts where AI-101 ends. The question shifts from "what are the attack classes?" to "how do I run a pentest engagement against an agentic system?" The difference is methodological: an OWASP-list walk tells you what to look for; an engagement methodology tells you how to scope, execute, document, and hand off the work.

The mechanism of that shift is a framework change. AI-101 used OWASP LLM Top 10 (2025) as its structuring reference -- a 10-item list designed for introductory awareness. AI-201 uses MITRE ATLAS as its structuring reference -- a knowledge base of more than 100 techniques and sub-techniques designed for practitioners who run real engagements.

The two frameworks are not replacements for each other. OWASP LLM Top 10 is a checklist; MITRE ATLAS is a taxonomy. Belt-3 work uses the checklist to identify what is present. Belt-4 work uses the taxonomy to map what happened, compare it to historical cases, and communicate it to other practitioners in a shared vocabulary.

1.2 MITRE ATLAS: Structure and Scale

MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is modeled after MITRE ATT&CK, the canonical framework for traditional IT adversary behavior. ATLAS targets the AI/ML-specific layer: adversarial inputs, model theft, data poisoning, and supply chain attacks on ML components.

Current version: v5.1.0, November 2025.

Scale:

16 tactics (the why -- the goal of an adversary action)
84 techniques (the how -- a specific method to achieve a tactic goal)
56 sub-techniques (variants within a technique)
32 mitigations (defensive controls)
42 real-world case studies

October 2025 expansion: in collaboration with Zenity Labs, ATLAS added 14 new techniques specifically for autonomous AI agents and generative AI systems, and v5.1.0 includes them. The expansion covers the agentic attack surface that did not exist when ATLAS was first published: agent hijacking, tool-calling exploits, and RAG-poisoning patterns.

Distinction from ATT&CK: ATT&CK covers traditional IT adversary behavior (credential theft, lateral movement through networks, persistence via registry keys). ATLAS covers the ML-specific surface that ATT&CK does not model: adversarial examples, model extraction, training-data poisoning, prompt injection as an initial-access vector.

The two frameworks are complementary. A real agentic-system attack may span both: the initial access vector is ATLAS (LLM prompt injection), but the subsequent lateral movement within the cloud environment uses ATT&CK techniques. Belt-4 practitioners need both.
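When a report lists techniques from both frameworks, the ID format alone tells you which framework a step belongs to: ATLAS technique IDs carry the `AML.` prefix, ATT&CK technique IDs do not. The following is a minimal Python sketch of that check; the function names and the example chain are illustrative assumptions, not part of either framework's tooling.

```python
import re

# ATLAS technique IDs look like AML.T0051 or AML.T0051.001.
ATLAS_ID = re.compile(r"AML\.T\d{4}(\.\d{3})?")
# ATT&CK technique IDs look like T1021 or T1021.004.
ATTACK_ID = re.compile(r"T\d{4}(\.\d{3})?")


def framework_of(technique_id: str) -> str:
    """Classify a technique ID by its format alone."""
    if ATLAS_ID.fullmatch(technique_id):
        return "ATLAS"
    if ATTACK_ID.fullmatch(technique_id):
        return "ATT&CK"
    return "unknown"


def frameworks_in_chain(chain: list[str]) -> set[str]:
    """Return the set of frameworks an attack chain touches."""
    return {framework_of(t) for t in chain}


# Hypothetical chain: prompt injection first, then a network-level step.
EXAMPLE_CHAIN = ["AML.T0051.001", "T1021"]
```

A chain for which `frameworks_in_chain` returns both names is one where the report needs both vocabularies.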

1.3 The 16 ATLAS Tactics

Read each tactic and its definition now. These are the columns in the ATLAS Navigator and the vocabulary of every AI-201 module.

Tactic	Definition	AI-201 module(s)
Reconnaissance	Gather information about the target AI/ML system	1
Resource Development	Acquire resources for attack development	1
Initial Access	Gain initial foothold in the ML pipeline	3, 7.5
ML Attack Staging	Prepare materials for an ML-specific attack	2, 4.5
ML Model Access	Gain access to the target ML model	2
Execution	Run adversarial payloads	4, 5
Defense Evasion	Avoid detection by safety/monitoring systems	4.5
Discovery	Map the target system's capabilities	5
Lateral Movement	Move from one component to another	5
Collection	Gather data from the target system	6
Persistence	Maintain foothold across sessions	6
Command and Control	Control compromised components	7
Exfiltration	Extract data from the target system	6
Impact	Affect the availability, integrity, or confidentiality of the target	7, 7.5
Mitigations	Defensive controls	8, 9

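The Mitigations row is not a tactic; it is listed because Modules 8 and 9 cover defensive controls. ATLAS also defines Privilege Escalation and Credential Access tactics, which complete the count of 16 but have no dedicated AI-201 module; this course treats escalation in LLM systems under Execution and Lateral Movement.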
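The table can also be kept as a small lookup structure, which helps when you want to ask which tactics a given module exercises. The sketch below is one way to do that in Python, assuming the module assignments exactly as listed in the table; the Mitigations row is left out because mitigations are defensive controls rather than tactics, and the names `TACTIC_MODULES`, `tactics_for_module`, and `module_index` are illustrative.

```python
from collections import defaultdict

# Tactic -> AI-201 modules, transcribed from the table in this section.
# Module labels are strings because the course uses labels such as "4.5".
TACTIC_MODULES = {
    "Reconnaissance": ["1"],
    "Resource Development": ["1"],
    "Initial Access": ["3", "7.5"],
    "ML Attack Staging": ["2", "4.5"],
    "ML Model Access": ["2"],
    "Execution": ["4", "5"],
    "Defense Evasion": ["4.5"],
    "Discovery": ["5"],
    "Lateral Movement": ["5"],
    "Collection": ["6"],
    "Persistence": ["6"],
    "Command and Control": ["7"],
    "Exfiltration": ["6"],
    "Impact": ["7", "7.5"],
}


def tactics_for_module(module: str) -> list[str]:
    """Return the tactics a module covers, in table order."""
    return [t for t, mods in TACTIC_MODULES.items() if module in mods]


def module_index() -> dict[str, list[str]]:
    """Invert the table: module label -> tactics covered."""
    index = defaultdict(list)
    for tactic, mods in TACTIC_MODULES.items():
        for m in mods:
            index[m].append(tactic)
    return dict(index)
```

For example, `tactics_for_module("5")` returns Execution, Discovery, and Lateral Movement, matching the three rows that list Module 5.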

1.4 Production-Pentest Scoping

A pentest engagement has a defined beginning and end. Before it begins, three documents must exist: a rules of engagement (RoE) document, a scope definition, and a threat model. Without all three, the engagement has no boundary, and an unbounded engagement is not a pentest; it is ad hoc exploration.

For agentic systems, scoping is harder than for traditional software. The attack surface is dynamic: adding a tool to the agent changes the scope. The execution environment is often non-deterministic: the same input can produce different outputs. The adversary model is novel: an attacker who can influence the LLM's context can, in principle, influence any downstream action the LLM takes.
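Because outputs vary between runs, a single attempt says little about exploitability, so it can help to agree during scoping on how many trials count as one test. The sketch below illustrates the idea in Python. It assumes a hypothetical `attempt` callable that returns True when one run of a payload succeeds; here that callable is simulated with a seeded random generator rather than a real model call.

```python
import random
from typing import Callable


def success_rate(attempt: Callable[[], bool], trials: int) -> float:
    """Run the same attempt `trials` times; return the fraction that succeeded."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    successes = sum(1 for _ in range(trials) if attempt())
    return successes / trials


def make_simulated_attempt(p_success: float, seed: int) -> Callable[[], bool]:
    """Stand-in for a real model call: succeeds with probability p_success."""
    rng = random.Random(seed)
    return lambda: rng.random() < p_success


if __name__ == "__main__":
    attempt = make_simulated_attempt(p_success=0.3, seed=7)
    rate = success_rate(attempt, trials=50)
    print(f"observed success rate over 50 trials: {rate:.2f}")
```

Reporting a rate over a stated number of trials, rather than a single pass or fail, keeps findings comparable across retests.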

Scoping checklist for agentic systems:

1. What model is the agent using? (Foundation model, version, deployment -- API vs local)
2. What tools does the agent have access to? (List every function-calling capability, every external service it can write to)
3. What data does the agent read? (RAG corpus, file system, database, web)
4. What data does the agent write? (File writes, API calls, email sends, code execution)
5. What trust boundaries exist? (Where does attacker-controlled data enter? Where does the output go?)
6. What is the agent's execution environment? (Sandboxed? Cloud? Same credentials as the human operator?)
7. What would a successful attack look like? (Data exfiltration? Code execution? SSRF? Service disruption?)

This checklist maps directly to ATLAS Reconnaissance (gathering the information for items 1-6) and defines the Impact scope (item 7).
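One way to keep the checklist honest is to record the answers in a structure that can report which items are still blank. The sketch below is a minimal Python example under stated assumptions: the class name `AgentScope`, its field names, and the sample values are all hypothetical, and an empty value is treated as "not yet answered".

```python
from dataclasses import dataclass, field, fields


@dataclass
class AgentScope:
    """One field per checklist item, in checklist order."""
    model: str = ""                                            # item 1
    tools: list[str] = field(default_factory=list)             # item 2
    reads: list[str] = field(default_factory=list)             # item 3
    writes: list[str] = field(default_factory=list)            # item 4
    trust_boundaries: list[str] = field(default_factory=list)  # item 5
    environment: str = ""                                      # item 6
    success_criteria: list[str] = field(default_factory=list)  # item 7

    def unanswered(self) -> list[str]:
        """Names of items still empty. An agent that truly has no tools
        would need an explicit entry such as "none" to count as answered."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_complete(self) -> bool:
        return not self.unanswered()


# Hypothetical partially scoped engagement.
draft = AgentScope(
    model="local open-weights model, pinned version",
    tools=["web_search", "send_email"],
    reads=["RAG corpus"],
)
```

Here `draft.unanswered()` lists the four items that still need an answer before the scope definition can be signed.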

1.5 Mitchell Weave: Trustworthy AI as Deployment Property

Mitchell's Chapter 7 (On Trustworthy AI) makes an argument that is directly applicable to the scoping question: trustworthiness in AI systems is not a property a model has or does not have. It is a property a deployment context produces or fails to produce.

The pedagogical point: the same LLM weights behind two different deployment wrappers exhibit two different security postures. A model with strong safety training deployed behind a wrapper that injects attacker-controlled text into its system prompt is less trustworthy than the same model deployed with strict input sanitisation. The weights did not change.

This is what makes Module 10 (model-intrinsic vs application-layer findings) possible. A Belt-4 pentester walks into an engagement and asks: where did the wrapper authors draw the trust boundaries? The answer determines which findings are exploitable in this deployment, even when the underlying model has had the same safety training as in every other deployment.

Scoping an agentic-system engagement is, operationally, the same as mapping the trust boundaries that the deployment drew -- and then asking whether those boundaries are where they need to be.
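The boundary-mapping step can be sketched as a small enumeration: list what enters the model's context, list what the agent can do, and pair them up. The Python below is an illustrative sketch, not a complete threat-modeling tool. It assumes that every source feeds one shared LLM context, so any attacker-controlled source can influence any action; the class names, the approval flag, and the example entries are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Source:
    name: str
    attacker_controlled: bool


@dataclass(frozen=True)
class Sink:
    name: str
    has_side_effects: bool
    requires_human_approval: bool = False


def unguarded_paths(sources, sinks):
    """Pairs where attacker-controlled text can reach a side-effecting
    action with no approval step in between."""
    return [
        (src.name, snk.name)
        for src, snk in product(sources, sinks)
        if src.attacker_controlled
        and snk.has_side_effects
        and not snk.requires_human_approval
    ]


# Hypothetical deployment used only for illustration.
EXAMPLE_SOURCES = [
    Source("system prompt", attacker_controlled=False),
    Source("retrieved web page", attacker_controlled=True),
]
EXAMPLE_SINKS = [
    Sink("send_email", has_side_effects=True),
    Sink("web_search", has_side_effects=False),
    Sink("run_code", has_side_effects=True, requires_human_approval=True),
]
```

Each pair returned by `unguarded_paths` is a place where the deployment drew no boundary between untrusted input and a consequential action, which makes it a candidate test case for the engagement.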

1.6 AI-201 Rules of Engagement

All AI-201 labs and the capstone operate under these standing rules:

You may only test systems you control, systems explicitly provided by the academy, or open-source systems where your testing does not affect other users of the same deployment.
For the capstone, you may test an open-source application running locally on your own infrastructure. You may not test a hosted instance of any application without written permission from the operator.
CVE reproductions in Labs 3 and 4 use pinned vulnerable versions in an isolated environment. Do not test production deployments.
For the Module 2 DVLA testbed: all current DVLA findings are model-intrinsic, so no external coordination is required. Do not look for new findings in production deployments outside the DVLA.
Any finding you discover outside the designated lab scope must be reported to the academy coordinator before disclosure. Do not publish independently.

These rules are not just policy; they are the Belt-4 discipline. A practitioner who runs unauthorized tests on production systems is not doing a pentest -- they are committing unauthorized access. The difference matters legally and professionally.
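A simple technical habit that supports these rules is to check every target against an explicit allowlist before any test traffic is sent. The sketch below shows one way to do this with the Python standard library. The hostnames and network ranges are hypothetical placeholders; in a real engagement they would come from the signed RoE.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration only.
ALLOWED_HOSTS = {"localhost", "target.lab.example"}
ALLOWED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("10.20.0.0/24"),
]


def is_in_scope(url: str) -> bool:
    """Return True only if the URL's host is explicitly allowed."""
    host = urlsplit(url).hostname
    if host is None:
        return False  # no parseable host: refuse rather than guess
    if host in ALLOWED_HOSTS:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # unlisted hostname: out of scope by default
    return any(addr in net for net in ALLOWED_NETWORKS)
```

The function defaults to refusal: anything it cannot parse or does not recognize is treated as out of scope, so a typo in a target URL fails safe.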

1.7 From OWASP Map to ATLAS Map

Your AI-101 capstone produced a threat model for a LangChain agent mapped to OWASP LLM Top 10 entries. Module 1's lab re-examines that threat model through the ATLAS lens.

The mapping is not one-to-one. One OWASP entry may correspond to multiple ATLAS techniques. One ATLAS technique may underlie multiple OWASP entries. The ATLAS representation is more precise: it gives a technique ID, a definition, and a set of documented real-world case studies that used the same technique.

Example mappings:

OWASP LLM01:2025 Prompt Injection → ATLAS AML.T0051.000 (LLM Prompt Injection: Direct), AML.T0051.001 (LLM Prompt Injection: Indirect), AML.T0054 (LLM Jailbreak)
OWASP LLM03:2025 Supply Chain → ATLAS AML.T0010 (ML Supply Chain Compromise), AML.T0018 (Backdoor ML Model)
OWASP LLM06:2025 Excessive Agency → ATLAS AML.T0053 (LLM Plugin Compromise), AML.T0066 (Privilege Abuse)

By the end of Module 1, every finding from your AI-101 threat model should have at least one ATLAS technique mapped to it. That mapping is the foundation of the AI-201 capstone report.
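The "at least one ATLAS technique per finding" requirement is easy to check mechanically. The sketch below is a minimal Python example that uses only a subset of the example mappings in this section; the finding titles and the function name `unmapped_findings` are hypothetical.

```python
# OWASP entry -> ATLAS technique IDs (subset of this section's examples).
OWASP_TO_ATLAS = {
    "LLM01:2025": ["AML.T0051.000", "AML.T0051.001"],
    "LLM03:2025": ["AML.T0010", "AML.T0018"],
}


def unmapped_findings(findings: dict[str, list[str]]) -> list[str]:
    """`findings` maps a finding title to the OWASP entries it was filed under.
    Returns the titles that reach no ATLAS technique through OWASP_TO_ATLAS."""
    missing = []
    for title, owasp_entries in findings.items():
        techniques = {
            t for entry in owasp_entries for t in OWASP_TO_ATLAS.get(entry, [])
        }
        if not techniques:
            missing.append(title)
    return missing


# Hypothetical AI-101 threat-model findings.
EXAMPLE_FINDINGS = {
    "Injected instructions in retrieved document": ["LLM01:2025"],
    "Unpinned model download": ["LLM03:2025"],
    "Agent can send email without approval": ["LLM06:2025"],
}
```

With this partial mapping table, `unmapped_findings(EXAMPLE_FINDINGS)` returns the third finding, which signals that the LLM06 mapping still has to be added before the capstone report is complete.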