Insecure Output Handling: SSRF, SQLi, and Command Execution Through an Agent's Tools
Hosted · IDE
Beta

Red-team OpsBot, a ReAct tool-using support agent, by shaping the arguments it passes to its own tools. File a poisoned support ticket and a benign on-call query turns into a server-side request forgery against a metadata endpoint, a SQL injection that drops a canary and reads another tenant's rows, and code execution through a transform helper. Measure attack-success-rate, then close every sink: an allow-listed fetch, parameterized queries, and a removed code tool.

80 min · 8 steps · 3 domains · Advanced

What you'll learn

  1. Recon: map the agent and its tools
     You are red-teaming OpsBot (DV-ToolAgent), ACME Cloud's internal operations…
  2. SSRF: make the agent fetch the internal metadata endpoint
     Drive your first sink: server-side request forgery (SSRF) through http_fetch. The…
  3. SQLi: read another tenant and drop the canary
     Drive the second sink: SQL injection through db_query. The tool runs a…
  4. Reliability: measure the SSRF ASR
     One lucky tool call is a demo. A finding needs an attack-success-rate (ASR): how…
  5. Code execution: run the transform helper
     Drive the third sink: code execution through run_python. The tool runs an…
  6. Harden (SSRF): close the fetch sink with an egress allow-list
     Switch hats. You proved three sinks: SSRF, SQL injection, and code execution. The…
  7. Harden (SQLi): parameterized, tenant-scoped queries
     The SSRF sink is closed (Step 6). Now close the second sink: SQL injection through…
  8. Harden (RCE) + verify: remove the code tool and resist all three sinks
     Two sinks are closed: SSRF (Step 6) and SQLi (Step 7). Close the last one, then…

Prerequisites

  • Comfortable reading Python
  • Know what SSRF, SQL injection, and a shell command are
  • No ML background required

Exam domains covered

Offensive AI Security · LLM Application Security · Agentic Security

Skills & technologies you'll practice

This advanced-level AI/ML lab gives you real-world reps across:

Insecure Output Handling · SSRF · SQL Injection · Command Execution · Tool Misuse · Agentic Security · OWASP LLM05 · AI Red Team

What you'll do in this lab

This is a hands-on offensive-security lab on insecure output handling at the tool-call sink (OWASP LLM05, with LLM06 excessive agency as the enabler). You attack OpsBot, a working ReAct tool-using agent, by treating its tool arguments as the output you control. You file one poisoned support ticket, and when an on-call engineer asks the agent to work it, the agent reads your text and calls a tool with your argument. You drive three real sinks: server-side request forgery that reaches an in-pod cloud-metadata endpoint, SQL injection that drops an audit canary and reads another tenant's records, and code execution through a transform helper that writes a sentinel file. Every callback and side effect is a deterministic in-pod oracle, so you see the exploit fire for real against a live model, not a mock.
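Because each exploit is confirmed by a deterministic oracle rather than by what the model says, success can be counted per trial. The following is a minimal sketch of an attack-success-rate calculation, assuming each trial has already been reduced to a boolean "the oracle fired". The function name and input shape are illustrative, not the lab's grader.

```python
from typing import Sequence


def attack_success_rate(oracle_fired: Sequence[bool]) -> float:
    """Return the fraction of trials in which the oracle fired.

    oracle_fired holds one boolean per trial (hypothetical input shape):
    True if the side effect was observed, False otherwise.
    """
    if not oracle_fired:
        raise ValueError("need at least one trial to compute a rate")
    return sum(oracle_fired) / len(oracle_fired)


# Example: 10 on-call prompts, the oracle fired on 7 of them.
trials = [True] * 7 + [False] * 3
print(f"ASR = {attack_success_rate(trials):.0%}")
```

Counting oracle hits instead of parsing model text keeps the metric honest: a model that claims it fetched a URL but never called the tool scores zero.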

You do not jailbreak the model. You influence the arguments it passes to a tool it is supposed to call, which is why refusal rates stay low: "summarize this status URL", "look up this customer", and "run this transform helper" are all in-distribution helpful behavior. After measuring how reliably the channel fires across realistic on-call prompts, you switch hats and close each sink at the point where the argument is consumed: an egress allow-list with explicit private-range and metadata denial, parameterized tenant-scoped queries that forbid stacked statements, and removal of the arbitrary-code tool entirely. A legitimate fetch and a legitimate customer lookup must still work after the fixes.
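A minimal sketch of the fetch-side fix, assuming a hostname allow-list plus a deny check on the addresses the host resolves to. `ALLOWED_HOSTS`, `status.acme.example`, and the `resolved_ips` parameter are hypothetical; the lab's actual http_fetch guard may differ. DNS resolution is left to the caller so the check itself stays deterministic.

```python
import ipaddress
from typing import Iterable
from urllib.parse import urlsplit

# Hypothetical allow-list: the only hosts the fetch tool may reach.
ALLOWED_HOSTS = {"status.acme.example"}


def ip_is_forbidden(ip_text: str) -> bool:
    """True for loopback, private, link-local (metadata), and similar ranges."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local  # covers 169.254.169.254
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def fetch_allowed(url: str, resolved_ips: Iterable[str] = ()) -> bool:
    """Decide whether the fetch tool may request this URL.

    resolved_ips: addresses the caller obtained by resolving the host
    (an assumption of this sketch; pass them in to block an allow-listed
    name that points at an internal address).
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        return False  # IP literals and unknown names are rejected here
    return not any(ip_is_forbidden(ip) for ip in resolved_ips)


print(fetch_allowed("https://status.acme.example/health"))
print(fetch_allowed("http://127.0.0.1:9092/latest/meta-data/"))
```

The allow-list does most of the work. The explicit range denial is defense in depth for the case where an allow-listed name resolves to loopback or a metadata address.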

Frequently asked questions

Do I need to know machine learning to do this lab?

No. You need to read Python and understand SSRF, SQL injection, and a shell command. The lab is about how an agent passes model-built arguments to real interpreters, not about model internals. Everything model-specific is explained inline.

What is insecure output handling in an agent?

It is the class of bug where an agent passes model output (here, tool arguments) to an interpreter without validating it. A fetch tool with no allow-list becomes SSRF, a query tool that string-formats model output becomes SQL injection, and a code tool that executes model-authored strings becomes remote code execution. It is OWASP LLM05:2025, and LLM06 excessive agency is what makes the tools dangerous enough to matter.
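The query-tool case can be illustrated with a small sketch that contrasts a string-formatted lookup with a parameterized, tenant-scoped one. It assumes an in-memory SQLite database; the schema, the tenant names, and both function names are hypothetical, not OpsBot's db_query.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway database with two tenants (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE customers (tenant TEXT, name TEXT);
        INSERT INTO customers VALUES ('acme', 'alice'), ('globex', 'bob');
        """
    )
    return db


def lookup_unsafe(db: sqlite3.Connection, tenant: str, name: str) -> list:
    # Vulnerable: the model-built argument is spliced into the SQL text.
    sql = (
        "SELECT name FROM customers "
        f"WHERE tenant = '{tenant}' AND name = '{name}'"
    )
    return db.execute(sql).fetchall()


def lookup_safe(db: sqlite3.Connection, tenant: str, name: str) -> list:
    # Parameterized: the argument is bound as data and never parsed as SQL.
    sql = "SELECT name FROM customers WHERE tenant = ? AND name = ?"
    return db.execute(sql, (tenant, name)).fetchall()


db = make_db()
payload = "x' OR '1'='1"  # argument shaped by attacker-controlled text
print(lookup_unsafe(db, "acme", payload))  # leaks rows from every tenant
print(lookup_safe(db, "acme", payload))    # no customer has that name
```

In the safe version the tenant comes from the caller's session rather than from the model, so even a valid name only matches rows inside that tenant.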

Is the SSRF against a real cloud metadata endpoint?

The metadata target is an in-pod stand-in on 127.0.0.1:9092. A real 169.254.169.254 fetch does not route inside the lab pod, so the lab teaches the SSRF pattern (no host allow-list, reaches loopback and the metadata service) against that loopback stub, and the instructions say so plainly.

Does the exploit rely on a jailbreak or a leaked system prompt?

No. The system prompt is an ordinary support-agent prompt with no secret. The exploit shapes the arguments the model passes to tools it already exposes. Each individual action looks legitimate, which is exactly why an aligned model complies and why the fix has to live at the sink.
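One way to picture a fix that lives at the sink is a tool dispatcher that simply no longer registers the code tool. This is a sketch under stated assumptions: the tool names come from the lab outline, but the stub bodies, `TOOLS`, and `dispatch` are hypothetical.

```python
from typing import Callable, Dict


def http_fetch(url: str) -> str:
    # Stub: a real tool would validate the URL against an allow-list first.
    return f"fetched {url}"


def db_query(customer: str) -> str:
    # Stub: a real tool would run a parameterized, tenant-scoped query.
    return f"looked up {customer}"


# run_python is deliberately absent: the arbitrary-code tool is removed,
# so no model-built argument can reach an interpreter through it.
TOOLS: Dict[str, Callable[[str], str]] = {
    "http_fetch": http_fetch,
    "db_query": db_query,
}


def dispatch(tool_name: str, argument: str) -> str:
    """Run a registered tool, refusing anything not in the registry."""
    tool = TOOLS.get(tool_name)
    if tool is None:
        raise PermissionError(f"tool not available: {tool_name}")
    return tool(argument)


print(dispatch("db_query", "alice"))
```

However persuasive the poisoned ticket is, a request for run_python fails in the dispatcher, independent of what the model decided.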