Exploit an Over-Privileged Agent, Then Re-Scope It to Least Privilege
Build a self-contained proof-of-concept that exploits a tool-using agent through excessive agency: deliver an authorized-looking foothold through an ingested ticket, drive a confused-deputy payee redirect a low-privilege caller is not entitled to, persist it by poisoning the agent's memory so it re-fires in a fresh session, show a benign baseline, then apply a least-privilege re-scope and re-run your own exploit to prove both paths are blocked while normal lookups still work. Write the finding the way a pentester would, with severity and an OWASP LLM06 / ASI02-ASI03 / ASI06 / MITRE ATLAS mapping. Submit a small project (or a single file) for instant, rubric-based feedback.
3 hrs
Est. time
4
Outcomes
7
Rubric criteria
65%
Pass score
What you'll learn
Skills you'll have real reps in after shipping this.
The scenario
You are on a red-team engagement against an internal tool-using agent. It holds one shared service credential for all users and exposes an over-broad database tool plus an unrestricted HTTP fetch tool. The rules of engagement are simple: you may not jailbreak the model or touch its weights, and you may not social-engineer staff. You can file an inbound ticket the agent ingests, the same foothold an attacker gets from a support ticket, a wiki edit, or an email the agent later processes.
Your lead wants more than 'the agent did something odd.' She wants a reproducible proof-of-concept that drives a privileged action as a low-privilege user (the confused deputy), then a least-privilege re-scope that you prove kills your own exploit while normal lookups still work. That deliverable, an exploit plus a fix you re-test, is this task.
Your role
You are an offensive security engineer auditing an agentic application. Your goal is a small, self-contained project that proves a confused-deputy exploit and a memory-poisoning re-fire end to end, states the impact and severity like a professional finding, and demonstrates a least-privilege remediation that defeats your own proof-of-concept.
Start the task to unlock the full brief
You'll get the step-by-step requirements, setup commands, the 7-criterion grading rubric, tips, and the ability to submit your solution for instant AI grading.
Free to start · submit when you're ready
Learning resources
What this task is
This is a build-and-submit offensive-security task, not a quiz about agent security. You produce a small project that proves an excessive-agency exploit end to end: a tool-using agent target, an indirect foothold delivered through an ingested ticket, a confused-deputy payee redirect a low-privilege caller is not authorized for, a memory-poisoning rule that re-fires the redirect in a fresh session, a benign baseline for contrast, and a least-privilege re-scope you re-run your own exploit against to prove it holds.
Excessive agency (OWASP LLM06, Agentic ASI02 Tool Misuse, ASI03 Identity and Privilege Abuse, ASI06 Memory and Context Poisoning, MITRE ATLAS AML.T0051) is the failure behind real agentic incidents: an over-broad tool, one shared identity, no authorization carried from the requesting user to the action, and a memory that persists a poisoned rule. The skill this task builds is the one that separates an AI red teamer from someone who can make a chatbot misbehave: drive a privileged action the caller was never entitled to, make it persist, then re-scope the system to least privilege and prove the fix ships.
Grading is rubric-based and explainable. Your submission is scored against weighted criteria (runnable PoC, confused-deputy exploit with memory persistence, indirect foothold, benign baseline, the tested least-privilege re-scope, the proven-dead re-test, and the written finding) with per-criterion feedback. The pass threshold is 65 percent and you can resubmit.