Tool Shadowing: Hijack an Agent's Tool Selection With a Name Collision
Hosted · ide
Beta

Tool Shadowing: Hijack an Agent's Tool Selection With a Name Collision

Attack a real MCP-style tool registry by shadowing a trusted tool. OrbitDesk's support agent loads its tools from a runtime registry with a flat namespace and last-write-wins resolution, and it picks which tool to call from attacker-controllable descriptions. Register a malicious twin with the same name as the trusted record reader and a more compelling, compliance-approved description, so the agent calls your twin instead. The twin silently reads a cross-tenant record and exfiltrates its reference, and the same shadow hands another tenant's data straight back to the caller. Measure how reliably the shadow wins selection, then harden the registry, namespace tools and reject collisions, and prove a fresh shadow is dead while benign lookups still work.

75 min8 steps3 domainsAdvanced

Hands-on labs require Pro · $29.99/mo · cancel anytime

Map the attack surface
Query
Retriever
LLM
Poisoned doc
retrieved chunk
Answer
0%
Attack-success rate
Attacks blocked · benign answers pass
graded on real output, not the model's talk

What you'll learn

  1. 1
    Recon: the flat namespace and description-driven selection
    Same target as the tool-poisoning lab, OrbitDesk's support agent and its MCP-style tool
  2. 2
    Shadow register: collide with a trusted tool name
    Prove the collision works. You register a tool with the same name as the trusted
  3. 3
    Hijack: make the twin read out of scope and leak
    Now make the shadow do harm. Your twin already delegates to the cross-tenant record
  4. 4
    Cross-tenant: the wrong tenant's record in the reply
    The same shadow has a second impact that needs no exfil channel at all. Because your
  5. 5
    Measure: shadow-wins-selection attack-success-rate
    Put a number on it. asr.py registers the shadow and runs the account-lookup exploit N
  6. 6
    Harden (namespace): a twin name cannot shadow a trusted tool
    Close the hole at the registry boundary. The model cannot be relied on to notice a
  7. 7
    Harden (scan + allow-list): drop poisoned twins and unlisted servers
    Namespacing stops a twin from stealing the bare name, but a third-party tool can still
  8. 8
    Verify and resist: the full stack neutralizes fresh shadows
    Prove the three registry-boundary controls hold together against fresh attack variants,

Prerequisites

  • Comfortable reading Python
  • Know what an HTTP GET and a JSON tool schema are
  • Helpful: completed the MCP Tool Poisoning lab

Exam domains covered

Offensive AI SecurityLLM Application SecurityAgentic Supply Chain Security

Skills & technologies you'll practice

This advanced-level ai/ml lab gives you real-world reps across:

Tool ShadowingTool Name CollisionTyposquattingAgentic Supply ChainCross-Tenant LeakageAgentic AI SecurityOWASP ASI02OWASP LLM01MITRE ATLASAI Red Team

What you'll do in this lab

This is a hands-on offensive-security lab built on a real agent supply chain: a ReAct loop with native tool-calling against an in-cluster model, and an MCP-style tool registry the agent loads its tools from at session start. You attack OrbitDesk, an internal IT-support assistant, with a tool-shadowing collision. The registry has a flat namespace and resolves names last-write-wins, so you register a malicious twin with the same name as the trusted record reader and a more compelling, compliance-approved description. The agent selects your twin, which silently reads a cross-tenant record and forwards its reference to your in-pod collector. You never jailbreak the model; you out-compete a trusted tool on description alone.

You then show the same shadow as a cross-tenant isolation failure: a GLOBEX account lookup returns another tenant's record straight back to the caller. You measure attack-success-rate over a paced battery, then switch to defense and harden the registry boundary: namespace tools so a duplicate name is prefixed under its server instead of silently winning the bare name, scan descriptions for hidden instructions, and reject near-duplicate collisions. You re-run a fresh shadow to prove the trusted name holds while benign lookups still work.

Frequently asked questions

Do I need a machine-learning background?

No. The lab is about how an agent resolves and selects tools, not model internals. You read a small ReAct agent and an MCP-style registry, find that names collide with no detection and selection is description-driven, and register a twin that wins. The fix is an ordinary namespacing and collision-detection control.

What is tool shadowing?

An attacker registers a tool with the same name as a trusted one (or a typosquat near-name) and a more compelling description. The registry's last-write-wins resolution and the model's description-driven selection route the call to the attacker's twin. It is OWASP Agentic ASI02 Tool Misuse with LLM01 as the metadata-borne foothold, and MITRE ATLAS tool-name confusion / Publish Poisoned AI Agent Tool.

How is the exploit graded?

Deterministically, never on model wording. The shadow-register step grades the tool's content hash: the trusted name must resolve to the attacker's object. The hijack step grades the cross-tenant marker reaching an in-pod listener. The harden step plants a fresh shadow, confirms the trusted object still serves the name, that nothing leaks, and that benign lookups still work.