Inter-Agent Injection: Propagate a Morris II Worm Across a Two-Agent Graph
Hosted · ide
Beta

Attack a real two-agent support graph in which one agent's output is the next agent's input, with no authentication and no validation. Plant a self-replicating payload in the only untrusted input: an inbound customer email that the Intake agent ingests. Intake forwards it, and the Resolver agent, trusting the inter-agent notes as instructions, performs the attacker-directed action and re-emits the payload verbatim. That is a second-hop cascade and the Morris II replication primitive, bounded to two hops. The harness attributes every side effect to the agent that caused it, so you can prove the second agent executed. Measure propagation reliability, then harden the channel with a schema-constrained, validated, replication-aware handoff and prove the cascade is contained while benign tickets still resolve.

85 min · 8 steps · 3 domains · Advanced

What you'll learn

  1. Recon: map the two-agent graph and the trust boundary
     OrbitDesk runs a two-agent graph for support cases: …
  2. First hop: inject via the email and confirm intake carries it
     You cannot talk to the agents directly. But Agent A ingests an inbound customer email, …
  3. Propagate: make the replication clause survive the handoff
     A worm needs two things: an action and a way to copy itself forward. You have the action …
  4. Second hop: confirm the resolver executes (the cascade)
     This is the payoff. The payload entered ONLY through Agent A's email. If Agent B now …
  5. Measure: propagation reliability across runs
     Two hops means two compounding compliance events: Agent A must forward the worm AND Agent …
  6. Harden 1: a schema-constrained, data-only handoff
     You proved the cascade: a payload that entered only through Agent A's email reached Agent …
  7. Harden 2: validation and replication detection
     The schema handoff is the load-bearing fix, but a robust channel does not rely on a single …
  8. Verify and resist: a battery of fresh worm variants is contained
     One worm dying is not proof. A hardened channel has to hold against variants you did not …
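
The compounding in step 5 can be made concrete with a little arithmetic. The sketch below assumes the two hops comply independently; the per-hop rates and the run outcomes are hypothetical numbers chosen for illustration, not results from the lab.

```python
def end_to_end_rate(p_intake_forwards: float, p_resolver_executes: float) -> float:
    """Probability the worm completes both hops, assuming the hops are independent."""
    return p_intake_forwards * p_resolver_executes


def observed_rate(outcomes: list[bool]) -> float:
    """Fraction of runs in which a resolver-attributed side effect was seen."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


# Hypothetical per-hop compliance rates (illustration only).
predicted = end_to_end_rate(0.9, 0.8)

# Hypothetical outcomes of ten paced runs: True means the second hop fired.
runs = [True, True, False, True, False, True, True, True, False, True]
measured = observed_rate(runs)
```

Even with each hop complying most of the time, the end-to-end rate is lower than either hop alone, which is why reliability has to be measured across many runs rather than inferred from one success.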

Prerequisites

  • Comfortable reading Python
  • Know what an HTTP GET is and how two services pass messages
  • Helpful: completed the Indirect Prompt Injection and MCP Tool Poisoning labs

Exam domains covered

Offensive AI Security · LLM Application Security · Multi-Agent Security

Skills & technologies you'll practice

This advanced-level AI/ML lab gives you real-world reps across:

Inter-Agent Injection · Multi-Agent Security · Morris II · Agent Propagation · Self-Replicating Prompt · Agentic AI Security · OWASP ASI07 · OWASP ASI08 · OWASP ASI10 · MITRE ATLAS · AI Red Team

What you'll do in this lab

This is a hands-on offensive-security lab built on a real two-agent graph: an Intake agent and a Resolver agent, each a ReAct tool-user against an in-cluster model, wired so one agent's output becomes the next agent's input. The handoff has no authentication, no provenance check, and no validation, and the Resolver consumes the inter-agent notes as instructions. You plant a self-replicating payload in the only untrusted input, an inbound customer email the Intake agent ingests. Intake forwards it, and the Resolver performs the attacker-directed action and re-emits the payload verbatim. You never talk to the Resolver directly; the injection reaches it across the agent boundary.
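
To see why the unvalidated handoff matters, here is a toy model of the trust boundary with no model call at all. The function names and prompt wording are hypothetical stand-ins, not the lab's code; the only point is that untrusted email text ends up in the same channel as the Resolver's instructions.

```python
def intake_notes(customer_email: str) -> str:
    # Intake writes free-text notes and copies the email body into them.
    return f"Case notes for resolver:\n{customer_email}"


def resolver_prompt(notes: str) -> str:
    # No authentication, provenance tag, or validation: the notes are appended
    # directly after the Resolver's own instructions.
    return "You are the Resolver. Follow the case notes below.\n" + notes


# Placeholder email; the bracketed part marks where attacker text would sit.
email = "My invoice is wrong. <<attacker-controlled text would go here>>"
prompt = resolver_prompt(intake_notes(email))
```

Nothing in `prompt` distinguishes what the customer wrote from what the Intake agent decided, which is the gap the hardening steps close.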

The harness attributes every side effect to the agent that caused it, with per-agent sentinels and a tagged callback channel, so you prove the SECOND agent executed and not just the first. You measure propagation reliability across paced runs, then switch to defense and harden the inter-agent channel: constrain the handoff to a structured, data-only schema so the Resolver cannot execute free-text prose, add output validation between hops as a circuit breaker, and add replication detection so a re-emitted instruction block is stripped. You re-run a fresh worm to prove the cascade is contained while benign tickets still resolve.
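
The three defenses named above can be sketched in a few lines of standard-library Python. Everything here is an assumption made for illustration: the `Handoff` fields, the allowed categories, the ticket-id format, and the regular expression are hypothetical, and the lab's own schema and checks may differ.

```python
import re
from dataclasses import dataclass

ALLOWED_CATEGORIES = {"billing", "shipping", "account"}  # hypothetical enum
# Hypothetical heuristics for instruction-like or self-copying text.
SUSPICIOUS = re.compile(
    r"(ignore (all|previous)|copy this|repeat .* verbatim|https?://)", re.I
)


@dataclass(frozen=True)
class Handoff:
    ticket_id: str
    category: str
    summary: str  # data only; the Resolver never treats this as instructions


def validate(h: Handoff) -> Handoff:
    """Circuit breaker between hops: reject anything outside the contract."""
    if not re.fullmatch(r"T-\d{1,8}", h.ticket_id):
        raise ValueError("bad ticket id")
    if h.category not in ALLOWED_CATEGORIES:
        raise ValueError("unknown category")
    if len(h.summary) > 280 or SUSPICIOUS.search(h.summary):
        raise ValueError("summary looks like instructions")
    return h


def strip_replicated(resolver_output: str, inbound_email: str, min_len: int = 40) -> str:
    """Replication detection: drop long output lines copied verbatim from the email."""
    kept = []
    for line in resolver_output.splitlines():
        if len(line) >= min_len and line in inbound_email:
            continue  # this line was re-emitted from the untrusted input
        kept.append(line)
    return "\n".join(kept)
```

The schema does most of the work because the Resolver only ever receives typed fields; the pattern check and the verbatim-copy filter are second layers that a determined attacker could evade on their own.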

Frequently asked questions

Do I need a machine-learning background?

No. The lab is about how two agents trust each other's messages, not model internals. You read a small two-agent graph, find that a peer agent's free-text output is consumed as instructions with no auth or validation, and propagate a payload across the hop. The fix is ordinary message-contract design: a structured schema, authentication, and output validation between hops.

What is a Morris II style worm here?

A payload that combines a malicious action with an instruction to re-emit itself, so it propagates from one agent to the next. In this lab it enters via a customer email the Intake agent reads, rides the unauthenticated handoff to the Resolver, executes there, and copies itself into the Resolver's output. It maps to OWASP Agentic ASI07 (trust of peer messages), ASI08 (cascade), and ASI10 (replication), with MITRE ATLAS tracking the Morris II case study. Hops are bounded to two; this is not an open-ended self-spreader.

How is the second hop graded?

Deterministically, on an agent-attributed side channel. The harness tags each agent's callbacks with its agent id and writes a per-agent sentinel, so a callback tagged agent=resolver proves the Resolver, not the Intake agent, fired it. The payload enters only via the Intake email, so a resolver-attributed leak proves the injection propagated across the hop.
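
The attribution check can be pictured as follows. This is a minimal sketch under assumed data shapes: the `Callback` record and the sentinel set are hypothetical, and requiring both signals is a choice made for this example rather than a statement of how the harness decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Callback:
    agent: str  # tag applied by the harness, not by the model
    path: str   # what the agent requested


def second_hop_proven(callbacks: list[Callback], sentinels: set[str]) -> bool:
    """True only if the Resolver itself produced a side effect."""
    resolver_leak = any(c.agent == "resolver" for c in callbacks)
    return resolver_leak and "resolver" in sentinels


# Only the Intake agent fired: the payload was carried but did not cascade.
first_hop_only = second_hop_proven([Callback("intake", "/leak")], {"intake"})

# A resolver-tagged callback plus a resolver sentinel: the second hop executed.
cascade = second_hop_proven(
    [Callback("intake", "/leak"), Callback("resolver", "/leak")],
    {"intake", "resolver"},
)
```

Because the tag comes from the harness rather than from model output, the check does not depend on anything the agents say about themselves.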

How is the harden step graded?

It plants a fresh worm email in code, runs it a few times, and confirms the Resolver never executes it (no resolver sentinel, no resolver-attributed leak), then runs a benign email and confirms the Resolver still does real work. The cascade is contained without breaking the graph.
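
The shape of that check, sketched with a hypothetical `run_graph` callable standing in for the real two-agent graph. The `RunResult` fields and the attempt count are assumptions for illustration.

```python
from typing import Callable, NamedTuple


class RunResult(NamedTuple):
    resolver_sentinel: bool  # did the Resolver write its sentinel?
    resolver_leak: bool      # was any callback tagged agent=resolver?
    ticket_resolved: bool    # did the Resolver complete real work?


def hardening_passes(
    run_graph: Callable[[str], RunResult],
    worm_email: str,
    benign_email: str,
    attempts: int = 3,
) -> bool:
    # The worm must fail on every attempt, not just once.
    for _ in range(attempts):
        result = run_graph(worm_email)
        if result.resolver_sentinel or result.resolver_leak:
            return False  # cascade not contained
    # Containment alone is not enough: a benign ticket must still resolve.
    return run_graph(benign_email).ticket_resolved
```

The second condition is what rules out the trivial defense of blocking every handoff.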