Inter-Agent Injection: Propagate a Morris II Worm Across a Two-Agent Graph
Attack a real two-agent support graph where one agent's output is the next agent's input with no authentication and no validation. Plant a self-replicating payload in the only untrusted input, an inbound customer email the Intake agent ingests. Intake forwards it, and the Resolver agent, trusting the inter-agent notes as instructions, performs the attacker-directed action AND re-emits the payload verbatim: a second-hop cascade and the Morris II replication primitive, bounded to two hops. The harness attributes every side effect to the agent that caused it, so you prove the second agent executed. Measure propagation reliability, then harden the channel, a schema-constrained, validated, replication-aware handoff, and prove the cascade is contained while benign tickets still resolve.
Hands-on labs require Pro · $29.99/mo · cancel anytime
What you'll learn
- 1Recon: map the two-agent graph and the trust boundaryOrbitDesk runs a two-agent graph for support cases:
- 2First hop: inject via the email and confirm intake carries itYou cannot talk to the agents directly. But Agent A ingests an inbound customer email,
- 3Propagate: make the replication clause survive the handoffA worm needs two things: an action and a way to copy itself forward. You have the action
- 4Second hop: confirm the resolver executes (the cascade)This is the payoff. The payload entered ONLY through Agent A's email. If Agent B now
- 5Measure: propagation reliability across runsTwo hops means two compounding compliance events: Agent A must forward the worm AND Agent
- 6Harden 1: a schema-constrained, data-only handoffYou proved the cascade: a payload that entered only through Agent A's email reached Agent
- 7Harden 2: validation and replication detectionThe schema handoff is the load-bearing fix, but a robust channel does not rely on a single
- 8Verify and resist: a battery of fresh worm variants is containedOne worm dying is not proof. A hardened channel has to hold against variants you did not
Prerequisites
- Comfortable reading Python
- Know what an HTTP GET is and how two services pass messages
- Helpful: completed the Indirect Prompt Injection and MCP Tool Poisoning labs
Exam domains covered
Skills & technologies you'll practice
This advanced-level ai/ml lab gives you real-world reps across:
What you'll do in this lab
This is a hands-on offensive-security lab built on a real two-agent graph: an Intake agent and a Resolver agent, each a ReAct tool-user against an in-cluster model, wired so one agent's output becomes the next agent's input. The handoff has no authentication, no provenance check, and no validation, and the Resolver consumes the inter-agent notes as instructions. You plant a self-replicating payload in the only untrusted input, an inbound customer email the Intake agent ingests. Intake forwards it, and the Resolver performs the attacker-directed action and re-emits the payload verbatim. You never talk to the Resolver directly; the injection reaches it across the agent boundary.
The harness attributes every side effect to the agent that caused it, with per-agent sentinels and a tagged callback channel, so you prove the SECOND agent executed and not just the first. You measure propagation reliability across paced runs, then switch to defense and harden the inter-agent channel: constrain the handoff to a structured, data-only schema so the Resolver cannot execute free-text prose, add output validation between hops as a circuit breaker, and add replication detection so a re-emitted instruction block is stripped. You re-run a fresh worm to prove the cascade is contained while benign tickets still resolve.