Defend the Agent Supply Chain: Verify, Pin, and Capability-Gate Your Tool Registry
Harden a real MCP-style tool registry until a poisoned, rug-pulled, or shadowing tool manifest cannot reach the agent, in small single-concept steps. OrbitDesk's support agent loads its tools from a runtime registry and reads each tool description as trusted instruction text. You stand the registry up and trace one benign ticket, then reproduce three techniques one at a time: a poisoned tool description that hijacks the agent into leaking an account reference, a post-approval rug-pull mutation the registry serves with no pin, and a shadowing twin the flat namespace selects under a trusted tool name. You apply the obvious fix, a description blocklist, and watch a clean-description variant defeat it through its delegate. Then you build the durable control one mechanism per step: manifest signature verification, then hash pinning (rug-pull / change detection), then a per-tool capability allow-list with namespacing. You finish by proving that freshly planted unsigned, forged, mutated, and shadowing manifests are all refused while a legitimate signed and pinned tool stays admitted and usable, and an inter-agent worm's second hop is contained.
Hands-on labs require Pro · $29.99/mo · cancel anytime
What you'll learn
- 1Stand up OrbitDesk and trace one benign ticketYou are the defender for OrbitDesk, an internal IT-support assistant. It is a
- 2Reproduce: a poisoned tool description hijacks the agentYou have a working exploit in hand. Reproduce it so you know exactly what your
- 3Reproduce: the registry permits a rug pull (post-approval mutation)The poison in the previous step rode in at registration. The second technique is a
- 4Reproduce: the flat namespace permits a shadowing twinThe third technique is shadowing (typosquatting a tool name). The registry uses a
- 5Naive fix, bypassed: a description blocklist is not enoughThe poison rode in on its description, so the obvious reaction is to scan
- 6Build verifier mechanism 1: manifest signature verificationA description blocklist scans prose and cannot see provenance. Replace it with an allow
- 7Build verifier mechanism 2: hash pin (rug-pull detection)Mechanism 1 (signature verification) is carried forward in mcp_verifier.py. Now build
- 8Build verifier mechanism 3: capability allow-list + namespacingMechanisms 1 (signature) and 2 (pin) are carried forward. Now build the third
- 9Verify and resist: poison, rug pull, shadow, and worm all blocked; legit tools workYour verifier is now the gate the registry calls, with all three mechanisms in place:
Prerequisites
- Comfortable reading and editing Python
- Know what an HMAC signature and a content hash are
- Helpful: completed the MCP Tool Poisoning or Tool Shadowing labs
Exam domains covered
Skills & technologies you'll practice
This advanced-level ai/ml lab gives you real-world reps across:
What you'll do in this lab
This is a hands-on defensive-security lab built on a real agent supply chain: a ReAct loop with native tool-calling against an in-cluster model, and an MCP-style tool registry the agent loads its tools from at session start. You defend OrbitDesk, an internal IT-support assistant, by closing the one thing the agent trusts as much as code: a tool description. You start from a working exploit where an unsigned poisoned tool steers the agent into posting an account reference to an in-pod collector, then you harden the registry boundary so the same class of attack cannot fire.
You apply the obvious fix first, a description blocklist, and watch a clean-description variant defeat it by hiding the abuse in the tool's delegate, which a prose scan never inspects. Then you build the durable control: a tool-supply-chain verifier that does manifest signature verification so an unsigned or forged tool is refused, hash pinning so a post-approval rug pull no longer matches its pin, a per-tool capability allow-list so a tool cannot delegate to a record it was never approved to read, and inter-agent message validation so a peer agent's free-text output is treated as data and not instructions. You finish by proving freshly planted unsigned, forged, mutated, and shadowing manifests are all refused while a legitimate signed and pinned tool stays admitted and usable.