Build an MCP Manifest Verifier with Signing, Pinning, and an Allow-List
Build the defensive control for an agentic tool supply chain: a manifest verifier that checks an HMAC/hashlib signature on every tool definition, pins each approved tool to a content hash so a post-approval mutation is detected, and enforces a capability allow-list so unknown or shadowing tools never load. Start from a vulnerable MCP-style registry and three working PoCs (tool poisoning, a rug pull, and tool shadowing). Wire your verifier in front of the registry, re-run the PoCs to prove all three are now rejected, run a benign control to prove a legitimate signed and pinned tool is still accepted and usable, and write a short remediation rationale with the standards mapping. Submit a single script or notebook for instant, rubric-based feedback.
3 hrs
Est. time
5
Outcomes
6
Rubric criteria
65%
Pass score
What you'll learn
Skills you'll have real reps in after shipping this.
The scenario
You own platform security for an internal agent that loads its tools at runtime from an MCP-style registry: a list of tool definitions, each with a name, a description, and an input schema, that the agent pastes into the model's context and then calls. The offensive team already proved it is wide open. They handed you a tiny vulnerable registry and three working proofs-of-concept: a poisoned tool description that steers the agent, a rug pull that mutates an approved tool definition after review, and a shadowing tool that registers a colliding name to override a trusted one. The exploits are not your deliverable. They are the oracle that tells you whether your fix actually holds.
Your job is to build the control the offensive findings recommended: a verifier that sits between the registry and the agent. Nothing loads unless its definition carries a valid signature, matches the hash it was pinned to at approval time, and names a capability on the allow-list. You will run the three PoCs through the verifier and show each one rejected in the printed output, then run a benign signed and pinned tool through the same verifier and show it accepted and callable. The center of gravity of this task is the verifier, not the attack.
Your role
You are a platform security engineer hardening an agentic tool supply chain. Your goal is a single, self-contained file that builds the defensive control end to end: a manifest verifier (signature verification + hash pinning + a capability allow-list) wired in front of a vulnerable MCP-style registry, with the provided tool-poisoning, rug-pull, and tool-shadowing proofs-of-concept all shown rejected after your fix, and a legitimate signed and pinned tool still accepted and usable with no over-blocking.
Start the task to unlock the full brief
You'll get the step-by-step requirements, setup commands, the 6-criterion grading rubric, tips, and the ability to submit your solution for instant AI grading.
Free to start · submit when you're ready
Learning resources
What this task is
This is a build-and-submit defensive-security task, not a quiz about supply-chain security. You build the control: a manifest verifier that sits in front of an MCP-style tool registry and gates every tool definition on three independent checks, an hmac/hashlib signature over the canonical definition, a content-hash pin that detects any post-approval mutation, and a capability allow-list that rejects unknown or colliding tool names. The provided tool-poisoning, rug-pull, and tool-shadowing proofs-of-concept are the oracle: you re-run them through your verifier and show each one rejected, then run a benign signed and pinned tool and show it accepted and usable.
Agentic supply-chain attacks (OWASP LLM03:2025 Supply Chain, OWASP Agentic ASI tool-poisoning entries, and the MITRE ATLAS Publish Poisoned AI Agent Tool technique) are the mechanism behind real incidents in tool registries and plugin catalogs. The skill this task builds is the defensive counterpart to those exploits: treat a tool definition as a signed, pinned artifact, enforce an allow-list at the registry boundary, and prove with the attacker's own proofs-of-concept that the control holds while a legitimate tool still loads and runs.
Grading is rubric-based and explainable. Your submission is scored against weighted criteria (the verifier control built, all three PoCs rejected, the benign tool preserved, the control minimal and at the boundary, the remediation rationale and mapping, and a self-contained run) with per-criterion feedback. The pass threshold is 65 percent and you can resubmit.