Build & submit taskBetaintermediate

Design Tools and an MCP Server for a Claude Agent

Design a tight set of 4-5 well-described tools, drive selection with tool_choice (auto, any, and named forcing), expose them through a Model Context Protocol server with env-var secrets and isRetryable structured errors, then wire it into a Claude loop. Submit a single script or notebook for instant, rubric-based feedback.

3 hrs

Est. time

Outcomes

Rubric criteria

65%

Pass score

What you'll learn

Skills you'll have real reps in after shipping this.

Tool descriptions drive selection

Claude picks tools from their descriptions. Specific, action-oriented descriptions on a small tool set beat a large pile of vague ones.

tool_choice modes

auto lets the model decide, any forces some tool call, and naming a tool forces that exact one. Each fits a different situation.

MCP integration

An MCP server exposes tools and resources to any MCP client, with configuration and secrets kept out of the code.

Retryable errors

A structured error with an isRetryable flag lets the agent retry transient failures and give up on permanent ones.

See how it works

Why tool descriptions decide selection

naming and describing tools

TASK Cancel the user’s most recent subscription.

list_subscriptions()

Return the user’s active subscriptions, newest first.

cancel_subscription()model picked ✓

Cancel one subscription by id. Idempotent: cancelling an already-cancelled sub is a no-op.

refund_charge()

Issue a refund for a specific charge id.

Precise names and scoped descriptions let the model map the task straight to cancel_subscription. The idempotency note even tells it the call is safe to retry.

The description is the interface the model programs against. An agent does not see your code; it sees a list of tool names and descriptions and picks from them by reading. So the name and the one-line description are not documentation, they are the API the model calls, and most tool-selection failures are really naming failures. Name tools for exactly what they do, write descriptions that say what they take, return, and when to use them, and keep their scopes from overlapping. State invariants like idempotency, because the model will use what you tell it.

Claude chooses tools from their descriptions. A small set of specific, action-oriented descriptions selects far more reliably than many vague ones.

What MCP exposes

model context protocol (MCP)

integrations

3 + 3 = 6

add a new tool

one server, all agents get it

One standard interface instead of N×M glue. Without a standard, every agent integrates every tool with its own custom code, so three agents and three tools is nine bespoke integrations, and a new tool means wiring it into each agent. The Model Context Protocol puts tools behind one standard server interface, so any MCP-speaking agent can talk to any MCP tool server. Three plus three becomes six, and a new tool server is instantly usable by every agent. The leverage is ecosystem-wide: tools become portable instead of welded to one app.

An MCP server publishes tools and resources to any client, with configuration and secrets supplied at the boundary rather than baked into code.

The scenario

Your Claude agent keeps picking the wrong tool. The descriptions are vague, there are fifteen tools competing for attention, and when a tool hits a transient network error the agent has no idea whether to retry. On top of that, a teammate hardcoded an API key into a tool so they could ship faster.

You are going to fix the tool layer. A small set of tools with descriptions Claude can actually act on, the right tool_choice for each situation, an MCP server that exposes them cleanly with secrets pulled from the environment, and structured errors that tell the agent whether a failure is worth retrying.

Your role

You are a Claude solutions architect responsible for the tool and integration layer. Your deliverable is one module showing tool descriptions written for reliable selection, correct tool_choice usage, an MCP server with env-based secrets, and structured isRetryable errors.

Start the task to unlock the full brief

You'll get the step-by-step requirements, setup commands, the 7-criterion grading rubric, tips, and the ability to submit your solution for instant AI grading.

Free to start · submit when you're ready

Learning resources

Anthropic API: tool use

Tool definitions, input schemas, and tool_choice options.

docs.anthropic.com

Model Context Protocol

MCP servers, tools, and resources.

modelcontextprotocol.io

MCP Python SDK

FastMCP and the lower-level Server API.

github.com

What you'll build in this tool and MCP task

This is a build-and-submit task, not a guided lab. You design the tool and integration layer of a Claude agent the way it should be built: a small set of tools with descriptions Claude can act on, correct tool_choice usage, and a Model Context Protocol server that exposes them cleanly. The deliverable is one Python module that doubles as a reference for how to wire tools to Claude.

The details here are exactly the ones that separate a flaky agent from a reliable one. You cap the tool set at 4-5 and write descriptions for reliable selection, you use auto, any, and named tool_choice deliberately, you expose everything through an MCP server with secrets pulled from the environment, and you return structured errors with an isRetryable flag so the agent knows when to retry.

Grading is rubric-based and explainable. Your submission is scored against weighted criteria (SDK integration, tool design, tool_choice, MCP server, secret configuration, structured errors, and the selection demonstration) with per-criterion feedback quoted from your code. The pass threshold is 65 percent and you can resubmit. These are the tool and MCP skills the Claude Certified Architect exam tests.

Frequently asked questions

Why limit an agent to 4-5 tools?

Selection accuracy drops as the tool set grows and descriptions start to overlap. Four or five well-described tools per agent is the sweet spot. If you need more capability, split into subagents rather than overloading one.

When do I use each tool_choice mode?

auto when the model should decide whether and which tool to call, any when it must call some tool, and a named tool when the situation requires exactly one specific tool (for example forcing a structured-output tool).

Do I need a full MCP deployment?

No. A local MCP server built with the official SDK (FastMCP) that exposes your tools plus one resource is enough to demonstrate the pattern, including reading secrets from the environment.

What counts as a complete submission?

A single .py or .ipynb with 4-5 well-described tools, a demonstration of auto, any, and named tool_choice, an MCP server with a resource, env-based secrets, structured isRetryable errors, and a selection demonstration.