Build & submit taskBetaintermediate

Architect a Claude Agentic Loop with Safe Termination

Build a single-agent support resolver on Claude's tool-use loop with explicit termination conditions, an escalation boundary that hands risky actions to a human, and graceful failure handling so it can never loop forever. Submit a single script or notebook for instant, rubric-based feedback.

3 hrs

Est. time

Outcomes

Rubric criteria

65%

Pass score

What you'll learn

Skills you'll have real reps in after shipping this.

The agentic loop

Drive Claude by reading stop_reason and returning tool_result blocks until the model ends its turn.

Termination conditions

An agent needs both a soft signal (the model decides it is done) and a hard cap (iteration limit) so it cannot spin.

Escalation boundaries

Decide up front which actions are autonomous and which require a human, and enforce it in code, not in the prompt alone.

Fail safe

Tool errors return structured results to the model and never become an infinite retry.

See how it works

The agentic loop

the ReAct loop

3/8

THOUGHTI need the user's order total. I should look up their latest order.

ACTIONget_orders(user="u_42", limit=1)

OBSERVATION[{"id": "o_991", "status": "shipped", "total": null}]

Thought, action, observation, repeat. ReAct interleaves reasoning and acting. The model thinks out loud about what to do, takes one action, and reads the observation that comes back, and crucially that observation feeds the next thought. Here the first lookup returns a null total, so the agent reasons that it needs the line items, fetches them, sums them, and only then answers. A model that only reasoned would have guessed; a model that only acted could not adapt. Alternating the two is what makes it an agent.

Claude alternates between reasoning and tool calls. Each turn you read stop_reason and decide whether to run a tool and continue, or stop.

Why a step budget matters

the loop guard

continue

laps used4 / 10

steps

cost used$0.8 / $2.0

$ cost

run another lap

Finished beats a limit; a limit beats forever. Check the conditions before each lap. A finished answer ends the run even on the last allowed lap, you want the answer, not a limit error, so it is checked first. Otherwise the cost ceiling and the step cap each halt a stuck agent, and they use a "reached the cap" boundary so a budget of N stops at N. Without this guard an agent retrying a failing tool calls the model every lap forever, which is the single most common way an agent run turns into a runaway bill.

Without a hard iteration cap an agent can loop indefinitely on a hard ticket. A budget plus a completion signal guarantees the loop ends.

The scenario

You are adding an autonomous support agent to a SaaS product. It reads a customer ticket, calls tools to look up the account and order, and either resolves the issue or drafts a reply. Leadership loves the demo, but two things make them nervous: an agent that keeps calling tools in a loop with no stop condition, and an agent that issues a refund it was never authorized to make.

Your job is to architect the loop properly. It must know when it is done, refuse to take high-risk actions on its own, and fail safe when a tool errors out instead of spinning.

Your role

You are a Claude solutions architect. Your deliverable is one well-structured agent module a teammate could trust in production: a correct tool-use loop, hard termination guarantees, a clear autonomous-vs-human boundary, and failure handling, all demonstrated end to end.

Start the task to unlock the full brief

You'll get the step-by-step requirements, setup commands, the 6-criterion grading rubric, tips, and the ability to submit your solution for instant AI grading.

Free to start · submit when you're ready

Learning resources

Anthropic API: tool use

The tool-use request/response shape and the stop_reason loop.

docs.anthropic.com

Anthropic API: messages

Messages API, content blocks, and stop reasons.

docs.anthropic.com

Building effective agents

When to use loops, and how to keep them controllable.

anthropic.com

What you'll build in this Claude agentic-loop task

This is a build-and-submit task, not a guided lab. You architect a single Claude agent the way it has to run in production: a real tool-use loop that reads stop_reason and feeds tool_result blocks back, with hard termination guarantees and an escalation boundary that keeps high-risk actions out of the agent's hands. The deliverable is one Python file you could drop into a real service.

The focus is the part most demos skip: control. You enforce a maximum-iteration cap and a completion signal so the loop cannot spin, you decide in code which actions are autonomous and which require a human, and you make tool failures fail safe instead of triggering an infinite retry. You then prove it on three tickets, including an escalation and a failure case.

Grading is rubric-based and explainable. Your submission is scored against weighted criteria (SDK integration, loop correctness, termination, escalation, failure handling, and the demonstration) and returns per-criterion feedback with evidence quoted from your code. The pass threshold is 65 percent and you can resubmit. These are the agentic-architecture skills the Claude Certified Architect exam tests.

Frequently asked questions

Which model should I use?

Any Claude model your key can reach. A small, fast Claude model is perfect for this since the loop and tools are lightweight. The rubric rewards the loop architecture, not the model choice.

Why both a cap and a completion signal?

The completion signal (the model ending its turn or calling a finish tool) is the normal path. The hard iteration cap is the guarantee: even if the model never decides it is done, the loop still stops. Production agents need both.

How is escalation graded?

The grader checks that you define an autonomous-vs-human boundary and enforce it in code (the high-risk tool refuses and escalates), and that your demonstration includes a ticket that crosses the boundary and is escalated rather than executed.

What counts as a complete submission?

A single .py or .ipynb on the Anthropic SDK that runs a tool-use loop with a hard cap and completion signal, enforces an escalation boundary, handles tool failures safely, and demonstrates three tickets including an escalation and a failure or max-iteration case.