Build a Structured-Output Extraction & Review Pipeline with Claude
Extract validated, typed JSON from messy text using Claude's tool_use with a JSON schema: explicit criteria, few-shot examples, a validation-and-retry loop, nullable and enum fields for uncertainty, a multi-instance review pass for unbiased scoring, and a Batch API path for bulk. Submit a single script or notebook for instant, rubric-based feedback.
3 hrs
Est. time
4
Outcomes
8
Rubric criteria
65%
Pass score
What you'll learn
Skills you'll have real reps in after shipping this.
See how it works
Why structured output beats parsing
class Person(BaseModel):
name: str
age: int{ "name": str, "age": int },
strict, no extra keys{
"name": "Dana",
"age": 31
}Forcing a tool call with a schema makes the model return typed fields directly, instead of prose you have to parse and repair.
Schema-shaped extraction
{
"type": "object",
"required": ["id", "customer", "items", "total"],
"properties": {
"id": { "type": "string" },
"customer": {
"type": "object",
"required": ["name"],
"properties": { "name": { "type": "string" } }
},
"items": {
"type": "array",
"items": {
"type": "object",
"required": ["sku", "quantity", "unit_price"],
"properties": {
"sku": { "type": "string" },
"quantity": { "type": "integer", "minimum": 1 },
"unit_price": { "type": "number" }
}
}
},
"total": { "type": "number" }
}
}{
"id": "ord_a1f8",
"customer": { "name": "Maria Chen" },
"items": [
{ "sku": "TS-001", "quantity": 2, "unit_price": 24.00 },
{ "sku": "MG-002", "quantity": 1, "unit_price": 12.00 }
],
"total": 60.00
}{▮The schema is the contract: nullable fields and enums let the model represent uncertainty honestly rather than inventing values.
The scenario
You need to turn a pile of free-text records (support emails, invoices, or resumes) into clean structured rows for a database. A first attempt prompts Claude for JSON and parses the reply, but it returns prose half the time, guesses values it should leave blank, and there is no way to know how accurate it is.
You are going to build it properly: force structured output with a schema, give the model explicit extraction criteria and a couple of examples, validate and retry on bad output, represent uncertainty honestly, and measure accuracy. Then make it cheap at scale with the Batch API.
Your role
You are a Claude solutions architect building a production extraction service. Your deliverable is one module that extracts schema-valid JSON reliably, handles uncertainty honestly, measures its own accuracy, and scales cheaply.
Start the task to unlock the full brief
You'll get the step-by-step requirements, setup commands, the 8-criterion grading rubric, tips, and the ability to submit your solution for instant AI grading.
Free to start · submit when you're ready
Learning resources
What you'll build in this structured-output task
This is a build-and-submit task, not a guided lab. You build a production extraction service on Claude: structured output forced through tool_use with a JSON schema, guided by explicit criteria and a few examples, validated and retried, with uncertainty represented honestly and accuracy measured. The deliverable is one Python module you could point at a real backlog of records.
The techniques here are the ones that make extraction trustworthy. You stop parsing free-text JSON and force a schema-shaped tool call, you write precise criteria instead of vague guidance, you validate and retry, you use nullable fields and enums so the model does not fabricate values, you run a multi-instance review to estimate confidence, and you add a Batch API path so bulk work is roughly half the cost.
Grading is rubric-based and explainable. Your submission is scored against weighted criteria (SDK integration, structured tool_use, criteria and few-shot, validation and retry, nullable and enum, the Batch API path, multi-instance review, and accuracy) with per-criterion feedback quoted from your code. The pass threshold is 65 percent and you can resubmit. These are the prompt-engineering and structured-output skills the Claude Certified Architect exam tests.