Structured Output & Function Calling with NIM

Get reliable machine-parseable data out of an LLM. Compare prompt-only JSON extraction against the function-calling API, chain two tools, and measure the reliability gap on a real extraction task.

30 min · 4 steps · 3 domains · Intermediate · ncp-aai

What you'll learn

  1. Prompt-only JSON extraction (baseline)
    The easiest way to get structured data out of an LLM is to ask for JSON in the prompt. It mostly works — but *mostly* isn't good enough for production, and you'll quantify why in step 4. (A baseline sketch follows this list.)
  2. Function calling with a JSON Schema
    Instead of asking the model to write JSON in its reply, you tell it: *"here is a function, here is its schema, call it with the right arguments."* The model then returns a tool_call with arguments already validated against your schema — no regex, no markdown-fence stripping.
  3. Chain two tools with validation
    Real agents use more than one tool per turn. A classifier decides *what kind of message this is*, then a specialized extractor does the extraction that fits. This lets you handle multiple schemas without cramming them into one.
  4. Measure prompt-only vs tools reliability
    Prompt-only JSON extraction works until the message is messy. Tools mode keeps working because the schema is enforced at the API boundary.
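
For orientation, here is a minimal sketch of that step-1 baseline, assuming the OpenAI Python SDK pointed at an OpenAI-compatible NIM endpoint. The base URL is a placeholder (the hosted environment pre-configures the real proxy), and the prompt wording is illustrative:

```python
# Prompt-only baseline: ask for JSON in the prompt, then hope it parses.
# Base URL and prompt wording are illustrative placeholders.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

def extract_contact_prompt_only(message: str) -> dict | None:
    resp = client.chat.completions.create(
        model="meta/llama-3.3-70b-instruct",
        messages=[
            {"role": "system", "content": 'Reply with ONLY a JSON object with '
             'keys "name", "email", "phone". No prose, no markdown.'},
            {"role": "user", "content": message},
        ],
    )
    try:
        return json.loads(resp.choices[0].message.content)
    except json.JSONDecodeError:
        return None  # fences, trailing commas, and stray prose all land here
```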

Prerequisites

  • Completed `react-agent-nim` or comparable NIM exposure
  • Basic Python (functions, dataclasses or dicts)
  • Familiarity with JSON Schema

Exam domains covered

Agent Development · Tool Calling · NVIDIA Platform Implementation

Skills & technologies you'll practice

This intermediate-level AI/ML lab gives you real-world reps across:

Function Calling · Tool Use · JSON Schema · NIM · Structured Output

What you'll build in this function-calling lab

Function calling is what separates LLM demoware from production — the moment you need machine-readable output that validates on the first try, prompting the model for JSON stops being good enough. This lab builds the same contact-and-invoice extraction pipeline two ways and measures the reliability gap: prompt-only JSON extraction as the naive baseline, versus schema-enforced function calling via the tools parameter on NVIDIA NIM endpoints we provision. You walk away with a structured-output pattern you can drop into any LangChain agent, intuition for when prompt mode is fine and when it actively fails, and a harness that gives you a concrete valid-vs-broken count on your own data.

The technical substance is the distinction between asking nicely for JSON and enforcing a schema at the API boundary. You define a save_contact tool with a formal JSON Schema, pass it via the tools=[...] parameter, set tool_choice to force the call, and read message.tool_calls[0].function.arguments — already valid against your schema, no regex, no markdown-fence stripping. You then compose two tools — a classifier that returns "contact" | "invoice" and a specialized save_invoice with its own schema — and see why a router-plus-specialist pipeline is cleaner than one fat union schema. The final step runs both strategies over intentionally awkward inputs (trailing commas, apostrophes in names, emoji, multi-line bodies) and surfaces the reliability gap quantitatively.
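
A sketch of that tools-mode flow, under the same client assumptions as the baseline sketch above (the contact schema fields are illustrative):

```python
# Schema-enforced path: define save_contact as a tool, force the call,
# and read the arguments straight off the tool_calls list.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

save_contact = {
    "type": "function",
    "function": {
        "name": "save_contact",
        "description": "Persist a contact extracted from a message.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"},
                "phone": {"type": "string"},
            },
            "required": ["name", "email"],
        },
    },
}

resp = client.chat.completions.create(
    model="meta/llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Add Ada Lovelace, ada@example.com"}],
    tools=[save_contact],
    # Force this specific tool so the model can't answer in prose.
    tool_choice={"type": "function", "function": {"name": "save_contact"}},
)

call = resp.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)  # arrives already schema-shaped
print(call.function.name, args)
```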

Prerequisites: Python with dicts or dataclasses, prior exposure to a NIM-backed agent (the react-agent-nim lab is the natural entry point), and a rough mental model of JSON Schema. The hosted environment ships with the OpenAI Python SDK pointed at our managed NIM proxy serving meta/llama-3.3-70b-instruct — same OpenAI-compatible tool_calls surface you'd use against the real API, no keys, no GPU provisioning. About 30 minutes of focused work, ending with a message-by-message report of which extractions parsed cleanly against the schema and which didn't — the same accounting real teams run before picking the default extraction mode for their pipelines.

Frequently asked questions

What's the difference between a tool call and a function call in the OpenAI schema?

They're the same concept; the schema is just layered. A chat completion that uses tools returns message.tool_calls, a list of objects each containing a type: "function" wrapper and a function object with name and arguments (a JSON string). "Function calling" is the older name — the API was originally one function per call — and "tool calling" is the current name that generalizes to multiple callable tools per turn. NIM's OpenAI-compatible endpoints expose both fields, and this lab uses the modern tool_calls shape throughout.
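
In Python terms, the returned assistant message looks roughly like this (the id and argument values are illustrative):

```python
# Approximate shape of a tool-calling response message. Note that
# function.arguments is a JSON *string*, not a parsed dict.
message = {
    "role": "assistant",
    "content": None,              # no prose when a tool call is emitted
    "tool_calls": [
        {
            "id": "call_abc123",  # illustrative id
            "type": "function",   # the wrapper type
            "function": {
                "name": "save_contact",
                "arguments": '{"name": "Ada Lovelace", "email": "ada@example.com"}',
            },
        }
    ],
}
```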

Why is function calling more reliable than prompting for JSON?

Because the schema is enforced on the server, not on the model's prose instincts. When you pass tools=[...] with a JSON Schema, the endpoint constrains generation to produce arguments that match the schema — required fields are present, types are correct, the output parses. Prompt-only extraction depends on the model having seen enough JSON in training to emit valid JSON for your specific shape, and it breaks on edge cases: trailing commas, unescaped quotes in names, markdown code fences, etc. Step 4 measures the gap concretely on a noisy test set.

What does tool_choice control and when should I set it?

tool_choice tells the endpoint how aggressively to call tools. "auto" (the default) lets the model decide whether to emit a tool call or a text reply. "required" forces a tool call. {"type": "function", "function": {"name": "save_contact"}} forces a specific tool. Use "auto" for agents that may or may not need the tool. Use "required" when your extraction pipeline must produce structured output, as in Step 2 — you don't want the model to chat back "sure, here's the info" in prose.
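
As request parameters, the three settings look like this (model, messages, and tool definitions omitted):

```python
# The three tool_choice settings, as passed to
# client.chat.completions.create(..., tools=[...], tool_choice=...).
tool_choice_auto = "auto"          # default: model may reply in prose instead
tool_choice_required = "required"  # must emit some tool call
tool_choice_forced = {             # must call this specific tool (Step 2)
    "type": "function",
    "function": {"name": "save_contact"},
}
```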

Why classify first, then pick a schema, instead of using one giant schema with all fields?

One fat schema with union types forces the model to reason about record type and field extraction in a single shot, and it makes your validation harder — you end up checking "if type == invoice, these fields should be present; if type == contact, these other fields." A two-step pipeline (classifier → specialized extractor) keeps each call focused on one job, lets you evolve each schema independently, and maps cleanly onto the router → specialist tool pattern real agents use. Step 3 walks you through both tools and the routing shim.
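
A sketch of that router-plus-specialist shape, with deliberately small illustrative schemas and the same placeholder client as the earlier sketches:

```python
# Step-3 pipeline sketch: a classifier tool routes each message to the
# specialist extractor that fits. Schemas and helper names are illustrative.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
MODEL = "meta/llama-3.3-70b-instruct"

def tool(name, props, required):
    """Build a minimal function-tool definition."""
    return {"type": "function", "function": {"name": name, "parameters": {
        "type": "object", "properties": props, "required": required}}}

classify_message = tool("classify_message",
    {"kind": {"type": "string", "enum": ["contact", "invoice"]}}, ["kind"])
save_contact = tool("save_contact",
    {"name": {"type": "string"}, "email": {"type": "string"}}, ["name", "email"])
save_invoice = tool("save_invoice",
    {"number": {"type": "string"}, "total": {"type": "number"}}, ["number", "total"])

def call_tool(message, t):
    """Force one specific tool and return its parsed arguments."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": message}],
        tools=[t],
        tool_choice={"type": "function", "function": {"name": t["function"]["name"]}},
    )
    return json.loads(resp.choices[0].message.tool_calls[0].function.arguments)

def extract(message):
    kind = call_tool(message, classify_message)["kind"]      # router
    specialist = save_contact if kind == "contact" else save_invoice
    return kind, call_tool(message, specialist)              # specialist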

What counts as a "valid" extraction in Step 4's comparison?

One that parses cleanly into a dict and contains every required field for the record type. An extraction is broken if json.loads fails, if a required field is missing or null, or if a type doesn't match (e.g., phone came back as a number instead of a string). The Step 4 harness counts valid vs broken for both the prompt-only path and the tools path over the same messy input set. The expected outcome is that tools mode maintains a near-perfect valid rate while prompt-only degrades visibly on the awkward inputs.
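
A sketch of a validity check along those lines (the required-field tables are illustrative):

```python
# Step-4-style validity check: a record counts as valid only if it parsed
# and every required field is present with the right type.
import json

REQUIRED = {
    "contact": {"name": str, "email": str, "phone": str},
    "invoice": {"number": str, "total": (int, float)},
}

def is_valid(raw, kind: str) -> bool:
    if isinstance(raw, str):          # prompt-only path returns raw text
        try:
            raw = json.loads(raw)
        except json.JSONDecodeError:
            return False              # broken: didn't parse at all
    if not isinstance(raw, dict):
        return False
    # Broken if a required field is missing, null, or the wrong type.
    return all(
        field in raw and isinstance(raw[field], types)
        for field, types in REQUIRED[kind].items()
    )

# Tally valid vs broken per strategy over the same messy input set, e.g.:
#   valid_prompt = sum(is_valid(run_prompt_only(m), k) for m, k in messy_set)
#   valid_tools  = sum(is_valid(run_tools(m), k) for m, k in messy_set)
```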

Does every NIM model support function calling?

Most modern chat-completion NIMs do — the Llama 3.3 70B Instruct and Nemotron reasoning families used in these labs all accept tools and return tool_calls. But not every model in the NIM catalog exposes function calling: some older vision-language models (for example meta-llama/llama-3.2-11b-vision-instruct) will return "404 No endpoints found that support tool use" when you pass tools. The vlm-visual-qa lab explores that split explicitly; here, the contact-extraction models are all on the supported list.