Build & submit taskBetaintermediate

Manage Long-Running Claude Context with Compaction

Keep a long-running Claude session inside its context window: budget tokens, compact older history into summaries as the window fills, offload durable state to memory files, recall relevant detail on demand, and degrade gracefully on oversized inputs. Submit a single script or notebook for instant, rubric-based feedback.

3 hrs

Est. time

Outcomes

Rubric criteria

65%

Pass score

What you'll learn

Skills you'll have real reps in after shipping this.

Context budgeting

Track how full the window is and act before it overflows, instead of crashing or silently truncating.

Compaction

Summarize older turns and keep recent ones verbatim, so a long session stays inside the window without losing the thread.

External memory

Durable facts belong in a memory file that survives compaction, not in a message history that gets summarized away.

Graceful degradation

When an input is too big, chunk or triage it rather than letting a single request overflow the window.

See how it works

Compaction in action

Summary-buffer window

130 / 120 tok

budget

120 tok

system 📌You are a helpful project assistant.20

summary of 4 older turns≈24 tok

Dana (Acme) is migrating to Postgres by the Helsinki deadline, budget 8000; Carlos flagged clause 12.

userCarlos from legal flagged clause 12.34

assistantUnderstood, clause 12 is a risk.22

userWhat did we decide on the database?30

Keep recent, summarize the rest. The system message is always pinned and the latest turns stay verbatim. As the budget tightens, the oldest turns fold into one faithful summary, facts preserved, tokens reclaimed.

When the window fills, older turns are summarized into a compact note and recent turns are kept verbatim, so the session stays bounded without losing the thread.

Durable memory beyond the window

short-term vs long-term memory

Context windowin the prompt, this run

this task's goal + state

recent tool results

recent conversation

full, older turns fall off the edge

speed

instant (already in prompt)

size

bounded by the window

lifetime

gone when the run ends

The window for now; a store for forever. Short-term memory is the context window: whatever is in the prompt this lap, instant to use but bounded in size and erased when the run ends. Long-term memory is an external store you read from and write to, durable and effectively unlimited, at the cost of a fetch (often a similarity search) to pull the relevant piece into the prompt. You reach for the window for the active task and the store for anything that must outlive the run or exceed its size. Either way, agent memory is mostly a database problem.

Facts that must survive compaction live in a memory file, re-injected as a compact preamble, rather than in a history that gets summarized away.

The scenario

Your assistant holds long conversations and works through large documents. After enough turns it either crashes when the context overflows or quietly forgets the early part of the session. A naive fix (keep only the last N turns) throws away decisions made earlier that still matter.

You are going to manage the context deliberately: track how full the window is, compact older turns into summaries before it overflows, keep durable facts in a memory file, and pull the right detail back when it is needed, so the session can run indefinitely without losing the thread.

Your role

You are a Claude solutions architect responsible for long-running context. Your deliverable is one module that keeps a session inside the window through budgeting, compaction, and external memory, without losing important state.

Start the task to unlock the full brief

You'll get the step-by-step requirements, setup commands, the 7-criterion grading rubric, tips, and the ability to submit your solution for instant AI grading.

Free to start · submit when you're ready

Learning resources

Anthropic: context windows

How the context window works and how to manage it.

docs.anthropic.com

Anthropic API: token counting

Counting tokens to budget the window.

docs.anthropic.com

Long context tips

Working effectively with long context.

docs.anthropic.com

What you'll build in this context compaction task

This is a build-and-submit task, not a guided lab. You keep a long-running Claude session inside its context window without crashing or forgetting: budget the tokens, compact older history into summaries as the window fills, offload durable facts to memory, and degrade gracefully when an input is too big. The deliverable is one Python module that can run a session indefinitely.

The skills here are what make a long-lived assistant reliable. You track how full the window is and act before it overflows, you summarize old turns while keeping recent ones verbatim, you persist the facts that must survive into a memory file, and you chunk oversized inputs instead of letting a single request blow the window. Then you prove the session still remembers what matters.

Grading is rubric-based and explainable. Your submission is scored against weighted criteria (SDK integration, context budgeting, compaction, external memory, graceful degradation, recall, and the demonstration) with per-criterion feedback quoted from your code. The pass threshold is 65 percent and you can resubmit. These are the context-management skills the Claude Certified Architect exam tests.

Frequently asked questions

How is this different from the reliability task?

The reliability task covers lost-in-the-middle, crash recovery, and provenance. This one is about keeping a long-running session inside the context window: budgeting, compacting history, and external memory. Together they cover the context-and-reliability domain.

What is compaction?

Summarizing older conversation turns into a compact note when the window approaches its limit, while keeping the most recent turns verbatim. It lets a session run indefinitely without overflowing or dropping early context wholesale.

Why a separate memory file?

Compaction summarizes history, which can blur or drop specific facts. Durable facts and decisions belong in a memory file that survives compaction and is re-injected as a compact preamble, so they are never lost.

What counts as a complete submission?

A single .py or .ipynb on the Anthropic SDK that budgets tokens against the window, compacts older history into summaries, offloads durable state to a memory file, recalls detail on demand, degrades gracefully on oversized input, and demonstrates a long session that keeps working after compaction.