Working Memory
Intent
Maintain a scratchpad of relevant information within a single task execution, actively curating what stays in the agent's context window.
Problem
Context windows are large but finite. In long-running tasks, the conversation fills up with tool outputs, intermediate results, and old messages. By the time the agent needs to make a final decision, the most important information may have been pushed out of context by irrelevant details.
Solution
Give the agent a structured scratchpad (working memory) that it actively manages. The agent writes key findings, decisions, and intermediate results to this scratchpad and reads from it when needed. Unlike raw conversation history, working memory is curated — only the most relevant information is kept. This is analogous to human working memory: the mental workspace where you hold the most relevant information for the task at hand.
Diagram
Agent executing long task...
│
├── [Tool call 1] → result → [Extract key finding → Save to scratchpad]
├── [Tool call 2] → result → [Update scratchpad]
├── [Tool call 3] → result → [Scratchpad unchanged — not relevant]
│
└── [Final step: Read scratchpad → Synthesize answer]
Scratchpad: { findings: [...], decisions: [...], open_questions: [...] }
When to Use
- Long-running tasks with many intermediate steps
- When context windows get filled with irrelevant tool outputs
- Tasks that require maintaining awareness of multiple threads
- Any agent that does more than a few tool calls
When NOT to Use
- Short tasks that complete within a few turns
- When the full conversation history fits comfortably in context
Pros & Cons
Pros
- Keeps the most relevant information accessible
- Prevents important context from being lost in long conversations
- Agent actively decides what's worth remembering
- Structured format aids systematic reasoning
Cons
- Agent may store wrong information or miss important details
- Adds complexity to the agent's decision-making
- Scratchpad reads and writes consume tokens that could otherwise go to the task itself
Implementation Steps
1. Define the scratchpad structure (key findings, decisions, open questions)
2. Add scratchpad read/write tools to the agent's toolkit
3. Instruct the agent to update the scratchpad after significant findings
4. Include the current scratchpad state in the agent's system prompt
5. Implement scratchpad compaction for very long tasks
Real-World Example
Research Agent
Agent researching 'impact of AI on employment' over 20+ tool calls. Scratchpad tracks: key statistics found, sources reviewed, conflicting claims, and the emerging thesis. When writing the final report, the agent reads the curated scratchpad rather than scrolling through 20 raw search results.
from openai import OpenAI
import json

client = OpenAI()

def research_with_scratchpad(topic: str, max_steps: int = 5) -> str:
    # Curated working memory: only key findings, sources, and open questions.
    scratchpad = {"findings": [], "sources": [], "open_questions": [topic]}
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o",
            response_format={"type": "json_object"},  # guarantees parseable JSON
            messages=[
                {"role": "system", "content": "You are a research agent. Update the scratchpad after each finding."},
                {"role": "user", "content": f"Scratchpad:\n{json.dumps(scratchpad)}\n\nResearch the next open question. Return the updated scratchpad as a JSON object with keys 'findings', 'sources', and 'open_questions'."},
            ],
        )
        updates = json.loads(response.choices[0].message.content)
        # Merge the model's updates into the curated state.
        scratchpad["findings"].extend(updates.get("findings", []))
        scratchpad["sources"].extend(updates.get("sources", []))
        scratchpad["open_questions"] = updates.get("open_questions", [])
        if not scratchpad["open_questions"]:
            break  # nothing left to investigate
    # Final synthesis reads the curated scratchpad, not the raw transcript.
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize these research findings:\n{json.dumps(scratchpad['findings'])}"}],
    ).choices[0].message.content