Episodic Memory

Advanced | 💾 Memory Patterns | Stanford (Park et al., 2023)

Intent

Store and retrieve specific past experiences — complete interaction episodes — so the agent can learn from what worked and what didn't.

Problem

General long-term memory stores facts and preferences but loses context. Knowing 'the user prefers Python' is different from remembering 'last Tuesday when we tried to debug that async issue, we discovered the problem was in the event loop configuration.' Specific episodes contain rich context that general summaries lose.

Solution

Store complete interaction episodes (or rich summaries of them) with temporal and contextual metadata. When a new situation resembles a past episode, retrieve the relevant experience and use it to inform the current approach. Episodic memory mimics how humans recall 'that time when...' — specific experiences that provide directly applicable lessons.
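To make "an episode with temporal and contextual metadata" concrete, here is a minimal sketch of one possible record shape. The class name `Episode` and all of its fields are illustrative assumptions, not a schema prescribed by this pattern; adapt them to whatever your agent actually records.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Episode:
    """One stored experience: the situation, what was tried, and how it ended.

    Hypothetical schema for illustration only.
    """
    context: str                  # the situation, e.g. "debugging async"
    actions: list[str]            # steps attempted, in order
    outcome: str                  # what the attempt led to
    success: bool = True          # whether the outcome resolved the situation
    tags: list[str] = field(default_factory=list)
    # Temporal metadata; UTC avoids ambiguity across machines.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def summary(self) -> str:
        """Render the episode as a one-line 'that time when...' reminder."""
        status = "worked" if self.success else "did not work"
        tried = ", ".join(self.actions)
        return f"[{self.timestamp:%Y-%m-%d}] {self.context}: tried {tried}; {self.outcome} ({status})"
```

Keeping the outcome and a success flag alongside the context is what lets the agent later say why it is choosing an approach, rather than only that a similar situation occurred.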

Diagram

Past Episode: [Context: debugging async] [Actions: tried X, Y, Z] [Outcome: Z worked]
                                    ↓ [Store with metadata]
                              [Episodic Memory DB]
                                    ↓ [Retrieve when similar context detected]
New Episode: [Context: another async issue] → [Retrieved: 'Last time, Z worked'] → Informed response
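The "retrieve when similar context detected" step in the diagram needs some scoring rule. Generative Agents (Park et al., 2023) ranks memories by a combination of recency, importance, and relevance; the sketch below is a simplified variant that uses only relevance and recency. The function names, the 30-day half-life, and the 0.7 weight are assumptions chosen for illustration, not values from the paper.

```python
import math
from datetime import datetime, timedelta, timezone


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieval_score(
    query_vec: list[float],
    episode_vec: list[float],
    episode_time: datetime,
    now: datetime,
    half_life_days: float = 30.0,   # assumed: recency halves every 30 days
    relevance_weight: float = 0.7,  # assumed: relevance matters more than recency
) -> float:
    """Blend semantic relevance with an exponentially decaying recency term."""
    relevance = cosine(query_vec, episode_vec)
    age_days = max((now - episode_time).total_seconds() / 86400, 0.0)
    recency = 0.5 ** (age_days / half_life_days)
    return relevance_weight * relevance + (1 - relevance_weight) * recency
```

With these defaults, two episodes that match the query equally well are ordered by age, while a highly relevant old episode can still outrank a recent but loosely related one.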

When to Use

Agents that handle recurring types of tasks and should improve over time
When past problem-solving strategies are directly applicable to new situations
Support agents that encounter similar issues repeatedly
Long-running projects where past decisions inform future ones

When NOT to Use

Truly novel tasks with no relevant past episodes
When storage and retrieval overhead isn't justified
Privacy-sensitive contexts where interaction logs can't be stored

Pros & Cons

Pros

Rich, contextual learning from past experiences
Can recall specific strategies that worked
Improves agent performance over time on recurring tasks
Provides explainable reasoning ('I'm doing X because it worked last time')

Cons

Storage costs grow over time
Retrieval relevance is hard to get right
Past episodes may be stale or only superficially similar, so applying them to a new context can mislead the agent
Privacy implications of storing detailed interaction logs

Implementation Steps

1. Define what constitutes an 'episode' worth storing
2. Extract key elements: context, actions taken, outcome, lessons learned
3. Store episodes with rich metadata (timestamps, tags, success/failure)
4. Build similarity matching: given a new situation, find relevant past episodes
5. Inject relevant episodes into the agent's context
6. Implement episode curation: merge similar episodes, archive old ones
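The curation step (merging similar episodes and archiving old ones) can be sketched as a single pass over the store. This is one possible approach under stated assumptions: episodes are plain dicts with `embedding`, `timestamp`, and `outcome` keys, and the function name `curate`, the 0.95 similarity threshold, and the 180-day cutoff are all hypothetical choices.

```python
import math
from datetime import datetime, timedelta, timezone


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def curate(
    episodes: list[dict],
    now: datetime,
    merge_threshold: float = 0.95,  # assumed: above this, treat as the same episode
    max_age_days: int = 180,        # assumed: older episodes are archived
) -> tuple[list[dict], list[dict]]:
    """Return (active, archived), folding near-duplicate episodes together."""
    active: list[dict] = []
    archived: list[dict] = []
    # Newest first, so the most recent episode of each cluster is the one kept.
    for ep in sorted(episodes, key=lambda e: e["timestamp"], reverse=True):
        if now - ep["timestamp"] > timedelta(days=max_age_days):
            archived.append(ep)
            continue
        match = next(
            (kept for kept in active
             if cosine(kept["embedding"], ep["embedding"]) >= merge_threshold),
            None,
        )
        if match is None:
            # Copy so the caller's dicts are not mutated; start the merge counter.
            active.append({**ep, "count": ep.get("count", 1)})
        else:
            # Record that this lesson has been observed again.
            match["count"] += ep.get("count", 1)
    return active, archived
```

The `count` field is a cheap signal of how often a lesson has recurred, which a retrieval step could later use to favor well-attested episodes.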

Real-World Example

DevOps Incident Response

An agent handling production incidents stores each incident as an episode: symptoms, investigation steps, root cause, resolution. When a similar alert fires, it retrieves relevant past incidents: 'This pattern matches the memory leak incident from January — check the connection pool configuration first.'

Python: Store and Retrieve Past Incidents

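from datetime import datetime, timezone

import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
EMBEDDING_MODEL = "text-embedding-3-small"
episodes = []            # episode metadata, in insertion order
episode_embeddings = []  # embedding vectors, index-aligned with `episodes`

def embed(text: str) -> np.ndarray:
    """Return the embedding vector for a piece of text."""
    response = client.embeddings.create(model=EMBEDDING_MODEL, input=[text])
    return np.array(response.data[0].embedding)

def store_episode(description: str, outcome: str, tags: list[str]) -> None:
    # Embed first: if the API call fails, neither list is modified,
    # so `episodes` and `episode_embeddings` stay aligned.
    embedding = embed(description)
    episodes.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "description": description,
        "outcome": outcome,
        "tags": tags,
    })
    episode_embeddings.append(embedding)

def recall(situation: str, top_k: int = 3) -> list[dict]:
    if not episodes:
        return []  # nothing stored yet
    query_emb = embed(situation)
    # OpenAI documents its embeddings as unit-length, so the dot product
    # here is equivalent to cosine similarity.
    similarities = np.stack(episode_embeddings) @ query_emb
    # Indices of the top_k highest similarities, best match first.
    top_idx = np.argsort(similarities)[::-1][:top_k]
    return [episodes[i] for i in top_idx]

# Store a past incident
store_episode("Redis connection pool exhausted", "Increased max connections to 50", ["redis", "perf"])
# Later, recall the most similar past incidents for a new situation
similar = recall("Database connections timing out during peak traffic")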
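Retrieval alone does not change the agent's behavior; the recalled episodes still have to be injected into its context (implementation step 5). The sketch below shows one way to do that, assuming episode dicts shaped like those returned by `recall` above. The function name `build_prompt` and the wording of the prompt are illustrative assumptions.

```python
def build_prompt(situation: str, recalled: list[dict]) -> str:
    """Prepend recalled episodes to the current situation as plain-text context."""
    if not recalled:
        return f"Current situation: {situation}"
    lines = ["Relevant past episodes (most similar first):"]
    for i, ep in enumerate(recalled, start=1):
        date = ep["timestamp"][:10]  # keep only YYYY-MM-DD from the ISO timestamp
        tags = ", ".join(ep.get("tags", []))
        lines.append(f"{i}. [{date}] {ep['description']} -> {ep['outcome']} (tags: {tags})")
    lines.append("")
    lines.append(f"Current situation: {situation}")
    # Guard against blindly reusing an episode that does not actually apply.
    lines.append("Use the episodes above only if they apply; say so if they do not.")
    return "\n".join(lines)
```

The closing instruction addresses the risk that a retrieved episode is only superficially similar: the model is asked to judge applicability rather than assume it.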

References

Generative Agents — Park et al., 2023