
Long-Term Memory

Intermediate · 💾 Memory Patterns · Industry practice

Intent

Persist information across sessions so agents can remember user preferences, past interactions, and learned knowledge.

Problem

LLMs are stateless — each conversation starts from scratch. An agent that helped you yesterday has no memory of it today. For personalized assistants, ongoing projects, or any agent that interacts with users repeatedly, this amnesia is a fundamental limitation.

Solution

Store important information extracted from conversations in a durable store (database, vector DB, or key-value store). Before each interaction, retrieve relevant memories and inject them into the context. After each interaction, extract and store new memories.

Common memory types:

  • User preferences (likes/dislikes)
  • Factual knowledge (learned facts)
  • Interaction history (what happened before)
  • Task context (ongoing project state)

Diagram

Session 1: User says preference → [Extract] → [Store in Memory DB]
                                                          ↓
Session 2: New conversation → [Retrieve relevant memories] → [Inject into context]
                                                          ↓
           Agent responds with personalized awareness of past sessions

When to Use

  • Personal assistants that interact with users repeatedly
  • Project management agents that track ongoing work
  • Customer service that should remember past interactions
  • Any agent that benefits from personalization

When NOT to Use

  • Single-shot, anonymous interactions
  • When privacy requirements prohibit storing user data
  • Short-lived tasks with no need for continuity

Pros & Cons

Pros

  • Enables personalization and continuity across sessions
  • Agent improves over time as it accumulates knowledge
  • Users don't have to repeat context
  • Can store both explicit knowledge and learned preferences

Cons

  • Memory management is complex (what to store, when to forget)
  • Stale or incorrect memories can degrade performance
  • Privacy and security concerns with stored data
  • Retrieval accuracy affects memory usefulness

Implementation Steps

  1. Choose a memory store: vector DB, SQL, key-value, or hybrid
  2. Design the memory extraction pipeline: what information gets stored?
  3. Implement memory retrieval: given a new conversation, what memories are relevant?
  4. Build memory injection into the prompt template
  5. Add memory maintenance: updating outdated info, removing contradictions
  6. Implement privacy controls: what can be remembered, data retention policies
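Step 3 is the one most often underestimated: injecting the entire store into every prompt does not scale. A minimal sketch of relevance-based retrieval, using simple keyword overlap (a production system would typically use embedding similarity from a vector DB instead; `retrieve_relevant` and its scoring are illustrative names, not a library API):

```python
import re

def tokens(text: str) -> set[str]:
    # Lowercase word tokens, punctuation stripped
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(memory: str, message: str) -> float:
    # Fraction of the memory's words that also appear in the message
    mem, msg = tokens(memory), tokens(message)
    return len(mem & msg) / len(mem) if mem else 0.0

def retrieve_relevant(facts: list[str], message: str, top_k: int = 3) -> list[str]:
    # Rank stored facts by overlap and keep only those with any match
    ranked = sorted(facts, key=lambda f: score(f, message), reverse=True)
    return [f for f in ranked[:top_k] if score(f, message) > 0]

facts = [
    "User prefers TypeScript over JavaScript",
    "User is working on project Atlas",
    "User's favorite editor is Vim",
]
print(retrieve_relevant(facts, "Should I write this module in JavaScript?"))
```

Only the TypeScript preference is retrieved here; the unrelated facts stay out of the prompt, keeping the injected context small.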

Real-World Example

Personalized Coding Assistant

Over weeks, the assistant learns: the user prefers TypeScript over JavaScript, uses React with Zustand for state management, follows a specific naming convention, and is working on project 'Atlas.' Future interactions automatically incorporate these preferences without being asked.
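For an assistant like this, the persisted memory file might look something like the following (a hypothetical snapshot; the keys match the `preferences`/`facts` schema used by the implementation that follows):

```json
{
  "preferences": {
    "language": "TypeScript",
    "state_management": "Zustand"
  },
  "facts": [
    "Follows camelCase naming convention",
    "Working on project 'Atlas'"
  ]
}
```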

Python: Persistent Memory Across Sessions
from openai import OpenAI
import json
from pathlib import Path

client = OpenAI()
MEMORY_FILE = Path("memory.json")

def load_memories() -> dict:
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {"preferences": {}, "facts": []}

def save_memories(memories: dict):
    MEMORY_FILE.write_text(json.dumps(memories, indent=2))

def extract_and_store(conversation: str):
    """Pull new preferences/facts out of a conversation and merge them into the store."""
    memories = load_memories()
    response = client.chat.completions.create(
        model="gpt-4o",
        # Force valid JSON output so json.loads below cannot fail on prose
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "Extract user preferences and facts. Return JSON: {\"preferences\": {}, \"facts\": []}"},
            {"role": "user", "content": conversation},
        ],
    )
    new = json.loads(response.choices[0].message.content)
    memories["preferences"].update(new.get("preferences", {}))  # latest value wins
    memories["facts"].extend(new.get("facts", []))
    save_memories(memories)

def chat_with_memory(message: str) -> str:
    # Inject all stored memories into the system prompt. A large store
    # would need relevance-based retrieval rather than a full dump.
    memories = load_memories()
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"User context: {json.dumps(memories)}"},
            {"role": "user", "content": message},
        ],
    ).choices[0].message.content
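The store above only ever grows, so implementation step 5 (memory maintenance) still needs a pass of its own. A minimal sketch that removes exact duplicates while preserving order (`prune_memories` is an illustrative helper, not part of the code above; preferences already self-correct because `dict.update` keeps the latest value per key):

```python
def prune_memories(memories: dict) -> dict:
    """Drop case-insensitive duplicate facts, keeping first occurrence order."""
    seen: set[str] = set()
    deduped = []
    for fact in memories.get("facts", []):
        key = fact.strip().lower()
        if key not in seen:
            seen.add(key)
            deduped.append(fact)
    return {"preferences": memories.get("preferences", {}), "facts": deduped}

m = {
    "preferences": {"language": "TypeScript"},
    "facts": ["Uses React", "uses react", "Project is Atlas"],
}
print(prune_memories(m)["facts"])
```

Detecting contradictions (as opposed to duplicates) is harder and is often delegated back to the LLM, which is asked to reconcile the fact list periodically.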
