Long-Term Memory
Intent
Persist information across sessions so agents can remember user preferences, past interactions, and learned knowledge.
Problem
LLMs are stateless — each conversation starts from scratch. An agent that helped you yesterday has no memory of it today. For personalized assistants, ongoing projects, or any agent that interacts with users repeatedly, this amnesia is a fundamental limitation.
Solution
Store important information extracted from conversations in a durable store (database, vector DB, or key-value store). Before each interaction, retrieve relevant memories and inject them into the context. After each interaction, extract and store new memories. Memory types: User preferences (likes/dislikes), Factual knowledge (learned facts), Interaction history (what happened before), Task context (ongoing project state).
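The four memory types above can be sketched as a single store schema. This is only an illustrative shape (the field names are assumptions, not a standard), but it shows how preferences, facts, history, and task context can live side by side:

```python
from typing import TypedDict

class MemoryStore(TypedDict):
    """Illustrative schema for the four memory types; field names are assumptions."""
    preferences: dict[str, str]   # user preferences (likes/dislikes)
    facts: list[str]              # factual knowledge learned over time
    history: list[str]            # summaries of past interactions
    task_context: dict[str, str]  # state of ongoing projects

def empty_store() -> MemoryStore:
    """Fresh store for a first-time user."""
    return {"preferences": {}, "facts": [], "history": [], "task_context": {}}
```

Keeping the types separate matters later: preferences are overwritten on update, while facts and history are appended to.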
Diagram
Session 1: User says preference → [Extract] → [Store in Memory DB]
↓
Session 2: New conversation → [Retrieve relevant memories] → [Inject into context]
↓
Agent responds with personalized awareness of past sessions
When to Use
- Personal assistants that interact with users repeatedly
- Project management agents that track ongoing work
- Customer service that should remember past interactions
- Any agent that benefits from personalization
When NOT to Use
- Single-shot, anonymous interactions
- When privacy requirements prohibit storing user data
- Short-lived tasks with no need for continuity
Pros & Cons
Pros
- Enables personalization and continuity across sessions
- Agent improves over time as it accumulates knowledge
- Users don't have to repeat context
- Can store both explicit knowledge and learned preferences
Cons
- Memory management is complex (what to store, when to forget)
- Stale or incorrect memories can degrade performance
- Privacy and security concerns with stored data
- Retrieval accuracy affects memory usefulness
Implementation Steps
1. Choose a memory store: vector DB, SQL, key-value, or hybrid
2. Design the memory extraction pipeline: what information gets stored?
3. Implement memory retrieval: given a new conversation, what memories are relevant?
4. Build memory injection into the prompt template
5. Add memory maintenance: updating outdated info, removing contradictions
6. Implement privacy controls: what can be remembered, data retention policies
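Step 3 (retrieval) does not require a vector DB to start. A minimal sketch below scores stored facts by keyword overlap with the incoming message; a production system would swap this for embedding similarity, but the interface stays the same:

```python
def relevant_memories(message: str, facts: list[str], top_k: int = 3) -> list[str]:
    """Rank stored facts by word overlap with the incoming message.
    Stand-in for vector similarity search; overlap scoring is a simplification."""
    query = set(message.lower().split())
    scored = [(len(query & set(fact.lower().split())), fact) for fact in facts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only facts with at least one overlapping word
    return [fact for score, fact in scored[:top_k] if score > 0]

facts = [
    "User prefers TypeScript over JavaScript",
    "User is working on project Atlas",
    "User follows a camelCase naming convention",
]
print(relevant_memories("Which language should this Atlas module use?", facts))
# → ['User is working on project Atlas']
```

Retrieving only relevant memories, rather than injecting the whole store, keeps the prompt small as the store grows.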
Real-World Example
Personalized Coding Assistant
Over weeks, the assistant learns: the user prefers TypeScript over JavaScript, uses React with Zustand for state management, follows a specific naming convention, and is working on project 'Atlas.' Future interactions automatically incorporate these preferences without being asked.
A minimal implementation using a JSON file as the memory store:

from openai import OpenAI
import json
from pathlib import Path

client = OpenAI()
MEMORY_FILE = Path("memory.json")

def load_memories() -> dict:
    """Load stored memories, or start with an empty structure."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {"preferences": {}, "facts": []}

def save_memories(memories: dict):
    MEMORY_FILE.write_text(json.dumps(memories, indent=2))

def extract_and_store(conversation: str):
    """After an interaction: extract new preferences/facts and merge them in."""
    memories = load_memories()
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # guarantees parseable JSON output
        messages=[
            {"role": "system", "content": "Extract user preferences and facts. Return JSON: {\"preferences\": {}, \"facts\": []}"},
            {"role": "user", "content": conversation},
        ],
    )
    new = json.loads(response.choices[0].message.content)
    memories["preferences"].update(new.get("preferences", {}))
    memories["facts"].extend(new.get("facts", []))
    save_memories(memories)

def chat_with_memory(message: str) -> str:
    """Before responding: inject stored memories into the system prompt."""
    memories = load_memories()
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"User context: {json.dumps(memories)}"},
            {"role": "user", "content": message},
        ],
    ).choices[0].message.content
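Memory maintenance (step 5 above) can start as simple deduplication plus a size cap before more sophisticated contradiction handling. A sketch, with an assumed limit of 100 facts:

```python
def prune_memories(memories: dict, max_facts: int = 100) -> dict:
    """Drop case-insensitive duplicate facts and keep only the newest max_facts.
    A stand-in for real maintenance (contradiction detection, expiry policies)."""
    seen, deduped = set(), []
    for fact in memories.get("facts", []):
        key = fact.strip().lower()
        if key not in seen:
            seen.add(key)
            deduped.append(fact)
    memories["facts"] = deduped[-max_facts:]  # keep the most recent entries
    return memories

m = prune_memories({"preferences": {}, "facts": ["Likes TS", "likes ts", "Uses React"]})
print(m["facts"])
# → ['Likes TS', 'Uses React']
```

Running this after each `extract_and_store` call keeps the store bounded, which matters once memories are injected into every prompt.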