How agents
actually remember.
The infographic making the rounds on LinkedIn — short-term, long-term, episodic, semantic as four peer categories — is taxonomically wrong. The canonical structure is one working memory and three long-term subtypes. And one of them is almost always missing.
The taxonomy is a tree, not a row.
Memory in an agent isn't four parallel boxes. It's a hierarchy: one short-term store that lives in the context window, and three long-term subtypes that persist across sessions. Drawing them as peers is the original sin of agent-memory diagrams.
Memory is one slice of the agent loop.
Memory diagrams in isolation make memory look like the whole system. It isn't. In production, memory is read and written inside a continuous loop alongside perception, reasoning, planning, tool use, and reflection. The four memory types describe what is stored. The loop describes when it's read and written.
Working memory is loaded at every turn. Long-term memory is queried selectively — semantic for facts, episodic for past attempts, procedural for "how we do things here." Writes happen on the way out, often through a reflection or critique step.
- REACT
- Thought → Action → Observation, iteratively. The dominant single-agent pattern.
- PLAN-AND-EXECUTE
- Generate a full plan, then execute steps. For long-horizon tasks where premature action is costly.
- REFLEXION
- Self-critique after each attempt, write the lesson to memory. The procedural-memory writer.
- MCP
- Model Context Protocol — the connector standard for tools and resources in 2026.
Working Memory.
(LangGraph)
+ Message Buffer
(PostgresSaver)
+ Tool Calls
Thread-scoped state. Lives in the context window and a checkpointer — not a memory database. Cleared at end of thread.
Working memory is what the agent currently has in mind. In LangGraph it's literally the graph state — message history, intermediate tool results, scratchpad reasoning — all of which fits in the context window for the duration of one thread. A checkpointer (PostgresSaver, RedisSaver, SqliteSaver) persists this state so an interrupted run can resume, but the scope is still one thread, one task.
A common error is conflating the checkpointer with long-term memory. Checkpointers handle reliability and time-travel debugging. Long-term memory persists knowledge across threads and users. Both are needed in production. Neither replaces the other.
- Implementation
- Context window ·
BaseMessage[]· Redis · Postgres - Best for
- Live conversation · multi-step task state · resumability
Semantic Memory.
Hybrid Search
Knowledge Graph
Grounding
Retrieved
Response
Facts, definitions, and relationships. What things are, independent of when they were learned.
Semantic memory is the agent's model of how the world works. User preferences ("Harnoor uses LangGraph"), domain facts ("Toronto is in Ontario"), business rules ("orders over $10k require approval") — anything that's a stable assertion about reality lives here. It's queried by similarity (vector search) or by relationship (graph traversal), often both.
The 2026 production trend is hybrid retrieval: dense embeddings plus BM25 keyword matching plus entity matching, all fused into one score. Pure vector similarity alone is no longer the default — graph memory has matured into the production stack.
- Implementation
- Vector DB · knowledge graph · Mem0 ·
LangMem - Best for
- Personalization · factual grounding · policy lookup
Episodic Memory.
/ Episode
Action · Outcome
/ Episode Store
Adaptation
Retrieved
Temporal Query
(case-based)
Response
Past interactions, situated in time. What happened, when, what worked.
Episodic memory stores experience. Each episode is a record — context, timestamp, action taken, outcome — and the agent retrieves similar past episodes to inform current decisions. This is case-based reasoning: instead of reasoning from first principles every time, the agent asks "have I been here before, and what worked?"
Implementation typically combines vector similarity (find episodes like this one) with temporal filters (recent episodes weighted higher). Zep's Graphiti and Letta's recall memory are the current production references; LangMem stores episodes as few-shot examples that get injected into prompts at decision points.
- Implementation
- Zep · Graphiti · Letta recall · time-indexed DB
- Best for
- Adaptive behavior · learning from failure · few-shot routing
Procedural Memory.
+ Output
Eval Signal
System Prompt
Optimizer
Memory Store
(improved)
Improvement
How the agent operates — rules, workflows, and self-updating system prompts. The memory almost every infographic forgets.
Procedural memory is the agent's skill set — its workflows, decision policies, and the system prompts that shape its behavior. Unlike semantic memory (facts about the world) or episodic memory (events that happened), procedural memory is how the agent does things: routing rules, error-recovery heuristics, tone, format, escalation policies.
The 2026 production pattern, codified by LangMem, treats system prompts as procedural memory and lets the agent rewrite them based on feedback. A Reflexion step evaluates the run, a prompt optimizer rewrites the instruction, and the next iteration runs with the improved prompt. This is how agents get measurably better without retraining the underlying model.
- Implementation
- LangMem procedural · self-updating system prompts · Letta core blocks
- Best for
- Continuous improvement · learned routing · agent persona
The systems that actually implement this.
Six systems cover the production memory landscape in 2026. The right choice depends on whether you own the agent loop, whether memory is load-bearing, and how much control you need over the schema.
BaseStore and exposes semantic, episodic, and procedural memory through a unified API. My default for teams already on LangGraph.
PostgresSaver, RedisSaver, SqliteSaver — these persist graph state for resumability and time-travel debugging. Thread-scoped, not user-scoped. Do not confuse with long-term memory.
One working memory.
Three long-term subtypes.
No more, no less.
- Sumers et al. — Cognitive Architectures for Language Agents (CoALA) · arXiv:2309.02427 · 2023
- LangChain · LangMem SDK documentation — semantic, episodic, procedural memory
- Mem0 · State of AI Agent Memory 2026 · benchmarks & multi-signal retrieval
- Letta · Benchmarking AI Agent Memory: Is a Filesystem All You Need?
- IBM · MongoDB · Redis — production memory architecture guidance, 2026
- Atlan · Long-Term Memory in LangChain Agents · framework comparison