Field Notes · 001 / Agent Architecture

May 2026 · 8 min read · Toronto, CA

How agents
actually remember.

The infographic making the rounds on LinkedIn, which shows short-term, long-term, episodic, and semantic as four peer categories, is taxonomically wrong. The canonical structure is one working memory and three long-term subtypes: semantic, episodic, and procedural. And the third, procedural memory, is almost always missing.

Harnoor Singh

Sr. Cloud & AI Engineer · Symcor

001 / TAXONOMY

The taxonomy is a tree, not a row.

Memory in an agent isn't four parallel boxes. It's a hierarchy: one short-term store that lives in the context window, and three long-term subtypes that persist across sessions. Drawing them as peers is the original sin of agent-memory diagrams.
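The hierarchy can be written down as a small data structure. The sketch below is only an illustration of the 1+3 shape; the `MemoryType` class and its field names are hypothetical and not taken from any library.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryType:
    """One node in the memory taxonomy tree (hypothetical illustration)."""
    name: str
    scope: str  # how long the contents live
    children: list["MemoryType"] = field(default_factory=list)


# Working memory and long-term memory are siblings; semantic, episodic,
# and procedural are children of long-term memory, not peers of working.
TAXONOMY = MemoryType("memory", "n/a", [
    MemoryType("working", "one thread"),
    MemoryType("long-term", "across sessions", [
        MemoryType("semantic", "across sessions"),
        MemoryType("episodic", "across sessions"),
        MemoryType("procedural", "across sessions"),
    ]),
])


def leaves(node: MemoryType) -> list[str]:
    """Return the names of the concrete memory types (the tree's leaves)."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def depth_of(node: MemoryType, target: str, depth: int = 0) -> int:
    """Return the depth of the named node, or -1 if it is not in the tree."""
    if node.name == target:
        return depth
    for child in node.children:
        found = depth_of(child, target, depth + 1)
        if found != -1:
            return found
    return -1
```

Flattening the tree with `leaves(TAXONOMY)` yields the four boxes the infographic draws, but `depth_of` shows they do not sit at the same level: working memory is one step from the root, the three long-term subtypes are two.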

002 / SYSTEM CONTEXT

Memory is one slice of the agent loop.

Memory diagrams in isolation make memory look like the whole system. It isn't. In production, memory is read and written inside a continuous loop alongside perception, reasoning, planning, tool use, and reflection. The four memory types describe what is stored. The loop describes when it's read and written.

Working memory is loaded at every turn. Long-term memory is queried selectively — semantic for facts, episodic for past attempts, procedural for "how we do things here." Writes happen on the way out, often through a reflection or critique step.
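The read and write timing described above can be sketched as a single turn of the loop. This is a minimal sketch under stated assumptions: `LongTermMemory`, `reflect`, and `run_turn` are hypothetical names, keyword overlap stands in for real retrieval, and a formatted string stands in for the LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class LongTermMemory:
    semantic: dict[str, str] = field(default_factory=dict)  # stable facts
    episodic: list[dict] = field(default_factory=list)      # past attempts
    procedural: list[str] = field(default_factory=list)     # "how we do things here"


def reflect(task: str, response: str) -> dict:
    """Hypothetical critique step: decide what is worth remembering."""
    return {"task": task, "outcome": "ok" if response else "failed"}


def run_turn(user_input: str, working: list[str], ltm: LongTermMemory) -> str:
    # 1. Working memory is loaded on every turn, unconditionally.
    working.append(f"user: {user_input}")
    context = list(working)

    # 2. Long-term memory is read selectively. Keyword overlap is a
    #    stand-in for vector or graph retrieval.
    words = set(user_input.lower().split())
    facts = [v for k, v in ltm.semantic.items() if k in words]
    past = [e for e in ltm.episodic if words & set(e["task"].lower().split())]
    context += [f"fact: {f}" for f in facts]
    context += [f"past: {e['task']} -> {e['outcome']}" for e in past]
    context += [f"rule: {r}" for r in ltm.procedural]

    # 3. Stand-in for LLM reasoning over the assembled context.
    response = f"answered with {len(facts)} fact(s), {len(past)} episode(s)"
    working.append(f"agent: {response}")

    # 4. Writes happen on the way out, after the reflection step.
    ltm.episodic.append(reflect(user_input, response))
    return response
```

Asking the same question twice shows the asymmetry: the first turn finds no matching episode, while the second retrieves the one written at the end of the first.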

REACT: Thought → Action → Observation, iteratively. The dominant single-agent pattern.
PLAN-AND-EXECUTE: Generate a full plan, then execute steps. For long-horizon tasks where premature action is costly.
REFLEXION: Self-critique after each attempt, write the lesson to memory. The procedural-memory writer.
MCP: Model Context Protocol — the connector standard for tools and resources in 2026.

003 / WM

Working Memory.


Diagram: 01 input: User Input → 02 state: Graph State (LangGraph) → 03 context: Context Window + Message Buffer → 04 reason: LLM Inference + Tool Calls → 05 ckpt: Checkpointer (PostgresSaver) → 06 output: Response

Thread-scoped state. Lives in the context window and a checkpointer, not a memory database. Nothing carries over to the next thread.

Working memory is what the agent currently has in mind. In LangGraph it's literally the graph state (message history, intermediate tool results, scratchpad reasoning), all of which fit in the context window for the duration of one thread. A checkpointer (PostgresSaver, RedisSaver, SqliteSaver) persists this state so an interrupted run can resume, but the scope is still one thread, one task.

A common error is conflating the checkpointer with long-term memory. Checkpointers handle reliability and time-travel debugging. Long-term memory persists knowledge across threads and users. Both are needed in production. Neither replaces the other.
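The difference in scope is easy to see in a toy model. The sketch below deliberately does not use LangGraph's API; `ThreadCheckpointer` and `LongTermStore` are hypothetical in-memory stand-ins that only illustrate what each layer is keyed by.

```python
import copy
from typing import Optional


class ThreadCheckpointer:
    """Toy stand-in for a checkpointer: snapshots of state, keyed by thread."""

    def __init__(self) -> None:
        self._snapshots: dict[str, list[dict]] = {}

    def put(self, thread_id: str, state: dict) -> None:
        # Every save appends a snapshot, which is what makes resuming and
        # stepping back through a run possible.
        self._snapshots.setdefault(thread_id, []).append(copy.deepcopy(state))

    def get(self, thread_id: str, step: int = -1) -> Optional[dict]:
        history = self._snapshots.get(thread_id)
        return copy.deepcopy(history[step]) if history else None


class LongTermStore:
    """Toy stand-in for long-term memory: knowledge keyed by user, not thread."""

    def __init__(self) -> None:
        self._items: dict[tuple[str, str], str] = {}

    def put(self, user_id: str, key: str, value: str) -> None:
        self._items[(user_id, key)] = value

    def get(self, user_id: str, key: str) -> Optional[str]:
        return self._items.get((user_id, key))


checkpointer = ThreadCheckpointer()
store = LongTermStore()

# Thread A runs two steps and learns something about the user.
checkpointer.put("thread-a", {"messages": ["hi"]})
checkpointer.put("thread-a", {"messages": ["hi", "hello"]})
store.put("user-1", "framework", "LangGraph")

# A new thread for the same user starts with no checkpoint,
# but the long-term fact is still there.
fresh_state = checkpointer.get("thread-b")
remembered = store.get("user-1", "framework")
```

Here `fresh_state` is `None` while `remembered` is `"LangGraph"`: the checkpointer answers "where was this run?", the store answers "what do we know about this user?".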

Implementation: Context window · BaseMessage[] · Redis · Postgres
Best for: Live conversation · multi-step task state · resumability

004 / LTM · SEMANTIC

Semantic Memory.


Diagram: 01 query: User Query → 02 embed: Embed + Hybrid Search → 03 store: Vector DB + Knowledge Graph → 04 retrieve: Top-k Facts Retrieved → 05 ground: Context Grounding → 06 reason: LLM Reasoning → 07 respond: Grounded Response → 08 output: User Output

Facts, definitions, and relationships. What things are, independent of when they were learned.

Semantic memory is the agent's model of how the world works. User preferences ("Harnoor uses LangGraph"), domain facts ("Toronto is in Ontario"), business rules ("orders over $10k require approval") — anything that's a stable assertion about reality lives here. It's queried by similarity (vector search) or by relationship (graph traversal), often both.

The 2026 production trend is hybrid retrieval: dense embeddings plus BM25 keyword matching plus entity matching, all fused into one score. Pure vector similarity is no longer the default, and graph memory has matured into part of the production stack.
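One common way to fuse several retrieval signals into a single score is reciprocal rank fusion (RRF). The sketch below assumes each signal has already produced its own ranked list of document ids. The ids and the query are made up for illustration, and RRF is shown as one reasonable choice, not as what any particular product uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document ids into one scored list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Documents ranked well by several signals
    rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-signal results for "approval rule for large orders".
dense_hits = ["rule_approval", "faq_orders", "pref_langgraph"]  # embeddings
bm25_hits = ["faq_orders", "rule_approval"]                     # keywords
entity_hits = ["rule_approval"]                                 # entity match

fused = reciprocal_rank_fusion([dense_hits, bm25_hits, entity_hits])
```

Because RRF works on ranks rather than raw scores, the three signals do not need to be on a comparable scale, which is the main reason to prefer it over a weighted sum when the scorers are heterogeneous.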

Implementation: Vector DB · knowledge graph · Mem0 · LangMem
Best for: Personalization · factual grounding · policy lookup

005 / LTM · EPISODIC

Episodic Memory.


Diagram: 01 event: Interaction / Episode → 02 capture: Context · Time · Action · Outcome → 03 store: Temporal KG / Episode Store → 04 query: Similarity + Temporal Query → 05 recall: Similar Episode Retrieved → 06 adapt: Few-Shot Adaptation → 07 decide: Decision (case-based) → 08 output: Adapted Response

Past interactions, situated in time. What happened, when, what worked.

Episodic memory stores experience. Each episode is a record — context, timestamp, action taken, outcome — and the agent retrieves similar past episodes to inform current decisions. This is case-based reasoning: instead of reasoning from first principles every time, the agent asks "have I been here before, and what worked?"

Implementation typically combines vector similarity (find episodes like this one) with temporal filters (recent episodes weighted higher). Zep's Graphiti and Letta's recall memory are the current production references; LangMem stores episodes as few-shot examples that get injected into prompts at decision points.
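A minimal sketch of that combination, assuming episodes carry a precomputed embedding of their context: rank by cosine similarity multiplied by an exponential recency weight. The `Episode` fields mirror the record described above. The half-life weighting, the two-dimensional toy vectors, and the function names are assumptions for illustration, not any library's scoring rule.

```python
import math
from dataclasses import dataclass

DAY = 86_400.0  # seconds


@dataclass
class Episode:
    context: list[float]  # embedding of the situation (toy 2-d vectors here)
    timestamp: float      # seconds since epoch
    action: str
    outcome: str


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(
    episodes: list[Episode],
    query: list[float],
    now: float,
    half_life_s: float = 7 * DAY,
    top_k: int = 1,
) -> list[Episode]:
    """Rank episodes by similarity to the query, discounted by age."""

    def score(ep: Episode) -> float:
        age = max(0.0, now - ep.timestamp)
        recency = 0.5 ** (age / half_life_s)  # halves every half_life_s
        return cosine(ep.context, query) * recency

    return sorted(episodes, key=score, reverse=True)[:top_k]


now = 1_700_000_000.0
episodes = [
    Episode([1.0, 0.0], now - 14 * DAY, "retry with backoff", "success"),
    Episode([1.0, 0.0], now - 1 * DAY, "escalate to human", "success"),
    Episode([0.0, 1.0], now, "cache result", "success"),
]
best = recall(episodes, query=[1.0, 0.0], now=now, top_k=2)
```

The two episodes that match the query outrank the unrelated one even though it is the newest, and between the two matches the more recent one comes first.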

Implementation: Zep · Graphiti · Letta recall · time-indexed DB
Best for: Adaptive behavior · learning from failure · few-shot routing

006 / LTM · PROCEDURAL ← OFTEN OMITTED

Procedural Memory.


Diagram: 01 run: Agent Run → 02 execute: Tool Use + Output → 03 evaluate: Reflexion / Eval Signal → 04 store: Procedural Memory Store → 05 optimize: Prompt Optimizer → 06 prompt: Updated System Prompt → 07 apply: Next Run (improved) → 08 loop: Continuous Improvement

How the agent operates — rules, workflows, and self-updating system prompts. The memory almost every infographic forgets.

Procedural memory is the agent's skill set — its workflows, decision policies, and the system prompts that shape its behavior. Unlike semantic memory (facts about the world) or episodic memory (events that happened), procedural memory is how the agent does things: routing rules, error-recovery heuristics, tone, format, escalation policies.

The 2026 production pattern, codified by LangMem, treats system prompts as procedural memory and lets the agent rewrite them based on feedback. A Reflexion step evaluates the run, a prompt optimizer rewrites the instruction, and the next iteration runs with the improved prompt. This is how agents get measurably better without retraining the underlying model.
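The evaluate, optimize, and re-run cycle can be sketched without any model in the loop. Everything below is a hypothetical stand-in: `fake_agent` replaces the LLM call, `evaluate` is a rule-based critique, and `optimize` simply appends the lesson to the prompt. A real prompt optimizer would typically use a model to rewrite the instruction, and this is not LangMem's API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProceduralMemory:
    system_prompt: str
    lessons: list[str] = field(default_factory=list)


def evaluate(output: str) -> Optional[str]:
    """Toy critique step: return a lesson, or None if the run passed."""
    if len(output.split()) > 20:
        return "Keep answers under 20 words."
    return None


def optimize(memory: ProceduralMemory, lesson: str) -> None:
    """Toy prompt optimizer: add each new lesson to the system prompt once."""
    if lesson not in memory.lessons:
        memory.lessons.append(lesson)
        memory.system_prompt += f"\n- {lesson}"


def fake_agent(system_prompt: str, task: str) -> str:
    """Stand-in for an LLM call: verbose unless the prompt asks for brevity."""
    if "under 20 words" in system_prompt:
        return f"Done: {task}."
    return " ".join(["very"] * 25) + f" long answer about {task}"


def run_with_learning(memory: ProceduralMemory, task: str) -> str:
    output = fake_agent(memory.system_prompt, task)
    lesson = evaluate(output)
    if lesson:
        optimize(memory, lesson)  # the write to procedural memory
    return output


memory = ProceduralMemory("You are a support agent.")
first = run_with_learning(memory, "reset password")
second = run_with_learning(memory, "reset password")
```

The first run is verbose and triggers a lesson; the second run reads the updated system prompt and returns the short answer. The model never changed, only the stored instruction did.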

Implementation: LangMem procedural · self-updating system prompts · Letta core blocks
Best for: Continuous improvement · learned routing · agent persona

007 / PRODUCTION

The systems that actually implement this.

Six systems cover the production memory landscape in 2026. The right choice depends on whether you own the agent loop, whether memory is load-bearing, and how much control you need over the schema.

LangMem

SDK · LANGGRAPH-NATIVE

The canonical implementation of the 1+3 taxonomy. Sits on LangGraph's BaseStore and exposes semantic, episodic, and procedural memory through a unified API. My default for teams already on LangGraph.

Covers: Semantic · Episodic · Procedural

Mem0

CRUD MEMORY LAYER

The most-integrated memory layer in 2026 — covers 21+ frameworks across Python and TypeScript. Framework-agnostic: add it to any agent loop. Multi-signal retrieval (semantic + BM25 + entity matching). The pragmatic default when "remember the user" is the feature.

Covers: Semantic · Episodic

Letta (formerly MemGPT)

SELF-MANAGING RUNTIME

OS-inspired memory: core memory in the context window (RAM), recall memory as searchable history (disk cache), archival memory as long-term tool-queryable store (cold storage). The agent self-edits its memory blocks via tool calls. Pick this when memory is the product.

Covers: Working · Semantic · Episodic

Zep / Graphiti

TEMPORAL KNOWLEDGE GRAPH

Episodic memory as a temporal knowledge graph. Every fact carries a timestamp, so the agent knows not just what was true but when it was true — essential when facts change (job titles, prices, policies). Best when the ordering of events matters to reasoning.

Covers: Episodic · Semantic
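To make "when it was true" concrete, here is a minimal sketch of facts with validity intervals. It is not Graphiti's schema or API. The `TemporalFact` and `TemporalFacts` names are hypothetical, timestamps are plain numbers, and the sketch assumes facts are asserted in chronological order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemporalFact:
    subject: str
    predicate: str
    value: str
    valid_from: float                 # when the fact became true
    valid_to: Optional[float] = None  # None means "still true"


class TemporalFacts:
    """Toy store that keeps superseded facts instead of overwriting them."""

    def __init__(self) -> None:
        self._facts: list[TemporalFact] = []

    def assert_fact(self, subject: str, predicate: str, value: str, at: float) -> None:
        # Close the interval of the currently valid fact, then add the new one.
        for fact in self._facts:
            same_key = (fact.subject, fact.predicate) == (subject, predicate)
            if same_key and fact.valid_to is None:
                fact.valid_to = at
        self._facts.append(TemporalFact(subject, predicate, value, at))

    def value_at(self, subject: str, predicate: str, at: float) -> Optional[str]:
        """Return the value that was valid at time `at`, if any."""
        for fact in self._facts:
            if (fact.subject, fact.predicate) != (subject, predicate):
                continue
            if fact.valid_from <= at and (fact.valid_to is None or at < fact.valid_to):
                return fact.value
        return None


kg = TemporalFacts()
kg.assert_fact("alice", "job_title", "Engineer", at=100.0)
kg.assert_fact("alice", "job_title", "Manager", at=200.0)
```

Querying `kg.value_at("alice", "job_title", 150.0)` returns the earlier title and querying at 250.0 returns the later one, which a store that overwrites facts in place cannot do.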

LangGraph Checkpointers

WORKING MEMORY · PERSISTENCE

PostgresSaver, RedisSaver, SqliteSaver — these persist graph state for resumability and time-travel debugging. Thread-scoped, not user-scoped. Do not confuse with long-term memory.

Covers: Working

Provider Memory

CHATGPT · CLAUDE PROJECTS

Provider-managed memory built into the model platform. Zero effort, zero control — opaque schema, no programmatic query, no migration path. The right choice when the provider is your platform; the wrong choice the moment you need to audit, scope, or move it.

Covers: Semantic · Episodic

One working memory.
Three long-term subtypes.
No more, no less.


// References & sources

Sumers et al. — Cognitive Architectures for Language Agents (CoALA) · arXiv:2309.02427 · 2023
LangChain · LangMem SDK documentation — semantic, episodic, procedural memory
Mem0 · State of AI Agent Memory 2026 · benchmarks & multi-signal retrieval
Letta · Benchmarking AI Agent Memory: Is a Filesystem All You Need?
IBM · MongoDB · Redis — production memory architecture guidance, 2026
Atlan · Long-Term Memory in LangChain Agents · framework comparison
