AI Product Engineer: Day 4

Day 4: Memory for Agents – Teaching Agents to Remember

Welcome to Lesson #4!

Today, you will be learning about memory, a critical part of building effective agents. Here’s what we will cover:

  • Why memory is important

  • The difference between short-term and long-term memory

  • When and how to build each

  • 3 memory types

  • How memory works in production

If you missed the previous lessons, find them here:


👉 Day 1: Intro to Agents
👉 Day 2: Agents vs Workflows
👉 Day 3: RAG and Tool Use

If you want to break into AI, build your own AI tools, land a job offer, or even start a new AI business, then check out the next AI Engineer cohort in June 2025.

🚀 See students’ achievements: (Ekta), (Tamil), (Autumn)

I miss the bootcamp — it’s left a lasting impact on me. It’s motivated me to keep refining and fine-tuning my product, even after graduation. And made me a resilient person.

Quote from our graduate.

Short-Term vs Long-Term Memory

Every agentic system needs access to both short-term and long-term memory. Yesterday, you learned how RAG and tools let agents act. Today, you’ll explore how agents remember.

Let’s start with short-term memory.

Short-term memory (STM) in AI agents provides a temporary workspace to process current tasks. It has limited capacity, and the information quickly fades unless actively maintained or transferred to long-term memory.

When is it used?

  • AI chatbots: Remembering the last few user messages to maintain context in a chat interaction.

  • Gaming: Tracking a player’s immediate actions to respond dynamically.

  • Task Automation: Managing the current step in a multi-step process or workflow.

What are the limitations?

  • Limited capacity: STM can only store a small number of items

  • Short duration: Information is quickly overwritten or discarded after the task or interaction ends

  • Context loss: If the memory window is too small, the agent may miss important information

  • Prioritization: Requires decision-making on which information to retain or discard

What are the technical implementations?

  • Context windows: Maintaining a rolling buffer of recent interactions or data points (see the sketch after this list)

  • Session variables: Storing per-session data for the duration of an interaction

  • Specialized libraries: Tools that can provide structured STM for agents
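
To make the context-window idea concrete, here is a minimal Python sketch of a rolling buffer. The `ShortTermMemory` class, its `max_turns` limit, and the message format are illustrative assumptions, not the API of any particular framework:

```python
from collections import deque

class ShortTermMemory:
    """Rolling buffer that keeps only the most recent conversation turns."""

    def __init__(self, max_turns: int = 10):
        # Hypothetical limit: older turns are dropped automatically once the
        # buffer is full, mimicking how STM fades unless promoted to LTM.
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def as_prompt_messages(self) -> list[dict]:
        # Chronological order, ready to prepend to an LLM call.
        return list(self.turns)

stm = ShortTermMemory(max_turns=3)
stm.add("user", "Hi, I need an invoice.")
stm.add("assistant", "Sure - for which order?")
stm.add("user", "Order #1042.")
stm.add("user", "Actually, make it order #1043.")  # evicts the oldest turn
print(stm.as_prompt_messages())
```

Note how the oldest message silently falls out once the buffer is full: that is the “context loss” limitation from the list above, made visible.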

Long-Term Memory


Long-term memory (LTM) in AI agents enables the storage and retrieval of information across multiple sessions. This allows agents to learn from past experiences, personalize interactions, and improve performance over time.

When is it used?

  • Personalized assistants: Remembering user preferences, past interactions, and historical data for tailored responses.

  • Recommendation systems: Storing user behavior and choices to refine future suggestions.

  • Case-based reasoning: Recalling specific past events or solutions to inform new decisions.

  • AI self-evolution: Accumulating knowledge and adapting behavior over time based on ongoing interactions.

What are common challenges?

  • Scalability: Efficiently storing and retrieving large volumes of historical data is complex.

  • Relevance: Ensuring the agent retrieves the most pertinent information for the current context.

  • Data consistency: Managing updates and changes to long-term knowledge without conflicts.

  • Privacy and security: Protecting stored personal data and complying with retention and consent requirements.

What are the technical implementations?

  • Databases and vector stores (e.g., SQL, NoSQL, or vector embeddings) – see the sketch after this list

  • Knowledge graphs

  • RAG systems
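
As a rough sketch of how databases and vector stores back long-term memory, here is a toy Python version. The bag-of-words `embed` function is a deliberately naive stand-in for a real embedding model, and all class and method names are assumptions for illustration:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. In production you would call a real
    # embedding model and store the vectors in a vector database.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class LongTermMemory:
    """Stores memories across sessions and recalls the most relevant ones."""

    def __init__(self):
        self.records: list[tuple[Counter, str]] = []

    def remember(self, text: str) -> None:
        self.records.append((embed(text), text))

    def recall(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self.records, key=lambda r: cosine(q, r[0]), reverse=True)
        return [text for _, text in ranked[:k]]

ltm = LongTermMemory()
ltm.remember("User prefers dark mode.")
ltm.remember("User booked a flight to Lisbon in May.")
ltm.remember("User's favorite language is Python.")
print(ltm.recall("Does the user prefer dark mode?"))
```

The shape scales up by swapping `embed` for a real embedding model and `records` for a vector database; the recall interface stays the same.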

3 Memory Types: facts, experiences, and skills

Another way to look at agentic memory is through the lens of psychology and the CoALA research paper, which divides memory into three complementary stores. Keep these questions in mind:

  • What is true? → Semantic

  • What happened? → Episodic

  • How do I do it? → Procedural

Decision procedure diagram from CoALA paper (Sumers, Yao, Narasimhan, Griffiths 2024)

Here is a detailed breakdown of the 3 types of memory:

| Memory Type | What It Stores | Purpose/Use Cases | Example in AI Agents | Typical Implementation |
| --- | --- | --- | --- | --- |
| Semantic | General facts, concepts, rules, and definitions | Answering factual questions, reasoning, providing explanations | “What is Python?” → “A programming language.” | Knowledge bases, databases, knowledge graphs |
| Episodic | Specific events, user interactions, experiences | Personalization, continuity, recalling past sessions or actions | “Last time you asked about travel plans.” | Conversation logs, event databases, RAG |
| Procedural | How-to knowledge, learned skills, action sequences | Executing tasks, automating processes, following learned routines | Filling out a form, booking a meeting | Workflow engines, scripts, process models |

Semantic memory – the agent’s encyclopedia

Holds declarative facts: names, prices, and relationships. Picture a simple sentence like “Meri likes dark mode.” (P.S. I do prefer dark mode.) The agent keeps thousands of those tiny sentences so it can quickly answer general questions.

Episodic memory – the agent’s diary

The agent keeps a quick-changing timeline of what just happened: “User clicked checkout → payment failed”. By looking back through these notes, the agent can spot repeating trouble (like PayPal payments failing only in Internet Explorer 11) and instantly surface the relevant error log.

Procedural memory – the agent’s muscle memory

After lots of practice, the agent keeps a “when I see this, I do that” playbook, much like muscle memory for riding a bike. Some tasks are hard-coded as simple rules: “If the user asks for an invoice, run the generate-invoice step.”
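
Putting the three stores side by side, here is a hedged Python sketch of one agent holding facts (semantic), a timestamped diary (episodic), and trigger-to-action rules (procedural). Every name here is hypothetical:

```python
from datetime import datetime, timezone

class AgentMemory:
    """Illustrative three-store memory: facts, experiences, and skills."""

    def __init__(self):
        self.semantic = {}    # what is true: fact name -> value
        self.episodic = []    # what happened: (timestamp, event) entries
        self.procedural = {}  # how to do it: trigger phrase -> action

    def log_event(self, event: str) -> None:
        self.episodic.append((datetime.now(timezone.utc), event))

    def handle(self, user_request: str) -> str:
        self.log_event(f"user said: {user_request}")
        # Consult the playbook: run the first routine whose trigger matches.
        for trigger, action in self.procedural.items():
            if trigger in user_request.lower():
                return action()
        return "No learned routine matches this request."

memory = AgentMemory()
memory.semantic["preferred_theme"] = "dark mode"  # a semantic fact
memory.procedural["invoice"] = lambda: "Running the generate-invoice step..."
print(memory.handle("Please send me an invoice"))
print(memory.episodic)  # the agent's diary of what just happened
```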

Examples of how memory works in production

  • Semantic memory: A support chatbot uses a vector database of product manuals and a knowledge graph to fetch accurate specs and troubleshooting steps on demand.

  • Episodic memory: A virtual assistant pulls recent chat transcripts and event logs from session archives to resume a user’s multi-step booking workflow without losing context.

  • Procedural memory: An order-processing pipeline relies on stored prompt templates and DAG-based workflow definitions to generate invoices, send confirmations, and update inventory automatically.

Putting Memory in Production

Storing memories is only half the job; retrieving and grounding them in an LLM prompt closes the loop.

Here is what you can use in production:

  • Knowledge graphs: Graph databases capture explicit entity relationships and enable precise retrieval for semantic grounding in prompts.

  • Relational & NoSQL databases: SQL tables and NoSQL JSON stores form a structured, GDPR-friendly source of truth for user profiles, event logs, and metadata.

  • Vector databases: Index high-dimensional embeddings to power millisecond semantic search and associative long-term memory across large document collections (a costlier option).

  • Retrieval-Augmented Generation (RAG): as you learned yesterday, RAG fetches the top-k relevant snippets from memory stores and injects them into prompts to produce grounded, up-to-date model outputs (see the sketch below).
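
To close the loop described above, here is a minimal, self-contained sketch of retrieve-then-ground. `call_llm` is a stand-in for a real model client, and the keyword-overlap `retrieve` function is a toy substitute for vector search; both names are assumptions:

```python
import re

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call (swap in your LLM client of choice).
    return f"[model answer, grounded in the prompt below]\n{prompt}"

def tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, memories: list[str], k: int = 2) -> list[str]:
    # Naive keyword overlap standing in for embedding-based top-k search.
    q = tokens(query)
    return sorted(memories, key=lambda m: len(q & tokens(m)), reverse=True)[:k]

memories = [
    "User booked a flight to Lisbon in May.",
    "User prefers dark mode.",
    "User's last payment failed on checkout.",
]

question = "Which city is the user flying to?"
context = "\n".join(f"- {s}" for s in retrieve(question, memories))
prompt = f"Use these memories to answer.\nMemories:\n{context}\n\nQuestion: {question}"
print(call_llm(prompt))
```

Retrieval picks the snippets, the prompt grounds the model in them, and the answer stays tied to what the agent actually remembers.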

Congratulations on completing Day 4!

You’ve just leveled up your agents with advanced memory management approaches.

Tomorrow, we’ll touch upon the most challenging part of building agentic systems: evals!

If you want to build real-world AI solutions with a hungry and curious group of engineers, join our upcoming AI Product Engineering Bootcamp in June 2025!

See projects from our graduates: (Kelsey), (Tamil), (Autumn)