AI Product Engineer: Day 4

Day 4: Memory for Agents – Teaching Agents to Remember

Welcome to Lesson #4!

Today, you will be learning about memory, a critical part of building effective agents. Here’s what we will cover:

  • Why memory is important

  • The difference between short-term and long-term memory

  • When and how to build each

  • 3 memory types

  • How memory works in production

If you missed the previous lessons, find them here:


👉 Day 1: Intro to Agents
👉 Day 2: Agents vs Workflows
👉 Day 3: RAG and Tool Use

If you want to break into AI, build your own AI tools, land a job offer, or even start a new AI business, then check out the next AI Engineer cohort in June 2025.

🚀 See students’ achievements: (Ekta), (Tamil), (Autumn)

I miss the bootcamp — it’s left a lasting impact on me. It’s motivated me to keep refining and fine-tuning my product, even after graduation. And made me a resilient person.

Quote from our graduate.

Short-Term vs Long-Term Memory

Every agentic system needs access to both short-term and long-term memory. Yesterday, you learned how RAG and tools let agents act. Today, you’ll explore how agents remember.

Let’s start with short-term memory.

Short-term memory (STM) in AI agents provides a temporary workspace to process current tasks. It has limited capacity, and the information quickly fades unless actively maintained or transferred to long-term memory.

When is it used?

  • AI chatbots: Remembering the last few user messages to maintain context in a chat interaction.

  • Gaming: Tracking a player’s immediate actions to respond dynamically.

  • Task Automation: Managing the current step in a multi-step process or workflow.

What are the limitations?

  • Limited capacity: STM can only store a small number of items

  • Short duration: Information is quickly overwritten or discarded after the task or interaction ends

  • Context loss: If the memory window is too small, the agent may miss important information

  • Prioritization: Requires decision-making on which information to retain or discard

What are the technical implementations?

  • Context windows: Maintaining a rolling buffer of recent interactions or data points (see the sketch after this list)

  • Session variables: Storing per-session data for the duration of an interaction

  • Specialized libraries: Tools that can provide structured STM for agents
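
To make the context-window idea concrete, here is a minimal Python sketch of a rolling buffer. The `ShortTermMemory` class, its `max_turns` limit, and the message format are illustrative assumptions, not the API of any particular framework:

```python
from collections import deque

class ShortTermMemory:
    """Rolling buffer that keeps only the most recent conversation turns."""

    def __init__(self, max_turns: int = 10):
        # Hypothetical limit: older turns are dropped automatically once the
        # buffer is full, mimicking how STM fades unless promoted to LTM.
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def as_prompt_messages(self) -> list[dict]:
        # Chronological order, ready to prepend to an LLM call.
        return list(self.turns)

stm = ShortTermMemory(max_turns=3)
stm.add("user", "Hi, I need an invoice.")
stm.add("assistant", "Sure - for which order?")
stm.add("user", "Order #1042.")
stm.add("user", "Actually, make it order #1043.")  # evicts the oldest turn
print(stm.as_prompt_messages())
```

Note how the oldest message silently falls out once the buffer is full: that is the “context loss” limitation from the list above, made visible.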

Long-Term Memory


Long-term memory (LTM) in AI agents enables the storage and retrieval of information across multiple sessions. This allows agents to learn from past experiences, personalize interactions, and improve performance over time.

When is it used?

  • Personalized assistants: Remembering user preferences, past interactions, and historical data for tailored responses.

  • Recommendation systems: Storing user behavior and choices to refine future suggestions.

  • Case-based reasoning: Recalling specific past events or solutions to inform new decisions.

  • AI self-evolution: Accumulating knowledge and adapting behavior over time based on ongoing interactions.

What are common challenges?

  • Scalability: Efficiently storing and retrieving large volumes of historical data is complex.

  • Relevance: Ensuring the agent retrieves the most pertinent information for the current context.

  • Data consistency: Managing updates and changes to long-term knowledge without conflicts.

  • Privacy and security: Protecting stored personal data and complying with retention and consent requirements.

What are the technical implementations?

  • Databases and vector stores (e.g., SQL, NoSQL, or vector embeddings) – see the sketch after this list

  • Knowledge graphs

  • RAG systems
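
As a rough sketch of how databases and vector stores back long-term memory, here is a toy Python version. The bag-of-words `embed` function is a deliberately naive stand-in for a real embedding model, and all class and method names are assumptions for illustration:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. In production you would call a real
    # embedding model and store the vectors in a vector database.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class LongTermMemory:
    """Stores memories across sessions and recalls the most relevant ones."""

    def __init__(self):
        self.records: list[tuple[Counter, str]] = []

    def remember(self, text: str) -> None:
        self.records.append((embed(text), text))

    def recall(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self.records, key=lambda r: cosine(q, r[0]), reverse=True)
        return [text for _, text in ranked[:k]]

ltm = LongTermMemory()
ltm.remember("User prefers dark mode.")
ltm.remember("User booked a flight to Lisbon in May.")
ltm.remember("User's favorite language is Python.")
print(ltm.recall("Does the user prefer dark mode?"))
```

The shape scales up by swapping `embed` for a real embedding model and `records` for a vector database; the recall interface stays the same.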

3 Memory Types: facts, experiences, and skills

Another way to look at agentic memory is through the lens of psychology and the CoALA research paper, which divides memory into three complementary stores. Keep these questions in mind:

  • What is true? → Semantic

  • What happened? → Episodic

  • How do I do it? → Procedural

Decision procedure diagram from CoALA paper (Sumers, Yao, Narasimhan, Griffiths 2024)

Here is a detailed breakdown of the 3 types of memory:

| Memory Type | What It Stores | Purpose/Use Cases | Example in AI Agents | Typical Implementation |
| --- | --- | --- | --- | --- |
| Semantic | General facts, concepts, rules, and definitions | Answering factual questions, reasoning, providing explanations | “What is Python?” → “A programming language.” | Knowledge bases, databases, knowledge graphs |
| Episodic | Specific events, user interactions, experiences | Personalization, continuity, recalling past sessions or actions | “Last time you asked about travel plans.” | Conversation logs, event databases, RAG |
| Procedural | How-to knowledge, learned skills, action sequences | Executing tasks, automating processes, following learned routines | Filling out a form, booking a meeting | Workflow engines, scripts, process models |

Semantic memory – the agent’s encyclopedia

Holds declarative facts: names, prices, and relationships. Picture a simple sentence like “Meri likes dark mode.” (P.S. I do prefer dark mode.) The agent keeps thousands of those tiny sentences so it can quickly answer general questions.

Episodic memory – the agent’s diary

The agent keeps a quick-changing timeline of what just happened: “User clicked checkout → payment failed”. By looking back through these notes, the agent can spot repeating trouble (like PayPal payments failing only in Internet Explorer 11) and instantly surface the relevant error log.

Procedural memory – the agent’s muscle memory

After lots of practice, the agent keeps a “when I see this, I do that” playbook, much like muscle memory for riding a bike. Some tasks are hard-coded as simple rules: “If the user asks for an invoice, run the generate-invoice step.”
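
Putting the three stores side by side, here is a hedged Python sketch of one agent holding facts (semantic), a timestamped diary (episodic), and trigger-to-action rules (procedural). Every name here is hypothetical:

```python
from datetime import datetime, timezone

class AgentMemory:
    """Illustrative three-store memory: facts, experiences, and skills."""

    def __init__(self):
        self.semantic = {}    # what is true: fact name -> value
        self.episodic = []    # what happened: (timestamp, event) entries
        self.procedural = {}  # how to do it: trigger phrase -> action

    def log_event(self, event: str) -> None:
        self.episodic.append((datetime.now(timezone.utc), event))

    def handle(self, user_request: str) -> str:
        self.log_event(f"user said: {user_request}")
        # Consult the playbook: run the first routine whose trigger matches.
        for trigger, action in self.procedural.items():
            if trigger in user_request.lower():
                return action()
        return "No learned routine matches this request."

memory = AgentMemory()
memory.semantic["preferred_theme"] = "dark mode"  # a semantic fact
memory.procedural["invoice"] = lambda: "Running the generate-invoice step..."
print(memory.handle("Please send me an invoice"))
print(memory.episodic)  # the agent's diary of what just happened
```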

Examples of how memory works in production

  • Semantic memory: A support chatbot uses a vector database of product manuals and a knowledge graph to fetch accurate specs and troubleshooting steps on demand.

  • Episodic memory: A virtual assistant pulls recent chat transcripts and event logs from session archives to resume a user’s multi-step booking workflow without losing context.

  • Procedural memory: An order-processing pipeline relies on stored prompt templates and DAG-based workflow definitions to generate invoices, send confirmations, and update inventory automatically.

Putting Memory in Production

Storing memories is only half the job; retrieving and grounding them in an LLM prompt closes the loop.

Here is what you can use in production:

  • Knowledge graphs: Graph databases capture explicit entity relationships and enable precise retrieval for semantic grounding in prompts.

  • Relational & NoSQL databases: SQL tables and NoSQL JSON stores form a structured, GDPR-friendly source of truth for user profiles, event logs, and metadata.

  • Vector databases: Index high-dimensional embeddings to power millisecond semantic search and associative long-term memory across large document collections (a costlier option).

  • Retrieval-Augmented Generation (RAG): as you learned yesterday, RAG fetches the top-k relevant snippets from memory stores and injects them into prompts to produce grounded, up-to-date model outputs (see the sketch below).
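
To close the loop described above, here is a minimal, self-contained sketch of retrieve-then-ground. `call_llm` is a stand-in for a real model client, and the keyword-overlap `retrieve` function is a toy substitute for vector search; both names are assumptions:

```python
import re

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call (swap in your LLM client of choice).
    return f"[model answer, grounded in the prompt below]\n{prompt}"

def tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, memories: list[str], k: int = 2) -> list[str]:
    # Naive keyword overlap standing in for embedding-based top-k search.
    q = tokens(query)
    return sorted(memories, key=lambda m: len(q & tokens(m)), reverse=True)[:k]

memories = [
    "User booked a flight to Lisbon in May.",
    "User prefers dark mode.",
    "User's last payment failed on checkout.",
]

question = "Which city is the user flying to?"
context = "\n".join(f"- {s}" for s in retrieve(question, memories))
prompt = f"Use these memories to answer.\nMemories:\n{context}\n\nQuestion: {question}"
print(call_llm(prompt))
```

Retrieval picks the snippets, the prompt grounds the model in them, and the answer stays tied to what the agent actually remembers.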

Congratulations on completing Day 4!

You’ve just leveled up your agents with advanced memory management approaches.

Tomorrow, we’ll touch upon the most challenging part of building agentic systems: evals!

If you want to build real-world AI solutions with a hungry and curious group of engineers, join our upcoming AI Product Engineering Bootcamp in June 2025!

See projects from our graduates: (Kelsey), (Tamil), (Autumn)