Meri Nova
AI Product Engineer Day 3
Day 3: Solving Search with RAG and Tool Use
Welcome to Lesson #3!
Today, we’re covering RAG and Tool Use: the keys to grounding your agents in real-world data and real-world actions.
Today, you will learn:
What is Tool Use
Misconceptions about RAG
How we teach RAG
Biggest challenges with RAG
RAG as Tool Use
OpenAI Agent SDK Tools
If you missed the previous lessons, find them here:
Day 1 lesson on “Intro to Agents”
Day 2 lesson on “Agents vs Workflows”
If you want to break into AI, build your own AI tools, land a job offer, or start a new AI business, then check out the next AI Engineer cohort in June 2025.
What Is Tool Use?
Tool Use refers to the capability of an LLM to interact with and utilize external systems, data sources, or functions to accomplish tasks that go beyond inherent LLM capabilities.
Tools can fetch data, call APIs, execute code, or interact with users.
For instance:
Access External Information, like RAG: Fetch data from APIs, query databases (SQL or vector DBs), or search the web for current information.
Perform Actions: Execute specific code functions, trigger events in other applications (like sending emails or creating tickets), or interact with files.
Utilize Specialized Capabilities: Interact with hosted services provided by platforms (like OpenAI's built-in search tools).
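Conceptually, tool use is just the model emitting a structured call that your application routes to a real function. Here is a minimal, self-contained sketch of that dispatch loop; the tool names and the hard-coded tool call are made up for illustration (a real agent framework handles this wiring for you):

```python
import json

# Hypothetical tools the application exposes to the model.
def get_weather(city: str) -> str:
    """Pretend weather lookup; a real tool would call a weather API."""
    return f"Sunny, 18°C in {city}"

def search_tickets(query: str) -> str:
    """Pretend ticket search; a real tool would query a ticketing system."""
    return f"3 open tickets matching '{query}'"

TOOLS = {"get_weather": get_weather, "search_tickets": search_tickets}

def dispatch(tool_call_json: str) -> str:
    """Route a model-emitted tool call to the matching Python function."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Simulate the model requesting a tool; the result goes back into its context.
print(dispatch('{"name": "get_weather", "arguments": {"city": "SF"}}'))
```

The key design point: the model never executes anything itself. It only proposes a call; your application validates it, runs it, and feeds the result back.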
Today, you will learn how to build production-grade RAG systems and how to use Tools with the OpenAI Agent SDK.
How Others Teach RAG
What they say:
“Embed your docs”
“Convert them to the embeddings.”
“Dump them into a vector store”
“Query with cosine similarity”
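That whole recipe boils down to a few lines of code. Here is a toy version with made-up 3-dimensional vectors standing in for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings"; real ones come from an embedding model.
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "api reference": [0.1, 0.8, 0.3],
}
query_vec = [0.85, 0.15, 0.05]

# Naive RAG retrieval: pick the nearest document and call it done.
best = max(docs, key=lambda d: cosine(query_vec, docs[d]))
```

If that were all there was to it, RAG would be a weekend project.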
🚫 Why That Approach Fails in Real Life
I hate to tell you this, but nobody on an actual AI team, and especially nobody on the business side, cares that you know how to "embed a PDF with OpenAI."
What they care about is this:
Can you build a reliable, secure, low-latency, production-ready search layer for an LLM to use?
RAG isn’t a toy project. It's how startups and enterprises are rethinking internal knowledge bases, document Q&A, and more.
How We Teach RAG
We treat RAG as a multi-step search problem orchestrated by an application layer.

Example of a RAG pipeline
RAG isn’t just about tossing queries into a vector store; it’s a full search strategy.
📌 For example:
You have a new user query and a large collection of potential information sources, and you need to find the specific pieces relevant to that query.
How you solve this search problem depends heavily on the type of data your RAG system needs to find: structured, unstructured, or both.
See the full-stack example of a RAG pipeline below.
Example with Hai (our coach)
Understand & Refine the Query: Before searching, the system first works to understand the user's true intent. This often involves:
(Optional) Query Rewriting: Using an LLM, the system might rephrase the query for clarity.
(Optional) Filter Extraction: The system might extract specific keywords, categories, or metadata filters from the query.
Targeted Information Retrieval: With a refined understanding of the query, the system retrieves potentially relevant information. The strategy depends on the data source:
Structured Data (e.g., SQL DB): Translate the query into SQL (text-to-SQL) and run it against the database.
Unstructured Data (e.g., PDFs, text, images): Embed the query and run semantic search over a vector store, often combined with keyword search.
Re-rank & Filter for Relevance: The initial retrieval might yield many results, some irrelevant. To improve precision, we can use:
Re-ranking
Filtering
Grounded Generation: Finally, the curated, relevant, and filtered information is packaged with the user's query and clear instructions (like "answer only from these sources") and sent to the LLM.
RAG is a search problem at its core.
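The four steps above can be sketched as a pipeline of stubbed functions. Every stub here is a placeholder for a real component (an LLM for rewriting, a vector DB or SQL engine for retrieval, a cross-encoder for re-ranking); simple keyword matching stands in so the example stays self-contained:

```python
def rewrite_query(query: str) -> str:
    # Stub: in production, an LLM rephrases the query and extracts filters.
    return query.lower().strip()

def retrieve(query: str, corpus: list[str], k: int = 10) -> list[str]:
    # Stub: keyword overlap stands in for vector or SQL retrieval.
    return [doc for doc in corpus
            if any(w in doc.lower() for w in query.split())][:k]

def rerank(query: str, docs: list[str], k: int = 3) -> list[str]:
    # Stub: a cross-encoder would score (query, doc) pairs; here we
    # score by shared-word count to keep the example runnable.
    def score(d: str) -> int:
        return sum(w in d.lower() for w in query.split())
    return sorted(docs, key=score, reverse=True)[:k]

def answer(query: str, corpus: list[str]) -> str:
    q = rewrite_query(query)
    context = rerank(q, retrieve(q, corpus))
    # Grounded generation: the prompt instructs the LLM to answer
    # only from the retrieved context.
    return f"PROMPT: answer only from {context!r}. QUESTION: {q}"

corpus = ["Refunds are processed in 5 days.", "The API rate limit is 60 rpm."]
print(answer("refunds timeline", corpus))
```

The value of this shape is that each stage can be measured and swapped independently, which is exactly how you debug a production search layer.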
Biggest Challenges with RAG:
There are numerous challenges, but today we will cover 3:
Retrieval is the Hardest Part: The core difficulty lies in the "Retrieval" step; getting the right information to the LLM is fundamentally a complex search problem.
Optimizing Recall vs. Precision: There's a trade-off. You might retrieve all potentially correct documents (high recall), but if you retrieve too many irrelevant ones (low precision), it confuses the LLM. Over-fetching can lead to poor generation quality or hallucinations.
Handling Contradictory or Outdated Information: The retrieved sources might contain conflicting details (e.g., documentation from different years). The system needs mechanisms like filtering by date/authority or smart prompting to handle this.
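One simple mechanism for the conflict case: attach date and source metadata to each retrieved chunk and resolve contradictions deterministically before generation. The documents and the authority ranking below are invented for illustration:

```python
from datetime import date

# Toy retrieved chunks with conflicting facts from different years.
retrieved = [
    {"text": "Rate limit is 30 rpm.", "updated": date(2021, 4, 1), "source": "wiki"},
    {"text": "Rate limit is 60 rpm.", "updated": date(2024, 9, 1), "source": "docs"},
]

# Assumed ranking: official docs outrank the internal wiki.
AUTHORITY = {"docs": 2, "wiki": 1}

def resolve(docs: list[dict]) -> dict:
    """Prefer the most authoritative source, then the most recent update."""
    return max(docs, key=lambda d: (AUTHORITY[d["source"]], d["updated"]))

print(resolve(retrieved)["text"])
```

Alternatively, you can pass all versions to the LLM with their dates and instruct it to prefer the newest, but a deterministic filter is easier to test and audit.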
There are dozens more relevant challenges with RAG that we didn’t cover today.
If you want to learn more, join our next cohort at AI Product Engineer Bootcamp that starts in June 2025.
Learn and apply your skills by building production-grade RAG applications.
Built-In Tools from OpenAI Agent SDK
OpenAI's Agent SDK comes with several hosted tools that empower agents to perform complex tasks:
WebSearchTool: Enables agents to fetch real-time information from the internet.
FileSearchTool: Allows agents to retrieve information from OpenAI Vector Stores.
ComputerTool: Grants agents the ability to perform tasks on a computer.
Using Hosted Tools in Your Agent
Here’s a quick example of how to equip your agent with hosted tools using the OpenAI SDK:
```python
import asyncio

from agents import Agent, FileSearchTool, Runner, WebSearchTool

agent = Agent(
    name="Assistant",
    tools=[
        WebSearchTool(),
        FileSearchTool(
            max_num_results=3,
            vector_store_ids=["VECTOR_STORE_ID"],
        ),
    ],
)

async def main():
    result = await Runner.run(
        agent,
        "Which coffee shop should I go to, taking into account "
        "my preferences and the weather today in SF?",
    )
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())
```
This agent is equipped with two powerful tools: WebSearchTool and FileSearchTool. When given a natural language prompt, it can pull live data from the web and context from a vector store to generate a personalized answer.
It’s a prime example of RAG and Tool Use working together in action.
Congratulations on completing Day 3!
You’ve just leveled up your agents with two game-changing capabilities: RAG and Tool Use.
Tomorrow, we’ll explore another critical ingredient: memory.
You’ll learn how to teach your agents to remember past interactions, construct context, and build more reliable agentic systems.
If you want to learn how to implement these workflows with industry-leading standards, join our upcoming AI Product Engineering Bootcamp in June 2025!
If you have any additional comments, suggestions, or feedback, respond to this email directly. We’d love to hear from you!