Retriever (RAG)

Bartosz Roguski
Machine Learning Engineer
Published: July 3, 2025
Glossary Category: RAG

Retriever (RAG) is the component in a Retrieval-Augmented Generation pipeline that locates the most relevant documents for a user query before a large language model (LLM) generates the final answer. In the common dense setup, it converts the query into a vector embedding, searches a vector database using a similarity metric such as cosine similarity or dot product, and returns a top-k list of passages. Popular retriever types include dense semantic search (Sentence-BERT, OpenAI embeddings) and hybrid search that combines BM25 keyword scores with vector scores; either can be followed by a cross-encoder re-ranker for higher precision. A well-tuned retriever improves factual accuracy, reduces hallucinations, and trims token costs by feeding only high-value context into the LLM’s context window. Key settings, namely the embedding model, the k value, maximal marginal relevance (MMR), and metadata filters, balance recall against latency. Monitoring recall@k and hit rate guards against quality drift as the corpus grows. In essence, the Retriever is the “memory lookup” engine that grounds generative AI in trustworthy knowledge.
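
To make the moving parts concrete, here is a minimal sketch of dense top-k retrieval, MMR selection, and recall@k, written in plain Python with no vector database. It is an illustration under stated assumptions, not a production retriever: the function names (cosine, top_k, mmr, recall_at_k), the lambda_mult parameter, and the toy 2-D vectors are all hypothetical stand-ins for real embeddings and a real index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k):
    """Indices of the k documents most similar to the query, best first."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    ranked = sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def mmr(query_vec, doc_vecs, k, lambda_mult=0.5):
    """Maximal marginal relevance: greedily pick documents that are
    relevant to the query but not redundant with those already picked.
    lambda_mult=1.0 is pure relevance; lower values favor diversity."""
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy = highest similarity to any already-selected doc.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids found in the first k retrieved ids."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


# Toy 2-D "embeddings" (assumed data); documents 0 and 1 are near-duplicates.
docs = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8], [0.0, 1.0]]
query = [1.0, 0.2]

plain = top_k(query, docs, k=2)    # ranks purely by similarity to the query
diverse = mmr(query, docs, k=2)    # penalizes overlap with earlier picks
print(plain, diverse, recall_at_k(plain, relevant=[1, 2], k=2))
```

On this toy data, plain top-k returns both near-duplicates, while MMR keeps the best match and swaps the duplicate for a less similar passage. That is the recall-versus-redundancy trade-off the k and MMR settings control. A real system would replace the linear scan with an approximate nearest-neighbor index.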


Last updated: August 1, 2025