RAG (Retrieval-Augmented Generation)

Wojciech Achtelik
AI Engineer Lead
July 2, 2025
Glossary Category: RAG

RAG (Retrieval-Augmented Generation) is a technique that lets a large language model (LLM) reference an authoritative knowledge base outside its training data before it generates a response. Instead of answering a user query from its parameters alone, the model first retrieves relevant passages from a specified set of documents and then incorporates that information into its answer. Architecturally, RAG combines an information retrieval component with a text generator model. The combined system can be fine-tuned, and its knowledge can be modified efficiently by updating the documents it retrieves from, without retraining the entire model. Connecting an LLM to external knowledge bases in this way helps it deliver more relevant, higher-quality responses.
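The retrieve-then-generate flow described above can be illustrated with a minimal Python sketch. It is not a production design: the in-memory `DOCUMENTS` list, the bag-of-words cosine scoring, and the helper names (`retrieve`, `build_prompt`, `generate`, `answer`) are all assumptions made for illustration, and `generate` is a placeholder standing in for a real LLM call.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base; a real system would use a document store.
DOCUMENTS = [
    "The warranty for the X100 router covers hardware defects for two years.",
    "Firmware updates for the X100 router are released every quarter.",
    "Our office is closed on public holidays.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Retrieval component: return up to k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no tokens with the query.
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def generate(prompt):
    """Placeholder generator; a real system would send the prompt to an LLM here."""
    return f"[model output for a prompt of {len(prompt)} characters]"


def answer(query, documents=DOCUMENTS, k=2):
    """Full pipeline: retrieve first, then generate from the augmented prompt."""
    passages = retrieve(query, documents, k)
    return generate(build_prompt(query, passages))
```

Calling `answer("How long is the warranty on the X100 router?")` retrieves the warranty passage first and hands it to the generator as context. In practice, the word-count scoring would typically be replaced by embedding-based vector search and `generate` by a call to an actual model, while the overall two-step structure stays the same. Updating the system's knowledge then means editing `DOCUMENTS`, not retraining anything.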