Split-and-Merge Retrieval

Antoni Kozelski
CEO & Co-founder
Published: July 3, 2025
Glossary Category
RAG

Split-and-Merge Retrieval is a strategy for handling oversized documents in Retrieval-Augmented Generation (RAG) and search pipelines. The method splits a long text into overlapping chunks that fit the retriever’s embedding window, runs an independent semantic search for each chunk, and then merges the top-ranked hits into a single result list before passing context to the language model. This preserves recall on fine-grained facts buried deep in a long PDF while avoiding wasted tokens on irrelevant sections. Key parameters include chunk length, overlap size, maximal marginal relevance (MMR) deduplication, and the merge heuristic: weighted score, reciprocal rank fusion, or a learned re-ranker. Metrics such as recall@k and token-cost ratio gauge the benefit; typical wins are 10-20% higher answer accuracy on sprawling manuals or chat logs. Split-and-Merge Retrieval pairs well with hierarchical chunking and sliding-window attention to keep latency under 300 ms even at million-token scale.
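
To make the split and merge steps concrete, here is a minimal Python sketch, not a reference implementation. It assumes the text is already tokenized into a list, and that each per-chunk search has returned a ranked list of hit ids. The function names `split_with_overlap` and `reciprocal_rank_fusion`, the hit ids, and the constant `k=60` (a commonly used default for reciprocal rank fusion) are illustrative choices, not part of any particular library. The search step itself and MMR deduplication are left out.

```python
from collections import defaultdict


def split_with_overlap(tokens, chunk_len, overlap):
    """Split a token list into chunks of at most `chunk_len` tokens.

    Consecutive chunks share `overlap` tokens, so a fact that straddles
    a chunk boundary still appears whole in at least one chunk.
    """
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")
    step = chunk_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_len])
        if start + chunk_len >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


def reciprocal_rank_fusion(rankings, k=60, top_n=None):
    """Merge several ranked lists of hit ids into one list.

    Each hit scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by id for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, hit in enumerate(ranking, start=1):
            scores[hit] += 1.0 / (k + rank)
    merged = sorted(scores, key=lambda hit: (-scores[hit], str(hit)))
    return merged if top_n is None else merged[:top_n]


# Split step: 11 tokens, chunks of 5 with 2 tokens of overlap.
tokens = "the pump manual says reset the valve before restarting the pump".split()
chunks = split_with_overlap(tokens, chunk_len=5, overlap=2)

# Merge step: hypothetical ranked hit ids from two independent searches.
per_chunk_hits = [["h1", "h2", "h3"], ["h2", "h3", "h1"]]
print(reciprocal_rank_fusion(per_chunk_hits))  # ['h2', 'h1', 'h3']
```

Reciprocal rank fusion uses only rank positions, so it can combine result lists whose raw similarity scores are not directly comparable. A weighted-score merge or a learned re-ranker would take the place of `reciprocal_rank_fusion` in the same position of the pipeline.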

Last updated: August 4, 2025