Split-and-Merge Retrieval

Wojciech Achtelik
AI Engineer Lead
July 3, 2025
Glossary Category
RAG

Split-and-Merge Retrieval is a strategy for handling oversized documents in Retrieval-Augmented Generation (RAG) and search pipelines. The method splits a long text into overlapping chunks that fit the retriever's embedding window, runs an independent semantic search against each chunk, and then merges the top-ranked hits into a single result list before passing context to the language model. This preserves recall on fine-grained facts buried deep in a PDF while avoiding wasted tokens on irrelevant sections. Key parameters include chunk length, overlap size, maximal marginal relevance (MMR) deduplication, and the merge heuristic: a weighted score, reciprocal rank fusion, or a learned re-ranker. Metrics such as recall@k and token-cost ratio gauge the benefit; typical gains are 10-20% higher answer accuracy on sprawling manuals or chat logs. Split-and-Merge Retrieval pairs well with hierarchical chunking and sliding-window attention to keep latency under 300 ms even at million-token scale.
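The split and merge steps can be illustrated with a minimal Python sketch. It is not a reference implementation: the function names `split_with_overlap` and `reciprocal_rank_fusion`, the pre-tokenized input, and the string hit ids are all assumptions made for illustration. The constant `k=60` is the value conventionally used with reciprocal rank fusion, not something this entry specifies. Embedding, search, and MMR deduplication are left out.

```python
from collections import defaultdict


def split_with_overlap(tokens, chunk_len, overlap):
    """Split a token list into chunks of at most `chunk_len` tokens.

    Consecutive chunks share `overlap` tokens, so a fact that straddles
    a chunk boundary still appears whole in at least one chunk.
    """
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")
    step = chunk_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_len])
        if start + chunk_len >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of hit ids into one list.

    Each hit scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by id for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, hit_id in enumerate(ranking, start=1):
            scores[hit_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda hit_id: (-scores[hit_id], hit_id))


if __name__ == "__main__":
    # Hypothetical data: ten "tokens", then two per-search rankings of hit ids.
    chunks = split_with_overlap(list(range(10)), chunk_len=4, overlap=1)
    print(chunks)
    merged = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "d"]])
    print(merged)
```

In this sketch a hit that appears in several rankings accumulates score, so `"b"` and `"c"` outrank `"a"` even though `"a"` tops one list. A weighted score or a learned re-ranker could replace the fusion function without changing the splitting step.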