LangChain reranker
The LangChain reranker is an optional post-processing step that reorders the documents returned by vector search before they are passed to the LLM. After a similarity query retrieves the top-k fragments, a reranker, typically a cross-encoder such as Cohere Rerank or a MiniLM-based cross-encoder, scores each fragment against the user query for semantic relevance. In LangChain, rerankers implement the document-compressor interface rather than a generic transformer: a compressor such as CohereRerank exposes compress_documents(documents, query) and is usually wrapped in a ContextualCompressionRetriever so it runs automatically after retrieval.

Because a cross-encoder evaluates each query-document pair jointly, the reranker filters out false positives from approximate search, improves answer accuracy, and lets you lower k to save tokens. It plugs into any chain or Retrieval-Augmented Generation (RAG) pipeline with a few lines of Python, and LangChain callbacks can log latency and score changes for A/B testing. By combining fast approximate search with precise reranking, a reranker produces higher-quality, citation-ready answers without the need to deploy a larger vector index.
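The reranking step can be sketched without external dependencies. The score function below is a toy stand-in (plain token overlap instead of a neural cross-encoder) chosen only so the control flow runs anywhere; the function name and signature are illustrative assumptions, not LangChain's API:

```python
def rerank(query, documents, top_n=2):
    """Reorder documents by relevance to query and keep the top_n.

    A real reranker (e.g. Cohere Rerank) would score each
    (query, document) pair jointly with a cross-encoder model;
    here Jaccard token overlap stands in for that score.
    """
    query_tokens = set(query.lower().split())

    def score(doc):
        doc_tokens = set(doc.lower().split())
        # Jaccard overlap: shared tokens / all tokens seen.
        return len(query_tokens & doc_tokens) / len(query_tokens | doc_tokens)

    return sorted(documents, key=score, reverse=True)[:top_n]


docs = [
    "The capital of France is Paris.",
    "Bananas are rich in potassium.",
    "Paris is known for the Eiffel Tower.",
]
top = rerank("What is the capital of France?", docs)
```

In an actual LangChain pipeline the same role is played by a document compressor such as CohereRerank wrapped in a ContextualCompressionRetriever, so the reordering happens transparently between retrieval and generation.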