Unims-RAG: A Unified Multi-Source Retrieval-Augmented Generation for Personalized Dialogue Systems

Wojciech Achtelik
AI Engineer Lead
June 24, 2025

Unims-RAG (A Unified Multi-Source Retrieval-Augmented Generation for Personalized Dialogue Systems) is an architecture that combines several knowledge streams (user profile, conversation history, domain database, and web search) into one coordinated RAG pipeline. A query rewriter tags each turn with persona and situational cues, and a selector routes those cues to specialized retrievers (vector, keyword, graph). Retrieved passages are merged and deduplicated, then ranked by a cross-encoder that scores both topical relevance and personalization fit. The aggregated context is injected into a prompt template so the large language model can craft responses that reflect the user's preferences, past choices, and real-time facts while reducing the risk of hallucination. A reinforcement-learning loop updates retrieval weights based on satisfaction signals such as dwell time or explicit ratings, so the system adapts to individual users over long sessions. Enterprises deploy Unims-RAG to power concierge bots, learning tutors, and healthcare assistants that require rich personalization without exposing private data, relying on built-in PII masking and on-prem vector stores.
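The merge, rank, prompt, and feedback stages described above can be illustrated with a minimal Python sketch. It is not the Unims-RAG implementation: every name here (`Passage`, `merge_and_rank`, `build_prompt`, `update_weights`) is hypothetical, a simple token-overlap score stands in for the cross-encoder, the 50/50 blend of relevance and personalization is an arbitrary assumption, and a plain additive update stands in for the reinforcement-learning loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source: str  # assumed labels: "profile", "history", "domain_db", "web"
    text: str


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens (toy stand-in for a cross-encoder)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def merge_and_rank(query, persona, passages, source_weights, top_k=3):
    """Deduplicate passages from all sources, then rank by a blended score."""
    seen = set()
    scored = []
    for p in passages:
        key = " ".join(p.text.lower().split())  # normalize case and whitespace
        if key in seen:
            continue  # drop near-verbatim duplicates
        seen.add(key)
        relevance = token_overlap(query, p.text)       # topical relevance
        personal_fit = token_overlap(persona, p.text)  # personalization fit
        weight = source_weights.get(p.source, 1.0)
        scored.append((weight * (0.5 * relevance + 0.5 * personal_fit), p))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(query, persona, context):
    """Inject the ranked context into a simple prompt template."""
    lines = [f"[{p.source}] {p.text}" for p in context]
    return (
        f"User persona: {persona}\n"
        "Context:\n" + "\n".join(lines) + "\n"
        f"Question: {query}"
    )


def update_weights(source_weights, used_sources, reward, lr=0.1):
    """Nudge the weights of contributing sources by a reward in [-1, 1]."""
    updated = dict(source_weights)
    for source in used_sources:
        updated[source] = max(0.0, updated.get(source, 1.0) + lr * reward)
    return updated
```

In use, the passages returned by each retriever would be concatenated into one list, passed through `merge_and_rank`, and handed to `build_prompt`. After the response, a satisfaction signal (for example, an explicit rating mapped to the range -1 to 1) could be fed to `update_weights` so that sources that helped are favored in later turns. A production system would replace `token_overlap` with a trained cross-encoder and apply PII masking before any text reaches the prompt.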