Multilingual AI Agent-powered Chatbot Supporting Journalist Training


The Vstorm team created a RAG-based agentic AI system that works in both English and Arabic. It supports ARIJ Network in training new investigative journalists with reliable, fact-checked knowledge, is designed to avoid hallucinations, and supports the organization's profitability by creating a new income stream.

The Arab Reporters for Investigative Journalism (ARIJ) Network connects and trains investigative journalists who are native speakers of the Arabic language and work throughout the Middle East and North Africa.

The organization provides support and training for both practicing and aspiring investigative journalists. Founded in Jordan in 2005, it now operates in 16 countries and has since trained thousands of journalists and provided over 650 investigative journalism materials.

Vstorm’s impact, the TL;DR:

  • Before the solution was implemented, only 1% of inquiries received a response
  • Vstorm’s solution consists of an agent that orchestrates three separate tools
  • The chatbot works in the Moodle online learning environment
  • The tool works in both English and Arabic without confusing the languages

The challenge:

As a knowledge provider, ARIJ was looking for a way to streamline the learning process. Initially, the team could process no more than 1% of requests from journalists, and all of that work was done manually, which limited scalability.

Asking questions and getting answers is one of the key elements of the process, and that is where a chatbot could prove to be an optimal solution. The system’s task was to answer using the best knowledge available, drawn exclusively from material gathered by the institution.

Yet building a chatbot for education comes with some challenges:

  • LLMs tend to hallucinate – an organization with a focus on education and knowledge sharing has little to no margin for the hallucinations or fakes often produced by LLMs, as the participants are going to apply the information in their real-life situations and jobs.
  • Knowledge is distributed, sometimes scattered – in the case of ARIJ, the knowledge base to be used by the chatbot consisted of numerous text documents that ranged from manuals for students to transcripts of lectures, making them challenging to scan through and analyze by traditional solutions.
  • The solution needs to find a place in an established process – ARIJ was already using workflows and procedures to work with students. The new element aimed to enrich the existing workflows, not disrupt them or force a redesign. In this particular case it was necessary to let the solution run on the Moodle Learning Management System already used by ARIJ in their day-to-day work.
  • The chatbot has to operate in English and Arabic – the solution is aimed at Arabic native speakers, yet the community itself is international, making two languages a necessity.

Building this chatbot and connecting it with the available knowledge base in a way that keeps the AI agent from getting confused or producing fake information was an ideal task for Vstorm.

Vstorm impact

Delivering this solution required the team to map the challenges up front so they could be mitigated in the design and planning phase.

Building the database

A database is the foundation of AI agent solutions of this kind, and the one built for the ARIJ Network was no different. To extract the required knowledge from the provided files, it was necessary to cut all the various texts into batches.

The batches were later vectorized to make them searchable and processable for the agent. Also, in the vectorization process, the knowledge was clustered so pieces of information sharing the same topic were near each other, reducing the effort the model has to invest in looking for and providing answers.
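The batching and vectorization steps described above can be illustrated with a minimal sketch. It is not Vstorm's implementation: the chunk sizes are hypothetical, and a toy bag-of-words vector stands in for the embedding model a production system would use. The function names (`chunk_text`, `embed`, `search`) are likewise assumptions made for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are hypothetical)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    # Stop once the remaining words are already covered by the previous window.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + max_words]) for i in range(0, last_start, step)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]
```

With real embeddings, chunks about the same topic end up close together in vector space, which is what lets a nearest-neighbour lookup like `search` return related passages for a question.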

Delivering the architecture for RAG

With the database in place, it was possible to deliver an agent that performs the task. The first step was to build a RAG (Retrieval Augmented Generation) solution to harvest and process data from the database.


In this particular case, using the Pydantic framework proved the better approach for ensuring the agent operates seamlessly.

Using the Pydantic framework, it was possible to build the agent around a set of tools: one to connect with the database, one to validate the answers and ensure their language correctness, and the agent itself, delivering the best response possible for the prompt given.
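The division of labour between the database tool, the validation tool, and the agent can be sketched in plain Python. This is a simplified illustration, not the project's code, and it deliberately does not use any framework API: `run_agent`, `AgentReply`, the three callables, and the fallback message are all hypothetical names, and the model call is passed in as an ordinary function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AgentReply:
    text: str
    sources: list[str]
    validated: bool


def run_agent(
    question: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str, list[str]], str],
    validate: Callable[[str, str, list[str]], bool],
    fallback: str = "I could not find this in the course materials.",
) -> AgentReply:
    """Orchestrate retrieval, generation, and validation for one question."""
    # Tool 1 (assumed): fetch passages from the knowledge base.
    passages = retrieve(question)
    if not passages:
        # Nothing to ground the answer in, so refuse rather than guess.
        return AgentReply(fallback, [], False)

    # The agent itself: draft an answer from the retrieved passages only.
    draft = generate(question, passages)

    # Tool 2 (assumed): check the draft against the passages and the language.
    if not validate(question, draft, passages):
        return AgentReply(fallback, passages, False)

    return AgentReply(draft, passages, True)
```

Returning a fixed fallback whenever retrieval or validation fails is one simple way to keep ungrounded text from reaching students. The actual system may handle these cases differently.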

Our team decided to use Gemini 2.5 Pro, as the model is versatile and powerful enough to handle the task of delivering reliable answers for educational purposes. Also, the model was tested and proven to work properly in English and Arabic alike, making it perfect for this particular use case.

Building the PoC and gathering feedback

The first version of the PoC processed more data, such as the page number where the relevant information was found, but in the end, a more minimalist approach was enough to deliver the desired results.

Also, the model has a mechanism to check whether a particular question makes it necessary to connect with the database. If the user is just interacting casually with the model, or says things like “thank you,” then the model simply replies on its own, saving the cost of full query processing.
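The routing idea can be illustrated with a deliberately simple sketch. How the real system makes this decision is not stated in the text; here a small hand-written phrase list stands in for it, and both `SMALL_TALK` and `needs_retrieval` are hypothetical names.

```python
import re

# Hypothetical list of casual phrases that need no database lookup.
SMALL_TALK = {"hi", "hello", "thanks", "thank you", "bye", "شكرا", "مرحبا"}


def needs_retrieval(message: str) -> bool:
    """Return True when the message should trigger a knowledge-base query."""
    # Lowercase and drop punctuation so "Thank you!" matches "thank you".
    normalized = re.sub(r"[^\w\s]", "", message.lower()).strip()
    if not normalized:
        return False
    return normalized not in SMALL_TALK
```

For example, `needs_retrieval("Thank you!")` returns `False`, so the reply can be produced without a retrieval call, while a substantive question such as "How do I verify a leaked document?" returns `True`.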

Moodle implementation

Last but not least, the Agentic AI solution had to work in the Moodle environment. Moodle itself is an open source platform, based mostly on PHP and JavaScript, so implementing the new component required a different approach.

For smooth and reliable operations, the system also required a sign-in mechanism, as well as a payment solution that enabled users to switch from free to paid plans.

Vstorm’s solution

The AI agent delivered by Vstorm is fully operational and provides accurate answers grounded in ARIJ's own knowledge base, minimizing the risk of hallucination. The system operates in the Moodle environment, exactly where students and course participants need it.

With the support of the agent, users can get their answers without the need to search through the knowledge scattered throughout the database. Instead, the AI agent processes the query and provides them with the best answer possible. The tool follows the language of the conversation, responding in English or Arabic depending on the user's language of choice, without confusing or mixing the two.

For ARIJ, the delivered agent not only supports the training process, helping students gather information and solve problems, but has also become a supplementary stream of income: each user receives only a limited number of free answers, with more available for purchase.
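One simple way to follow the user's language is to look at the script of the incoming message. The sketch below is an assumption for illustration only, since the text does not say how the deployed agent decides: it counts characters in the Unicode Arabic block (U+0600 to U+06FF) against ASCII letters. `detect_language` and `reply_instruction` are hypothetical helpers.

```python
LANGUAGE_NAMES = {"ar": "Arabic", "en": "English"}


def detect_language(message: str) -> str:
    """Guess 'ar' or 'en' from the dominant script of the message."""
    arabic = sum(1 for ch in message if "\u0600" <= ch <= "\u06FF")
    latin = sum(1 for ch in message if ch.isascii() and ch.isalpha())
    # Default to English on ties, including messages with no letters at all.
    return "ar" if arabic > latin else "en"


def reply_instruction(message: str) -> str:
    """Build an instruction line that pins the reply language."""
    return f"Answer only in {LANGUAGE_NAMES[detect_language(message)]}."
```

The resulting instruction could be added to the prompt so that a mixed-language knowledge base does not pull the answer into the wrong language.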
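The free-then-paid model can be sketched as a per-user counter. The limit of three free answers, the in-memory storage, and the names `AnswerQuota`, `try_consume`, and `add_credits` are all assumptions for illustration. The actual limits and the payment integration are not described here.

```python
from collections import defaultdict


class AnswerQuota:
    """Track free and purchased answers per user (in memory, for illustration)."""

    def __init__(self, free_limit: int = 3):
        self.free_limit = free_limit          # hypothetical number of free answers
        self.used = defaultdict(int)          # answers consumed per user
        self.purchased = defaultdict(int)     # extra answers bought per user

    def add_credits(self, user_id: str, count: int) -> None:
        """Record a purchase of additional answers."""
        if count <= 0:
            raise ValueError("count must be positive")
        self.purchased[user_id] += count

    def try_consume(self, user_id: str) -> bool:
        """Consume one answer if the user has any allowance left."""
        allowance = self.free_limit + self.purchased[user_id]
        if self.used[user_id] >= allowance:
            return False
        self.used[user_id] += 1
        return True
```

A chatbot endpoint would call `try_consume` before running the agent and prompt the user to purchase more answers when it returns `False`.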
