Improve handling of RAG/tool output to properly pass as tool messages for better formatting and reasoning

Created on 16 May 2025

Problem/Motivation

Currently, when the RAG (Retrieval-Augmented Generation) search results are returned from the vector database, the raw tool output is passed directly as assistant messages in plaintext. This causes the AI assistant to treat these results as already “answered” content, preventing it from reformatting, interpreting, or following prompt instructions for rendering recipes, code blocks, images, etc.

The current implementation echoes RAG results directly as assistant message content (plaintext), e.g.:

echo $message->getText();
$full_response .= $message->getText();

As a result, the assistant treats the echoed text as its own prior response and performs no further processing or formatting on the results.

Proposed resolution

Instead of passing RAG/tool results as plaintext assistant messages, they should be passed as tool role messages (with an appropriate tool name like search_rag) inside the conversation context. This way, the assistant recognises these as external tool outputs it should reason over, interpret, and incorporate into a well-formatted response.
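As a concrete illustration, the conversation context would carry the retrieved chunks under the tool role rather than as assistant text. The array shape below follows the common chat-message convention and is illustrative, not the module's exact data structure:

```php
<?php

// Illustrative message context: the RAG result travels as a 'tool'
// message (named 'search_rag'), so the model treats it as external
// data to reason over rather than as its own prior answer.
$context = [
  ['role' => 'user', 'content' => 'Give me a pancake recipe.'],
  [
    'role' => 'tool',
    'name' => 'search_rag',
    'content' => 'Raw chunks returned from the vector database.',
  ],
];
```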

For example, within the streaming response callback:

  • Detect messages with role tool and add them to the message context with their tool role and name.
  • Do not echo these tool outputs directly.
  • After collecting the tool output, call the assistant again with the full conversation context including the tool messages.
  • Echo the assistant’s final response, which can reformat and clean up the output properly.

This change aligns with best practices in prompt engineering for RAG pipelines, ensuring the assistant distinguishes between raw tool output and its own final response, improving output quality and user experience.

if ($message->getRole() === 'tool') {
  // Collect the raw tool output into the conversation context
  // instead of echoing it to the client.
  $this->aiAssistantRunner->addToolMessage('search_rag', $message->getText());
}
else {
  // Stream normal assistant chunks to the client as before.
  echo $message->getText();
  $full_response .= $message->getText();
  ob_flush();
  flush();
}

Then request the assistant’s final formatted response and echo it.
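A runnable sketch of that follow-up step, with a fake runner standing in for the real assistant runner (the class and method names here are illustrative, not the module's actual API):

```php
<?php

// Hypothetical sketch: FakeAssistantRunner stands in for the real
// runner so the flow can be shown in isolation.
final class FakeAssistantRunner {
  private array $toolMessages = [];

  // Store a tool result in the conversation context under the tool role.
  public function addToolMessage(string $name, string $text): void {
    $this->toolMessages[] = [
      'role' => 'tool',
      'name' => $name,
      'content' => $text,
    ];
  }

  // Stands in for a second call to the model with the full context,
  // which returns the assistant's final, formatted answer.
  public function finalResponse(): string {
    return 'Formatted answer based on ' . count($this->toolMessages) . ' tool result(s).';
  }
}

$runner = new FakeAssistantRunner();
$runner->addToolMessage('search_rag', 'raw vector DB chunks');
$final = $runner->finalResponse();
// Only this final, reformatted response is echoed to the user.
echo $final;
```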

📌 Task
Status

Active

Version

1.0

Component

AI Search

Created by

🇪🇸Spain gxleano Cáceres


Merge Requests

Comments & Activities

  • Issue created by @gxleano
  • Pipeline finished with Failed
    19 days ago
    Total: 322s
    #498886
  • Pipeline finished with Canceled
    19 days ago
    Total: 187s
    #498890
  • Pipeline finished with Canceled
    19 days ago
    Total: 213s
    #498889
  • Pipeline finished with Failed
    19 days ago
    Total: 1015s
    #498891
  • 🇩🇪Germany marcus_johansson

    I don't think this is going to work for 1.0.x.

    We have added normalization of tools and tool results in 1.1.x, and it's a little bit more complex - the big issue is that different providers express tools and tool history in different ways. Some require the tool definition itself, not just the tool result, to be added into the payload (Mistral, for instance). Some do not set the tool message using the tool role, but use the user or assistant role with extra parameters (Bedrock, for instance). Many require a unique tool_id to be set on both the assistant message that picks the tool and the tool role message, etc. All of those would break with this change.

    It would also change the behavior for anyone who already has this working, via a patch, meaning breaking changes even if it works better in general. So if we can somehow make it backwards compatible in 1.0.x, it should be an option that is disabled on older assistants.

    We have RAG search working the way stated in this issue in 1.1.x, and that release is hopefully just some weeks away from a production release, so I would argue for waiting and using that.

  • 🇪🇸Spain gxleano Cáceres

    I have been testing 1.1.x, and this issue will be handled there with the tooling approach, so I will close it.

    Thanks Marcus!
