Improvements to the AI Reports

Created on 14 November 2024, about 1 month ago

Problem/Motivation

It's quite difficult to figure out what is going on with the reports.

Proposed resolution

I think for each row we should have:

  • User ID
  • First User Message
  • Last User Message
  • Last Assistant Message
  • Thread ID
  • Date/Time
  • Tags
  • Notes
  • Details

If I click Thumbs up on multiple messages, I think we need some kind of thread ID so we know they are part of the same conversation.
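
To make the proposal concrete, here is a rough sketch of what one report row could look like and how a thread ID would let several rated messages be grouped together. It is written in Python purely for illustration and is not the module's actual code; every class, field, and function name below is an assumption derived from the list above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ReportRow:
    """One row of the report. Field names are hypothetical and mirror the proposed columns."""
    user_id: int
    first_user_message: str
    last_user_message: str
    last_assistant_message: str
    thread_id: str
    created: datetime
    tags: list[str] = field(default_factory=list)
    notes: str = ""
    details: dict = field(default_factory=dict)


def group_by_thread(rows: list[ReportRow]) -> dict[str, list[ReportRow]]:
    """Group rows by thread ID so ratings from one conversation appear together."""
    grouped: dict[str, list[ReportRow]] = defaultdict(list)
    for row in rows:
        grouped[row.thread_id].append(row)
    return dict(grouped)
```

With something like this, two thumbs-up clicks in the same conversation would produce two rows that share one `thread_id`, and the report could show them as a single group.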

In Details we should have the full chat history and all the configuration options, such as:

  • Agents Used
  • Model Config
  • Pre-Prompt Instructions
  • Pre-Action Prompt
  • Assistant Message
  • Can we also store the flow of messages for the agents, not just assistants?
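
As a sketch only, the Details record could keep the agent message flow next to the normal chat history. This is Python used for illustration, not the module's real data model; the names `Message`, `ReportDetails`, `agent_messages`, and `messages_from` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str    # e.g. "user", "assistant" or "agent" (labels are assumptions)
    sender: str  # hypothetical: name of the assistant or agent involved
    text: str


@dataclass
class ReportDetails:
    """Hypothetical container for everything shown under Details."""
    agents_used: list[str]
    model_config: dict
    pre_prompt_instructions: str
    pre_action_prompt: str
    assistant_message: str
    chat_history: list[Message] = field(default_factory=list)
    # Messages exchanged with agents, stored separately from the
    # user/assistant chat history so both flows can be displayed.
    agent_messages: list[Message] = field(default_factory=list)


def messages_from(details: ReportDetails, sender: str) -> list[Message]:
    """Return the stored agent messages that involve one named agent."""
    return [m for m in details.agent_messages if m.sender == sender]
```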

We need to remove System Role from the module itself, so also from the report.
In the message history, can we make the "Role: user Message:" labels bold?

Remaining tasks

User interface changes

API changes

Data model changes

📌 Task
Status: Active
Version: 1.0
Component: Code
Created by: 🇬🇧 United Kingdom yautja_cetanu

Comments & Activities

  • Issue created by @yautja_cetanu
  • 🇬🇧United Kingdom yautja_cetanu
    • Evaluations need to store, in Details, the Agents working behind the scenes of an assistant message, with all the history, prompts, and back-and-forth that exists there.
    • For each "thing" in the evaluation (such as a message), I should be able to click a button and see the actual POST message sent to the LLM (maybe in the logs).
    • We also want to store the "Drupal bit": the logs of the things Drupal has done between the agents working, so we can trace it.
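
A rough illustration of the kind of trace this implies: one ordered list of steps per evaluated message, holding both the request bodies sent to the LLM and the Drupal-side log entries. It is Python for illustration only; `TraceStep`, `EvaluationTrace`, the `kind` labels, and the payload keys are assumptions, not anything the module provides today.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TraceStep:
    kind: str      # hypothetical labels: "llm_request", "llm_response", "drupal_log"
    actor: str     # agent or assistant name, or "drupal"
    payload: dict  # for "llm_request": the body that was sent to the LLM


@dataclass
class EvaluationTrace:
    """Ordered record of everything that happened for one evaluated message."""
    message_id: str
    steps: list[TraceStep] = field(default_factory=list)

    def record(self, kind: str, actor: str, payload: dict) -> None:
        self.steps.append(TraceStep(kind, actor, payload))

    def raw_requests(self) -> list[str]:
        """Return every stored LLM request body as pretty-printed JSON."""
        return [
            json.dumps(step.payload, indent=2)
            for step in self.steps
            if step.kind == "llm_request"
        ]
```

A "show request" button on a message could then display the matching entry from `raw_requests()`, while the `drupal_log` steps in between show what Drupal did from one agent call to the next.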
  • 🇬🇧United Kingdom yautja_cetanu

    There are things I'm finding difficult to understand with the reports:

    Mostly UX changes

    • I think there should be a clear place for the "System" prompt vs the actual prompt that is happening.
    • This is especially true for Agents: the specific question and the system prompt need to be separate, with the system prompt hidden in a collapsed section. I really need to see the prompt given to the Agent and its response.
    • Can we do anything about the formatting of the responses? Hard to read what is going on.
    • Need to figure out how we can highlight the specific thing being evaluated: the specific prompt to solve. I'm getting a lot of the history, but I want to see the specific thing to be evaluated. Maybe highlight the specific message where it was ticked yes or no, and have that open by default?
    • It seems the thing that says "Prompt" is really the "System Prompt" and the thing that is called "Question" is really the "Prompt". It's kind of a User prompt (even if the user is an Agent). Maybe "Input Prompt" is a good name?
    • I think we need to be able to see the most important things first with everything else in a drop-down. It's the User prompts + Response that matters the most.
    • Similarly configuration such as model should again be in a dropdown.
    • In the screenshot without agents, it doesn't show the Drupal_agent_assistant. Why does it sometimes appear in the message history and other times not?
    • I think it might be better to show a UI for the specific user message we're dealing with and its history of agents separately from the general message history. We can't see any of the agents within historical user messages anyway, so having a drop-down for each one doesn't make much sense.

    May require a refactor of Agents

    • I think "Comments" should be Message history and there should be a consistent method of showing message history for the assistant vs Agent. (This might need a little agent refactor itself). We probably don't need "Task Name" and "Task Description"? Unless we want to keep using those features so it works better with MiniKanBan, in which case we should make the Assistant come up with a Task Name and Description.
    • I think we should always ask the Agent to respond with something and then also offer an explanation for their response. We should have a consistent format for the "Response Message" vs "Explanation".
    • I think it makes sense that we can't see the agent history for each previous user message. However, I think we should at least keep a record of the agents called so that we could query the agents called by that user message.
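
One possible shape for the last two points, sketched in Python for illustration only. It assumes agents are asked to reply with JSON containing a `response` key and an optional `explanation` key, and that each agent call is logged as a `(user_message_id, agent_name)` pair. Both conventions, and all names below, are assumptions rather than existing behaviour.

```python
import json
from dataclasses import dataclass


@dataclass
class AgentReply:
    response: str
    explanation: str


def parse_agent_reply(raw: str) -> AgentReply:
    """Parse a reply assumed to be JSON with "response" and "explanation" keys."""
    data = json.loads(raw)
    return AgentReply(
        response=str(data["response"]),
        explanation=str(data.get("explanation", "")),
    )


def agents_called(calls: list[tuple[str, str]], user_message_id: str) -> list[str]:
    """List the distinct agents called for one user message, in first-call order."""
    seen: list[str] = []
    for message_id, agent in calls:
        if message_id == user_message_id and agent not in seen:
            seen.append(agent)
    return seen
```

Keeping only the pairs is much lighter than storing the full agent history for every earlier user message, yet it still lets the report answer "which agents did this message trigger?".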

    Screenshots

    Evaluations with no Agents called

    Evaluations with sub-agents called
