Improvements to the AI Reports

Issue created by @yautja_cetanu
Comment 8 months ago →
🇬🇧United Kingdom yautja_cetanu

Evaluations need to store, in detail, the Agents working behind the scenes of each assistant message, with the full history, prompts and back-and-forth that exists there.

For each "thing" in the evaluation (such as a message), I should be able to click a button and see the actual POST message sent to the LLM (maybe in the logs).

We also want to store the "Drupal bit": the logs of the things Drupal has done between the agents' work, so we can trace it.
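
As a rough illustration of what that could mean for storage, here is a minimal sketch of the record an evaluation might keep per assistant message. It is written in Python only to show the shape of the data; every class and field name here (`LlmCall`, `AgentRun`, `EvaluatedMessage`, `drupal_log`, and so on) is hypothetical and not taken from the module.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class LlmCall:
    """One request/response pair exchanged with the LLM provider (hypothetical)."""
    request_payload: dict[str, Any]  # the raw POST body, stored verbatim
    response_text: str


@dataclass
class AgentRun:
    """Everything one agent did while answering a single assistant message."""
    agent_id: str
    system_prompt: str
    input_prompt: str
    llm_calls: list[LlmCall] = field(default_factory=list)
    # Log lines describing what Drupal did between the agent's LLM calls.
    drupal_log: list[str] = field(default_factory=list)


@dataclass
class EvaluatedMessage:
    """The assistant message under evaluation plus the agent runs behind it."""
    message_id: str
    text: str
    agent_runs: list[AgentRun] = field(default_factory=list)

    def raw_requests(self) -> list[dict[str, Any]]:
        """Return every POST body sent on behalf of this message, in order."""
        return [
            call.request_payload
            for run in self.agent_runs
            for call in run.llm_calls
        ]
```

With something like this, the "see the actual POST" button would only need to render `raw_requests()` for the message that was clicked.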
Comment 8 months ago →
System Message

justanothermark → committed 3bcd6488 on 1.0.x
Issue #3487485: Add AiEvaluation entity view route and update listing to...
Comment 8 months ago →
🇬🇧United Kingdom justanothermark
yautja_cetanu → credited justanothermark → .
Comment 8 months ago →
🇬🇧United Kingdom yautja_cetanu
There are things I'm finding difficult to understand with the reports:

Mostly UX changes

I think there should be a clear place for the "System" prompt vs the actual prompt that is being sent.

This is especially true for Agents: the specific question and the system prompt need to be separate, with the system prompt hidden in a collapsed section. I really need to see the prompt given to the Agent and its response.

Can we do anything about the formatting of the responses? It's hard to read what is going on.

Need to figure out how we can highlight the specific thing being evaluated, the specific prompt to solve. I'm getting a lot of the history but I want the specific thing to be evaluated. (Maybe highlight the specific message where it was ticked yes or no, and have that open by default?)

It seems the thing that says "Prompt" is really the "System Prompt" and the thing that is called "Question" is "Prompt" - It's kind of a User prompt (even if the user is an Agent). Maybe Input Prompt is good?

I think we need to be able to see the most important things first with everything else in a drop-down. It's the User prompts + Response that matters the most.

Similarly, configuration such as the model should also be in a drop-down.

In the screenshot without agents, it doesn't show the Drupal_agent_assistant. Why does it sometimes show up in the message history and other times not?

I think it might be better to show a UI for the specific user message we're dealing with and its history of agents, separately from the general message history. We can't see any of the agents within historical user messages anyway, so having a drop-down for each one doesn't make much sense.

May require a refactor of Agents

I think "Comments" should be "Message history", and there should be a consistent method of showing message history for the Assistant vs the Agent. (This might need a little agent refactor itself.) We probably don't need "Task Name" and "Task Description", unless we want to keep using those features so it works better with MiniKanBan, in which case we should make the Assistant come up with a Task Name and Description.

I think we should always ask the Agent to respond with something and then also offer an explanation for its response. We should have a consistent format for the "Response Message" vs the "Explanation".

I think it makes sense that we can't see the agent history for each previous user message. However, I think we should at least leave a record of some of the agents called, so that we could query the agents called by that user message.
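
To make the last two points concrete, here is a small sketch of one possible shape, again in Python purely for illustration. It assumes the Agent is asked to reply with JSON containing `response` and `explanation` keys, and that a lightweight index records which agents each user message triggered. The key names, `AgentReply`, `parse_agent_reply` and `AgentCallIndex` are all made up for this example and are not the module's API.

```python
import json
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentReply:
    response: str     # the "Response Message" shown in the report
    explanation: str  # the agent's reasoning, kept in a separate field


def parse_agent_reply(raw: str) -> AgentReply:
    """Parse a reply the agent was asked to return as a two-key JSON object."""
    data = json.loads(raw)
    return AgentReply(
        response=data["response"],
        explanation=data.get("explanation", ""),  # tolerate a missing explanation
    )


class AgentCallIndex:
    """Records which agents were called for each user message, without history."""

    def __init__(self) -> None:
        self._calls: dict[str, list[str]] = defaultdict(list)

    def record(self, user_message_id: str, agent_id: str) -> None:
        self._calls[user_message_id].append(agent_id)

    def agents_for(self, user_message_id: str) -> list[str]:
        # .get() avoids creating an empty entry for unknown messages.
        return list(self._calls.get(user_message_id, []))
```

The index deliberately stores only agent IDs per user message, so older messages stay cheap to keep while still being queryable.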

Screenshots

Evaluations with no Agents called

Evaluations with sub-agents called
Merge request !9: Issue #3487485 by justanothermark: Added Notes & Tags fields, updated reports and reorganised path structure. → (Open) created by justanothermark
