AI - KR3 - Making experimenting feel safe.

Created on 12 November 2024

Problem/Motivation

KR3 - Add an undo feature that lets marketers reverse agent actions: design and evaluate a UX for rollback functionality, and evaluate how much of Drupal's Workspaces functionality we can reuse. Goal: end-users can feel safe experimenting with AI, and we can take a bit more risk with AI agent development.

AI will probably not get every request 100% correct. We have a number of ways we can try to mitigate this:

Proposed resolution

  • Workspaces - Make use of core Drupal Workspaces so that the AI makes its changes in a dedicated workspace, and the user can check everything and publish when they are happy, or roll back to a previous version.
  • Have an "Undo" button next to something the AI has done. (Complicated to show exactly what is being undone when a series of actions has been taken.)
  • Provide logs and details of the specific actions the AI has taken in code, so that Drupal reports exactly what has happened. (Unlikely to make sense to the end-user.)
  • Provide verbose, AI-generated descriptions of what the AI has done, including links so the end-user can check everything. (Possible chance of hallucination.)
  • Create a step where the AI describes what it will do in plain English, and you have to say "Yes" or even click a button before it implements that step. (Again, confusing when multiple steps are required.)
  • Blueprints - Provide a step where the AI shows you the exact JSON/YAML it has generated to tell Drupal what to do, which must be accepted before Drupal implements it. Much safer, but unlikely to make sense to an end-user.
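The blueprint option can be pictured as a two-phase "plan, then apply" flow: the AI emits a declarative plan, the user sees a plain rendering of it, and nothing runs until it is approved. The sketch below is purely illustrative; the blueprint shape and all function names (`blueprint`, `describe`, `apply_if_approved`) are assumptions, not an actual module format.

```python
# Hypothetical blueprint sketch: a declarative plan the user must approve
# before anything is applied. Shapes and names are illustrative only.

blueprint = {
    "actions": [
        {"op": "create", "type": "vocabulary", "label": "Price Ranges"},
        {"op": "create", "type": "taxonomy_term", "label": "Cheap"},
    ]
}

def describe(plan: dict) -> str:
    """Plain rendering shown to the user before anything runs."""
    return "\n".join(
        f"- {a['op']} {a['type']} '{a['label']}'" for a in plan["actions"]
    )

def apply_if_approved(plan: dict, approved: bool) -> list:
    """Apply the plan only after explicit approval; otherwise do nothing."""
    if not approved:
        return []
    # In reality, each action would call the relevant Drupal API here.
    return plan["actions"]

plan_text = describe(blueprint)
```

The key property is that the rendered plan and the applied actions come from the same data structure, so what the user approves is exactly what runs.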

One issue is personas: logs of what the code is doing are likely to appeal to developers, but they may be too confusing for the ambitious site builder or marketer.

πŸ“Œ Task
Status

Active

Component

Track: AI

Created by

πŸ‡¬πŸ‡§United Kingdom yautja_cetanu


Comments & Activities

  • Issue created by @yautja_cetanu
  • πŸ‡¬πŸ‡§United Kingdom yautja_cetanu

    A couple of things from our initial investigations:

    1. Workspaces looks good, but it doesn't yet support field config, which is essential, so it's unlikely to be ready for RC1.

    Below is an AI Agent trying to add something to a vocab.

    It shows up as not yet pushed.

    2. We have verbose responses with links and details.

    Here it is at a medium level of verbosity. It tells you the category, but it doesn't list literally every term it's going to create. It does give you links to places where you can check the work.

    Below is a details dropdown. It shows a log of the actual things Drupal has done. This log is generated by code, not by an LLM, and is therefore completely accurate. However, it's what Drupal needs to know; it doesn't necessarily make sense to an end-user. We can at least improve it by printing labels, but it will still be fairly techy.

  • πŸ‡©πŸ‡ͺGermany marcus_johansson

    First of all, the Workspaces module + WSE Config is really cool. However, I got stuck on most config entities: it currently won't let me save them into a workspace, which is how things are supposed to work as far as I could see. With Views it works.

    I could have the chatbot switch over to its own workspace any time it takes an action, though currently I had to publish the workspace manually. But it's some work to get something useful there, both on our side and, I'm guessing, on WSE's side.

    In the Agent Interface, we have added the ability to fetch all content and config creations and changes as IDs. This was not originally meant for the chatbot UI, but we have added a details element under the chat answer, which can be toggled in the chatbot settings, where you can see the actual changes made by the chatbot's last response.

    What has also been added, so far in the taxonomy and fields agents, is that they return links for every part of the system they change. This means the chatbot can be prompted to return these links, which I have updated it to do in the Drupal CMS MR. So any beginner to Drupal will find links in the response where they can verify and see where changes were made.

    See screenshot below for example of both changes:

    Currently the agents are hardcoded not to do any deletions. The question is whether we should also extend this to edits of config that has a substantial number (10+) of content items connected to it. This would ensure you have a tool that can help you with the initial building, but that can't make any changes that aren't revertible without a database backup or Workspaces.
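    The guard described here could look something like the sketch below. The function name, action vocabulary, and the reading of the 10-item threshold are all assumptions for illustration, not the actual agent code.

```python
# Hypothetical guard: block deletions outright, and block config edits once
# a "substantial" amount of content (10+ items) depends on the config.
# Names and threshold interpretation are illustrative, not the real API.

SUBSTANTIAL_CONTENT_THRESHOLD = 10  # assumed cut-off from the comment above

def is_action_allowed(action: str, connected_content_count: int = 0) -> bool:
    """Return True if the agent may perform this action."""
    if action == "delete":
        return False  # agents are hardcoded never to delete
    if action == "edit_config":
        # Editing config with many content items attached is hard to
        # revert without a backup or Workspaces, so refuse it.
        return connected_content_count < SUBSTANTIAL_CONTENT_THRESHOLD
    return True  # creations can be reversed manually without data loss
```

Under this rule the agent stays useful for initial site building while refusing the operations that would need a backup to recover from.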

  • πŸ‡¬πŸ‡§United Kingdom yautja_cetanu

    - We should have more detail in the details element. If we're changing the name from X to Y, it should say that. If it's creating taxonomy terms, it should show the term itself, not just the ID.

  • πŸ‡©πŸ‡ͺGermany marcus_johansson

    From the fixed tickets, there is now a fixed structure for how to add original vs. changed config and content.

    Diff on config entities:

    Labels on content entities:

    Later I will see if I can enforce this via an interface, so anyone building agents has to follow it. It would also help with implementing a revert of the last agent run(s).

  • πŸ‡¬πŸ‡§United Kingdom yautja_cetanu

    For Drupal CMS v1 we want to avoid anything "destructive", which is defined as "any change that permanently deletes content we can't get back".

    - We don't allow deleting any content, deleting any terms, or deleting a field. The AI can help the end-user do this manually.
    - We have checked removing "Alt Text" from an image field, and this doesn't delete anything: the alt text is still there, and when it's re-enabled the field comes back.
    - We want to test changing multiple-value cardinality: if a field has a cardinality of 3, content uses all 3 values, and the cardinality is then changed to 1, what happens? (For Drupal CMS v1 we might make it so the AI simply can't change cardinality, or only allow it to increase it.)

    In the future I think I'd like to allow destructive changes when there is only a single piece of content, as it's sometimes helpful when building things to have an example piece of content you don't mind getting destroyed.
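    The cardinality rule floated above (only allow the field to get wider, never narrower) might be sketched as follows. This is a hypothetical helper, not real agent code; it only assumes Drupal's convention of -1 meaning unlimited cardinality.

```python
# Illustrative sketch: only allow the agent to raise a field's cardinality,
# never lower it, since lowering it below what existing content uses could
# silently drop values. Names are hypothetical, not the real agent code.

UNLIMITED = -1  # Drupal uses -1 for unlimited cardinality

def cardinality_change_allowed(current: int, proposed: int) -> bool:
    """Allow changes that only ever widen the number of allowed values."""
    if proposed == UNLIMITED:
        return True             # widening to unlimited is always safe
    if current == UNLIMITED:
        return False            # any finite value would narrow "unlimited"
    return proposed >= current  # otherwise only allow increases
```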

  • πŸ‡¬πŸ‡§United Kingdom yautja_cetanu

    Decisions:

    • We are happy with the AI generating the Review Stage before it does something and the Recap Summary after it does something. This means there is a chance of hallucinations, but things will be more readable.
    • We will turn off "Details" for now. The deterministic logs of actions will instead be stored in the logs, not directly available to the chatbot user.
    • We have decided that the "Review Stage" and "Recap" SHOULD use accurate Drupal terminology. However, if the end-user uses less exact terminology (such as "categories"), the AI will explain to the user how and why Drupal uses the specific systems it does.
    • The Recap stage should have links to everything so the end-user can verify everything themselves.
    • We have decided that for Drupal CMS 1.0 we will expect the end-user to perform undos manually, with the AI helping them do it for themselves. If the undo doesn't involve deleting things, the AI can perform a new action that reverses the previous action, but it is not technically an "undo".
    • Attached is an example of a workflow everyone really likes. We need to edit the instructions so Drupal outputs the model workflow below without the user having to type those instructions in to force it.
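    The "new action that reverses the previous action" decision amounts to compensating actions rather than a true undo. A minimal sketch, with entirely hypothetical action shapes and names:

```python
# Sketch of compensating actions: derive a reversing action from a recorded
# one instead of a true undo. Action shapes and names are hypothetical.
from typing import Optional

def compensating_action(action: dict) -> Optional[dict]:
    """Return an action that reverses `action`, or None when reversal
    would require a deletion (which the agents are not allowed to do)."""
    if action["op"] == "rename":
        # Renaming back is safe and loses nothing.
        return {"op": "rename", "type": action["type"],
                "from": action["to"], "to": action["from"]}
    if action["op"] == "create":
        return None  # reversing a create means deleting; leave to the user
    return None
```

Anything that maps to None is exactly the case where, per the decision above, the AI helps the user do the undo manually instead.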

    Post Drupal CMS v1.0

    • We want to work on an approach for Review and Recap that is more accurate. For example, we may create forms for creating hundreds of taxonomy terms that have a single purpose and an undo. We may use Workspaces. We may extend Experience Builder to forms and use its tools. We may also take the YAML file generated by "blueprints" that the AI follows and design a UI around that.
    • Explore approaches to undo, remembering that multiple people can work on the site at one time: if the AI creates 15 taxonomy terms, someone else could delete one before the chatbot user says undo. We need to explore that. (With Experience Builder or Views these issues are sidestepped, as they are handled by the editing framework.)

    Model Conversation:

    - Below is a picture that is formatted nicely, and text that can be copied and pasted but isn't formatted nicely.

    AI: Hello! How can I help you today?

    User:
    I would like to categorize the events into terms of price ranges.

    Can you do that for me?
    Before you do anything, please explain exactly what you will be doing. If you use Drupal terms, can you explain what those terms mean in terms of the language I have chosen to use?

    AI:
    Certainly! Here's what I plan to do to categorize the events into terms of price ranges:
    1. Create a Vocabulary: In Drupal, a "taxonomy" is used to categorize content. I'll create a new vocabulary named "Price Ranges". A vocabulary in Drupal is similar to a category type, where you can add terms/options (like Cheap, Moderate, Expensive) that can be selected to categorize content.
    2. Add Terms to the Vocabulary: I'll populate the "Price Ranges" vocabulary with terms that represent different price levels (e.g., Cheap, Moderate, Expensive).
    3. Add a Field to Events: Once the vocabulary is created, I'll attach it to the "Events" content type. This means I'll add a field to the Event content type that allows the person creating or editing an event to select one or more of the price range terms.
    Would you like me to go ahead and set this up for you?

    User:
    Yes please and when you do it please give me a review step where you explain exactly what you've done and give me links to all the places where you've created things so I can check it myself.

    AI:
    I have set up the categorization for the events based on your specified price ranges. Here's what I did:
    1. Created a Vocabulary:
    β—‹ A new vocabulary called "Price Ranges" was created to categorize events based on price.
    β—‹ You can review this vocabulary here: Price Ranges Vocabulary.
    2. Added Terms to the Vocabulary:
    β—‹ I added three terms to the "Price Ranges" vocabulary: Cheap, Moderate, and Expensive.
    β—‹ You can view these terms here: Price Range Terms.
    3. Added a Field to Events:
    β—‹ I attached the "Price Ranges" vocabulary to the Events content type by creating a new field.
    β—‹ This field allows event creators to select one of the price range terms.
    β—‹ You can review the field settings here: Event Fields and check the specific configuration here: Field Configuration.
    Feel free to explore these links and let me know if you want any changes or need further assistance!

  • πŸ‡¬πŸ‡§United Kingdom catch

    We are happy with AI generating the Review Stage before it does something and the Recap Summary after it does something. This means there is a chance of hallucinations but it means things will be more readable.

    Testing on the original AI recipe issue showed that this is an inherently broken approach. For example, the AI told someone it had created a vocabulary with 20 terms when it had instead created a vocabulary with a single term labelled '20 terms'. This was found by a single person doing testing, so it seems highly likely it will be a common outcome.

    It is not hard to produce a human-readable summary of changes made to a Drupal site; it just needs a list of 'created x type with y label' type messages. It might not be as fluid, but it won't misrepresent anything.
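    A deterministic summary of that kind can be as simple as the following sketch, built straight from the recorded actions with no LLM in the loop. The action-record shape here is assumed, not the actual module's data structure.

```python
# Minimal deterministic summary: render each recorded action as a plain,
# accurate "created x type with y label" sentence. The record shape is
# an assumption for illustration.

def summarize_actions(actions: list) -> list:
    """Render each recorded action as a plain, accurate sentence."""
    verbs = {"create": "Created", "update": "Updated"}
    lines = []
    for a in actions:
        verb = verbs.get(a["op"], a["op"].capitalize())
        lines.append(f"{verb} {a['type']} with label '{a['label']}'.")
    return lines

summary = summarize_actions([
    {"op": "create", "type": "vocabulary", "label": "Price Ranges"},
    {"op": "create", "type": "taxonomy term", "label": "Cheap"},
])
```

    Because the sentences are generated from the same records that drove the changes, the summary cannot describe anything other than what actually happened.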

    It's bad enough when the changes themselves are bad, but the AI 'hallucinating' about what it's done is even worse.
