- Issue created by @marcus_johansson
- 🇺🇸United States Kristen Pol Santa Cruz, CA, USA
Switching to the correct tag
- 🇨🇦Canada b_sharpe
I can likely work on this but a few things:
- Artifact seems like the wrong term. "Tool Result", "Tool Context", "Function Call Result", etc. might be more appropriate. (I'm going to use "artifact" for the sake of clarity in the next points.)
- How does a tool decide to use an existing artifact or create a new one? Using the example above, let's say you want to take a new screenshot: how do we tell the tool to run again instead of re-using the stored data?
- Can there be multiple artifacts per tool per chat? Are artifacts tied to ThreadID?
- 🇩🇪Germany marcus_johansson
Awesome @b_sharpe!
Artifact is a weird word, but looking into it, it seems to be the definition actually in use for this. It came from Anthropic originally, and now that I think about it, it was used there for much the same thing, but now it's used elsewhere too; see for instance: https://python.langchain.com/docs/how_to/tool_artifacts/ and https://google.github.io/adk-docs/artifacts/.
CrewAI uses "Artefact", but I guess that is just American vs. British English. Since we use cspell with American English, we should keep the American spelling.
Since we will be exposing tools and results via MCP and A2A to the outside world, we should use the same vocabulary as the rest of the ecosystem, at least for the data name; the visual name within Drupal we can make up ourselves.
Regarding the decision - I think the best course for now is that this always happens, because the human that sets up the agent knows whether it's needed or not. The decision basically comes down to four considerations as far as I can see:
1. Does the output of this tool only matter for other tools and not for the agent itself? If yes, it can be an artifact.
2. Will the context from the tool be so large that it affects the context length and starts causing hallucinations, e.g. scraped websites? If yes, it should be an artifact.
3. Does the context itself include instructions - like a food recipe, for instance - causing the agent to have instruction fatigue or, even worse, change its course of action? If yes, it should be an artifact.
4. Is the output in such a strange format that the agent has a hard time understanding how to pass it along, even with a good system prompt - a binary, for instance? If yes, it should be an artifact.
You could in theory have subagents make these decisions, but given how unreliable some agents are, I think it makes sense to make this a human decision.
This is why the issue is in AI Agents rather than the AI module. This should be another option you can set on the tool when you set it up: you check a checkbox that says "Artifact Output", with a description of when you should enable it.
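For illustration, a minimal sketch of what that checkbox could look like on the tool settings form, using Drupal's Form API; the machine name artifact_output and the surrounding config array are assumptions, not the final implementation:

```php
// Hypothetical addition to the tool settings form; the machine name
// 'artifact_output' is assumed for illustration.
$form['artifact_output'] = [
  '#type' => 'checkbox',
  '#title' => $this->t('Artifact Output'),
  '#description' => $this->t("Store this tool's result as an artifact instead of feeding it back into the agent's context. Enable this when the output is only consumed by other tools, is very large, contains instructions, or is in a format the agent cannot pass along."),
  '#default_value' => $config['artifact_output'] ?? FALSE,
];
```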
We have something called AiDataTypeConverter plugins in the AI module: tools that hook in while the tool parameters are being filled in, doing something similar to route argument upcasting, for anyone familiar with routing. This makes it possible, for instance, to set a ContextDefinition type to entity and have the agent answer "node:1", while in the tool you get back the actual object.
We should use this to create a data type converter for artifacts that looks for a magic prefix, something like "!artifact:", and then upcasts it. Since we use "{word}:" for entity upcasting, I think the exclamation mark, or something like it, makes sense in case someone for some reason creates an entity type called artifact.
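To make the upcasting idea concrete, here is a rough sketch of the conversion step; the class shape and the artifact storage are assumptions (the real AiDataTypeConverter plugin interface may look different), and only the "!artifact:" prefix convention comes from the comment above:

```php
// Hypothetical converter: detects the magic prefix in a tool parameter and
// swaps in the stored artifact value before the tool runs.
class ArtifactConverter {

  public function __construct(
    // Assumed storage of artifacts keyed by token, e.g. per thread.
    protected array $artifacts,
  ) {}

  public function convert(mixed $value): mixed {
    // Only strings like "!artifact:scrape_website:1" are upcast.
    if (is_string($value) && str_starts_with($value, '!artifact:')) {
      $token = substr($value, strlen('!artifact:'));
      if (isset($this->artifacts[$token])) {
        return $this->artifacts[$token];
      }
    }
    // Anything else passes through untouched.
    return $value;
  }

}
```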
I will add this information to the issue, since this functionality was added after we wrote it.
A tool can be run multiple times, and you can have artifact output on multiple tools, so it should be possible. I think for contextual understanding, just naming them "!artifact:{function_name}:{incremental_number}" makes sense. They will be given in the tool result as well, so the agent should have no problem gluing it together, especially with a good system prompt.
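Generating those names could be as simple as the sketch below; the per-function counter bookkeeping is an assumption:

```php
// Hypothetical token generator: one incrementing counter per function name.
function artifact_token(string $function_name, array &$counters): string {
  $counters[$function_name] = ($counters[$function_name] ?? 0) + 1;
  return sprintf('!artifact:%s:%d', $function_name, $counters[$function_name]);
}

// Example: the second scrape in a thread gets its own token.
$counters = [];
artifact_token('scrape_website', $counters); // "!artifact:scrape_website:1"
artifact_token('scrape_website', $counters); // "!artifact:scrape_website:2"
```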
- 🇩🇪Germany marcus_johansson
Made some changes to the main issue as well - check if it's understandable now; otherwise I'll try to scope it out more.
- 🇨🇦Canada b_sharpe
Ok, I think that all makes sense, I'll take a stab at it and report back
- 🇨🇦Canada b_sharpe
@marcus_johansson: I just noticed the option for tools:
Return directly
Check this box if you want to return the result directly, without the LLM trying to rewrite it or use another tool. This is usually used for tools that are not used in a conversation, or when it's being used in an API where the tool output is the structured result.
Do we see this as separate? Or should we be refactoring this to instead be the use-case for artifacts?
- Assigned to b_sharpe
- 🇨🇦Canada b_sharpe
Also, so far in my tests, it appears the exclamation mark is not a good placeholder, as the AI providers often discard it, thinking it's a typo, so I've switched to:
{{artifact:$this->toolId:$this->index}}
The problem now, however, is the AI provider doesn't know how to use this, so sometimes it won't select tool 2, and other times it will just pass random data to it and fail. I'm not sure how the AI provider is going to return tool 2 as one of its tool_calls if it doesn't know it has the proper data? I've tried adding some instruction like:
$output = 'The tool output has been stored as an artifact placeholder ' . (string) $artifact;
but this doesn't seem to help. Any thoughts?
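For context, the substitution itself is the mechanical part; something like the sketch below would swap placeholders back in before the next tool executes (the function and storage are assumed, and note this does nothing to help the provider decide to call the tool, which is the actual problem described above):

```php
// Hypothetical substitution step run on tool arguments before execution.
// Matches tokens like {{artifact:scrape_website:1}}; artifact values are
// assumed to be strings here.
function resolve_artifacts(string $argument, array $artifacts): string {
  return preg_replace_callback(
    '/\{\{artifact:([^:}]+):(\d+)\}\}/',
    function (array $m) use ($artifacts): string {
      $key = $m[1] . ':' . $m[2];
      // Leave the raw placeholder in place if no artifact is stored.
      return $artifacts[$key] ?? $m[0];
    },
    $argument,
  );
}
```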
- 🇨🇦Canada b_sharpe
Just putting in the MR for visibility. I have not addressed the form item yet, so it artifacts EVERY tool currently, but I wanted to point out the real issue here with unknown output, as in my previous comment (#12).
I've gotten a little further: a single-run, multi-tool response is working and the artifact is getting set/replaced, but only about 40% of the time. I truly believe the only way this is going to work is for tools to specify output context, so that tools needing that context know it's coming from there regardless of whether the value is an artifact or not.
- 🇩🇪Germany marcus_johansson
Thanks for the updates @bryan - sorry about the late responses; I'm on vacation at the moment, so things are moving a little slowly on my end.
Return directly
I think this is a separate thing. The idea with this is that when you know that any tool's response is good enough to return to whatever consumer is using the agent, you do not need the agent to produce a textual response or run another loop to figure out if it's done. An example of this is a field validation agent using a tool for validating fields: once it has used that tool, you know 100% that this is the last thing it should do, so you stop and return there and then.
Also, so far in my tests, it appears the exclamation mark is not a good placeholder, as the AI providers often discard it, thinking it's a typo, so I've switched to
Great!
I'm not sure how the AI provider is going to return tool 2 as one of its tool_calls if it doesn't know it has the proper data? Are we expecting agent prompting to know this and reference the artifacts in the prompt?
It can only know this if the system prompt explains what the artifact will be used for and how, or if we add some way for the agent to temporarily read the artifact once per loop. In the end, it is up to the person setting up the agent to write a working prompt, something like this (very simplified, and it will not work every time):
You will have three tools at your disposal: one to scrape a website, one to summarize, and one to put the summary into a card component. It's important that you do the following. 1. Figure out if there is a website we can scrape; if not, just answer that you can't help unless they provide an actual website. 2. Use the scraping tool to scrape each website given. The output will be given as an artifact with a unique token, instead of the full HTML. 3. Summarize the websites using the summarizing tool. The input text to summarize should just be the unique token given for the website. 4. Use the card component generation tool to generate the component; create the title from the summary and use the summary as the description.
As far as I understand it, and it makes sense, the information in an artifact can never be important for the agent's decisions, or it can only be important if we have some way of adding it dynamically once per loop (if wanted). So the decision making in that case has to go into the system prompt. But I haven't explored the topic that extensively, so I could be wrong here.
I truly believe the only way this is going to work is for tools to specify output context, so that tools needing that context know it's coming from there regardless of whether the value is an artifact or not.
Just to get an understanding: what does output context refer to here? Could you give an example of how it would work? Or, if you have time and believe it's the way to go forward, try to test it, even outside of a Drupal context, to see if it helps?
I'll start adding some comments into the code so far - it looks great!
- 🇨🇦Canada b_sharpe
Ok, I'll go ahead with just assuming the user prompts will take care of it for now, as you suggested, and we can regroup after.
RE: Output context, something like what is being done with Tool API → where the tool defines both what it needs and what it provides with Context. That way we wouldn't need to rely so much on the prompting, as the artifact would at least know what it represents.
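Purely to illustrate the idea, declarations like the sketch below would let the framework wire artifacts between tools itself; none of these keys come from the actual Tool API, the shape is invented for this example:

```php
// Invented shape: each tool declares the context it needs and provides.
$tools = [
  'scrape_website' => [
    'needs' => ['url' => 'string'],
    'provides' => ['page_html' => 'string'],
  ],
  'summarize' => [
    // Because this declares it needs 'page_html', the framework could pass
    // the matching artifact directly instead of relying on the prompt.
    'needs' => ['page_html' => 'string'],
    'provides' => ['summary' => 'string'],
  ],
];
```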
- 🇨🇦Canada b_sharpe
Ok, I've added the form option now, along with some instructions on how to use it within the prompt. I tested with a multi-step function call, and it is doing the replacement properly and using it in subsequent calls.
- 🇺🇸United States michaellander
My understanding is that artifacts are generally something the AI creates. In our case we also want pointers to things that may already exist and that we are modifying. For example, if we ask the AI to create a node, to me that's an artifact; if we ask it to load a node, is it still an artifact? Even if in both cases we intend to modify and save them. I just want to make sure we are using the correct terminology, and I would love to find some precedent somewhere.