Add artifacts to agents

Created on 6 June 2025

Problem/Motivation

Currently we expose all output and all parameters as text in the chat history. This works with most agents, but there are use cases where you want to work with larger data chunks that only matter to specific tools, not to the history of the chat.

Think of the following scenario:

You have a tool that can scrape websites, a tool that can extract links from a website, and a tool that can screenshot websites. Your agent's task is to scrape a webpage and take a screenshot of all the links on it.

In reality the final history after all loops looks something like this (the system prompt being something verbose describing the above):

User: Can you scrape https://drupal.org and take screenshots of all the external links
---------------------------------------------------------------
Assistant: I will start by scraping the website
Tool Usage: scrape(https://drupal.org) with tool_id 1
---------------------------------------------------------------
Tool: <html><head><title>Drupal.org</title></head><body>[Loads of HTML with links]</body></html>
Tool_id: 1
---------------------------------------------------------------
Assistant: I will now extract the links
Tool Usage: extract_links(<html><head><title>Drupal.org</title></head><body>[Loads of HTML with links]</body></html>) with tool_id 2
---------------------------------------------------------------
Tool: https://wordpress.org, https://joomla.org, https://dri.es
Tool_id: 2
---------------------------------------------------------------
Assistant: I will now screenshot the links:
Tool Usage: screenshot(https://wordpress.org, https://joomla.org, https://dri.es) with tool_id 3
---------------------------------------------------------------
Tool: file_id 1, 2, 3
Tool_id: 3
---------------------------------------------------------------
Assistant: I have taken the screenshots, here they are for https://joomla.org <img src="url to 1">.....

This will work, however the problem is that if the scraped website is around 50k tokens, that HTML appears in the history twice (once as the tool output, once as a tool call parameter) and is re-sent on each of the 3 loops - around 300k tokens in total, which can cost close to ~1 USD/EUR depending on your provider, and some providers cannot even handle a context that large.

And in this case the LLM never even needs to read the actual HTML.

What if instead we could do:

User: Can you scrape https://drupal.org and take screenshots of all the external links
---------------------------------------------------------------
Assistant: I will start by scraping the website
Tool Usage: scrape(https://drupal.org) with tool_id 1
---------------------------------------------------------------
Tool: !artifact:1
Tool_id: 1
---------------------------------------------------------------
Assistant: I will now extract the links
Tool Usage: extract_links(!artifact:1) with tool_id 2
---------------------------------------------------------------
Tool: https://wordpress.org, https://joomla.org, https://dri.es
Tool_id: 2
---------------------------------------------------------------
Assistant: I will now screenshot the links:
Tool Usage: screenshot(https://wordpress.org, https://joomla.org, https://dri.es) with tool_id 3
---------------------------------------------------------------
Tool: file_id 1, 2, 3
Tool_id: 3
---------------------------------------------------------------
Assistant: I have taken the screenshots, here they are for https://joomla.org <img src="url to 1">.....

All of a sudden we have the same results, but we saved around 300k tokens.

Proposed resolution

  • First decide whether artifact is the correct name - this is the naming in LangChain, but artifacts in Claude are something else.
  • Create an artifact interface with get, set, and id methods (see the sketch after this list).
  • In the AiAgentForm for tools, add a per-tool setting so that the tool's output can be treated as an artifact.
  • In the AiAgentEntityWrapper, make sure to store the output as an artifact after the tool is run.
  • In the AiAgentEntityWrapper, make sure to replace any artifact reference with the real value.
  • Make sure to add the artifacts to the events.
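
As a starting point, the interface could look something like the sketch below. All names here (the Drupal\ai_agents\Artifact namespace, ArtifactInterface, getId(), getValue(), setValue()) are illustrative assumptions, not a final API:

<?php

namespace Drupal\ai_agents\Artifact;

/**
 * Sketch of an artifact value object (all names are assumptions).
 */
interface ArtifactInterface {

  /**
   * Returns the artifact id, e.g. "!artifact:scrape:1".
   */
  public function getId(): string;

  /**
   * Returns the raw stored value, e.g. the scraped HTML.
   */
  public function getValue(): mixed;

  /**
   * Sets the raw value to store.
   */
  public function setValue(mixed $value): void;

}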

Remaining tasks

User interface changes

API changes

Data model changes

✨ Feature request
Status

Active

Version

1.2

Component

Code

Created by

🇩🇪 Germany marcus_johansson


Comments & Activities

  • Issue created by @marcus_johansson
  • 🇬🇧 United Kingdom yautja_cetanu
  • 🇺🇸 United States Kristen Pol Santa Cruz, CA, USA

    Switching to the correct tag

  • 🇨🇦 Canada b_sharpe

    I can likely work on this but a few things:

    • Artifact seems like the wrong term. "Tool Result", "Tool Context", "Function Call Result", etc. might be more appropriate. (Going to use artifact for the sake of clarity in the next points.)
    • How does a tool decide to use an existing artifact or create a new one? Using the example above, let's say you want to take a new screenshot - how do we tell the tool to run again instead of re-using the data?
    • Can there be multiple artifacts per tool per chat? Are artifacts tied to the ThreadID?
  • 🇩🇪 Germany marcus_johansson

    Awesome @b_sharpe!

    Artifact is a weird word, but looking into it, it seems to be the actual term used for this. It originally came from Anthropic and, now that I think about it, was actually used for the same kind of thing, but now it's used elsewhere too, see for instance: https://python.langchain.com/docs/how_to/tool_artifacts/ and https://google.github.io/adk-docs/artifacts/.

    CrewAI uses Artefact, but I guess that is just American vs British English. Since we use cspell with American English, we should keep the American spelling.

    Since we will be exposing tools and results via MCP and A2A to the outside world, we should use the same vocabulary as the rest of them, at least for the data name - for the visible name within Drupal we could make up our own.

    Regarding the decision - I think the best course for now is that this always happens when configured, because the human who sets up the agent knows whether it's needed or not. As far as I can see, the decision basically comes down to four considerations:

    1. Does the output of this tool only matter to other tools and not to the agent itself? If yes, it can be an artifact.
    2. Will the output of the tool be so large that it affects the context length and starts causing hallucinations? Scraped websites, for instance. If yes, it should be an artifact.
    3. Does the content itself include instructions - like a food recipe, for instance - causing the agent to suffer instruction fatigue or, even worse, change its course of action? If yes, it should be an artifact.
    4. Is the output in such a weird format that the agent has a hard time understanding how to pass it along, even with a good system prompt? A binary, for instance. If yes, it should be an artifact.

    You could in theory have subagents make these decisions, but given what we have seen about not all agents being reliable, it makes sense to make this a human decision.

    This is why the issue is in AI Agents rather than the AI module. This should be another option you can set on the tool when you set it up, so you check a checkbox that says "Artifact Output" with a description of when you should enable it (a rough sketch follows below).
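
    As a rough illustration (the form key, variable names, and wording are assumptions; only the Form API structure is standard Drupal):

    // Hypothetical checkbox on the tool settings in AiAgentForm.
    $form['artifact_output'] = [
      '#type' => 'checkbox',
      '#title' => $this->t('Artifact Output'),
      '#description' => $this->t('Store the output of this tool as an artifact and only pass a reference like "!artifact:scrape:1" into the chat history. Enable this for large outputs, outputs only consumed by other tools, outputs containing instructions, or formats the LLM cannot pass along.'),
      '#default_value' => $settings['artifact_output'] ?? FALSE,
    ];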

    We have something called AiDataTypeConverter plugins in the AI module - plugins that hook in while the tool parameters are being filled in, doing something similar to route argument upcasting, for anyone who knows routing. This makes it possible, for instance, to set a ContextDefinition type to entity, have the agent answer "node:1", and in the tool get back the actual entity object.

    We should use this to create a data type converter for artifacts that looks for a magic prefix, something like "!artifact:", and then upcasts it. Since we use "{word}:" for entity upcasting, I think the exclamation mark, or something else, makes sense in case someone for some reason creates an entity type called artifact.
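
    A minimal sketch of what such a converter could do - the class name and storage service here are hypothetical, and the real AiDataTypeConverter plugin interface may look different:

    // Hypothetical converter: upcasts "!artifact:..." references to real values.
    class ArtifactDataTypeConverter {

      public function __construct(
        // Assumed storage service returning the ArtifactInterface sketched
        // in the issue summary.
        protected ArtifactStorageInterface $storage,
      ) {}

      // Only act on strings carrying the magic prefix.
      public function applies(mixed $value): bool {
        return is_string($value) && str_starts_with($value, '!artifact:');
      }

      // "!artifact:scrape:1" -> the stored value (e.g. the scraped HTML).
      public function convert(string $value): mixed {
        return $this->storage->load($value)->getValue();
      }

    }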

    I will add this information to the issue, since this functionality was added after we wrote the issue.

    A tool can be run multiple times, and you can have artifact output on multiple tools, so multiple artifacts should be possible. I think for contextual understanding, just naming them "!artifact:{function_name}:{incremental_number}" makes sense. They will be given in the tool result as well, so the agent should have no problem gluing it together, especially with a good system prompt.
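
    Purely as an illustration of that naming scheme, generating the ids could be as simple as:

    // Produces "!artifact:{function_name}:{incremental_number}" ids.
    function next_artifact_id(array &$counters, string $function_name): string {
      $counters[$function_name] = ($counters[$function_name] ?? 0) + 1;
      return sprintf('!artifact:%s:%d', $function_name, $counters[$function_name]);
    }

    // next_artifact_id($counters, 'scrape') -> "!artifact:scrape:1"
    // next_artifact_id($counters, 'scrape') -> "!artifact:scrape:2"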

  • 🇩🇪 Germany marcus_johansson

    Made some changes to the main issue as well - check if it's understandable now, otherwise I'll try to scope it out more.

  • 🇨🇦 Canada b_sharpe

    Ok, I think that all makes sense. I'll take a stab at it and report back.

  • 🇨🇦 Canada b_sharpe

    @marcus_johansson: I just noticed the option for tools:

    Return directly

    Check this box if you want to return the result directly, without the LLM trying to rewrite it or use another tool. This is usually used for tools that are not used in a conversation, or when it's being used in an API where the tool output is the structured result.

    Do we see this as separate? Or should we be refactoring this to instead be the use-case for artifacts?
