Create Document Loader Normalization Layer

Created on 5 June 2025

Problem/Motivation

We already normalize many operation types for standardized processes within AI, but one important piece is still missing: a way to provide document context to the AI. Some examples of how such context can be created:

  1. Load a PDF to markdown
  2. Load an Excel file to CSV
  3. Load a Google Sheet to CSV
  4. Scrape a webpage and get the HTML
  5. Screenshot a webpage and get the image
  6. Do a search and get links

There is a multitude of services for this, and depending on your budget, hosting possibilities and other factors, you may have reason to pick one over another, in the same way that you pick an AI model.

Since we want to ship things like agents or automators in recipes, we do not want them to be hardcoded to a specific service, the same way as with the LLM. If an agent that creates content types, for instance, should be able to look at a website for inspiration, we do not care whether you use ScrapingBot, Guzzle, Chrome or something else - it's for the end user to figure out the best way of doing it according to budget, hosting, stability, quality etc. We just care that it's a possibility.

To make this possible we need a new plugin system, plus a way of setting the default document loader for a specific type, either in the AI module or from the outside.
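
To illustrate the decoupling described above, here is a minimal, language-neutral sketch in Python (the module itself would implement this as Drupal plugins in PHP). Every name in it (`WebpageHtmlLoader`, `PlainHttpLoader`, `HeadlessBrowserLoader`, `gather_inspiration`) is hypothetical, and the loaders return canned strings instead of calling a real service.

```python
from typing import Protocol


class WebpageHtmlLoader(Protocol):
    """Hypothetical contract for the 'scrape a webpage and get the HTML' type."""

    def load(self, url: str) -> str:
        ...


class PlainHttpLoader:
    """Stand-in for a simple HTTP-client based loader (no request is made)."""

    def load(self, url: str) -> str:
        return f"<html><!-- fetched {url} over plain HTTP --></html>"


class HeadlessBrowserLoader:
    """Stand-in for a browser based loader (no browser is started)."""

    def load(self, url: str) -> str:
        return f"<html><!-- rendered {url} in a headless browser --></html>"


def gather_inspiration(loader: WebpageHtmlLoader, url: str) -> str:
    """Agent-side code: depends only on the contract, never on a service."""
    return loader.load(url)


# The site owner decides which loader is the default; the agent code above
# does not change when this line does.
site_default: WebpageHtmlLoader = HeadlessBrowserLoader()
html = gather_inspiration(site_default, "https://example.com")
```

Swapping `HeadlessBrowserLoader()` for `PlainHttpLoader()` changes how the page is obtained without touching `gather_inspiration`, which is the property the recipe-shipped agents would rely on.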

Proposed resolution

This is up for discussion, so if you see a better path forward, please write it in the comments:

Create a new plugin type, plugin interface, attribute and plugin manager for a plugin called DocumentLoader.
Create an abstract class for it.
It needs a way to expose a config form, probably via PluginFormInterface.
At the attribute level, you should be able to set label, description, document_loader_type (array) and output_type (array).
Create an equivalent of the operation types, called document loader types. They will reside in src/DocumentLoaderType.
Create a DocumentLoaderInput interface, a DocumentLoaderOutput interface and a DocumentLoaderType interface. They can be empty for now, just for classification.
DocumentLoaderOutput should have outputs for text, markdown, csv and html, even if not all of them work.
Create an attribute called DocumentLoaderType where you can set a label, a description and an interface that extends DocumentLoaderTypeInterface.
Create a settings page where you can set the default DocumentLoader plugin per DocumentLoaderType.
Create a helper method in the plugin manager to get the default loader for any type.
Create one example of a PDF to Document loader, where the document output can be text, markdown, csv and html.
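
The steps above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the proposed PHP implementation: a decorator stands in for the plugin attribute, a small class stands in for the plugin manager, and the `pdf_to_document` type ID, the `example_pdf` plugin ID and all class and method names are assumptions made for the example. No real PDF parsing takes place.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DocumentLoaderOutput:
    """Carries every output format; formats a loader cannot fill stay None."""
    text: Optional[str] = None
    markdown: Optional[str] = None
    csv: Optional[str] = None
    html: Optional[str] = None


@dataclass
class Definition:
    """Metadata that the plugin attribute would carry."""
    plugin_id: str
    label: str
    description: str
    document_loader_type: List[str]
    output_type: List[str]
    cls: type


class DocumentLoaderManager:
    """Hypothetical stand-in for the plugin manager plus the settings page."""

    def __init__(self) -> None:
        self._definitions: Dict[str, Definition] = {}
        self._defaults: Dict[str, str] = {}

    def register(self, plugin_id, label, description,
                 document_loader_type, output_type):
        """Decorator playing the role of the DocumentLoader attribute."""
        def decorator(cls):
            self._definitions[plugin_id] = Definition(
                plugin_id, label, description,
                list(document_loader_type), list(output_type), cls)
            return cls
        return decorator

    def set_default(self, loader_type: str, plugin_id: str) -> None:
        """What the settings page would store: one default per loader type."""
        definition = self._definitions[plugin_id]
        if loader_type not in definition.document_loader_type:
            raise ValueError(f"{plugin_id} does not support {loader_type}")
        self._defaults[loader_type] = plugin_id

    def get_default_loader(self, loader_type: str):
        """Helper: instantiate the configured default for a loader type."""
        plugin_id = self._defaults.get(loader_type)
        if plugin_id is None:
            raise LookupError(f"No default loader set for {loader_type}")
        return self._definitions[plugin_id].cls()


manager = DocumentLoaderManager()


@manager.register(
    plugin_id="example_pdf",
    label="Example PDF loader",
    description="Placeholder loader that fakes PDF extraction.",
    document_loader_type=["pdf_to_document"],
    output_type=["text", "markdown"],
)
class ExamplePdfLoader:
    def load(self, path: str) -> DocumentLoaderOutput:
        body = f"Contents of {path}"  # placeholder, nothing is parsed
        return DocumentLoaderOutput(text=body, markdown=f"# {path}\n\n{body}")


manager.set_default("pdf_to_document", "example_pdf")
loader = manager.get_default_loader("pdf_to_document")
output = loader.load("report.pdf")
```

Calling code only asks the manager for the default loader of a type and reads whichever output formats are populated. In this sketch `output.text` and `output.markdown` are set while `output.csv` and `output.html` remain `None`, which is one possible way to handle formats a given loader does not support.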

Remaining tasks

User interface changes

API changes

Data model changes

Feature request
Status

Active

Version

1.2

Component

AI Core module

Created by

🇩🇪Germany marcus_johansson
