- Issue created by @marcus_johansson
We currently normalize many operation types for standardized processes within AI, but one really important part is still missing: being able to provide document context to the AI. Some types where you can create this are, for instance:
There are a multitude of services for this, and depending on your budget, hosting possibilities and other factors, you may have reason to pick one over another, in the same way that you pick an AI model.
Since we want to ship things like agents or automators in recipes, we do not want them to be hardcoded to a specific service, the same way as with the LLM. If an agent that creates content types, for instance, should be able to look at a website for inspiration, we do not care whether you use ScrapingBot, Guzzle, Chrome or something else. It's up to the end user to figure out the best way of doing it according to budget, hosting, stability, quality, etc.; we just care that it's a possibility.
To be able to do this, we need a new plugin system and a way of setting your default document loader for a specific type, either in the AI module or from the outside.
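As a rough illustration of the decoupling described above, here is a minimal, language-neutral sketch in Python (the actual implementation would be Drupal PHP plugins). Every name in it (`WebPageLoader`, `FakeHttpLoader`, `FakeHeadlessBrowserLoader`, `inspiration_for_agent`) is hypothetical, and the two loaders are stand-ins that make no network calls:

```python
from typing import Protocol


class WebPageLoader(Protocol):
    """Hypothetical contract: anything that turns a URL into text."""

    def load(self, url: str) -> str: ...


class FakeHttpLoader:
    """Stand-in for a plain HTTP client based loader (no network access)."""

    def load(self, url: str) -> str:
        return f"[http] contents of {url}"


class FakeHeadlessBrowserLoader:
    """Stand-in for a headless browser based loader (no browser is started)."""

    def load(self, url: str) -> str:
        return f"[browser] contents of {url}"


def inspiration_for_agent(loader: WebPageLoader, url: str) -> str:
    """The agent depends only on the contract, never on a concrete service."""
    return "Inspiration: " + loader.load(url)
```

Swapping `FakeHttpLoader()` for `FakeHeadlessBrowserLoader()` changes how the page is fetched without touching the agent code, which is the property the recipe-shipped agents need.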
This is up for discussion, so if you see a better path forward, please write it in the comments:
Create a new plugin type, plugin interface, attribute and plugin manager for a plugin called DocumentLoader
Create an abstract class for it.
It needs to have a way to expose a config form, probably using PluginFormInterface.
At the attribute level you should be able to set label, description, document_loader_type (array) and output_type (array).
Create a copy of the operation types called document loader types - they will reside in src/DocumentLoaderType
Create a DocumentLoaderInput interface, DocumentLoaderOutput interface and DocumentLoaderType interface - they can be empty for now, just for classification.
DocumentLoaderOutput should have outputs for text, markdown, csv and html, even if not all of them work.
Create an attribute called DocumentLoaderType where you can set a label, description and an interface that extends DocumentLoaderTypeInterface.
Create a settings page where you can set default DocumentLoader plugins per DocumentLoaderType
Create a helper method in the plugin manager to get the default loader for any type.
Create one example of a PDF to Document, where the document output can be text, markdown, csv and html.
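To make the steps above easier to discuss, here is a minimal sketch of how the pieces could fit together, written in Python as a language-neutral stand-in for the PHP plugin system. It is an assumption-heavy illustration, not the proposed implementation: `LoaderDefinition` stands in for the attribute, `DocumentLoaderManager` for the plugin manager plus the defaults stored by the settings page, and `ExamplePdfLoader` is a placeholder that does not actually parse PDFs. The type ID `pdf_to_document` and plugin ID `example_pdf` are likewise made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class DocumentLoaderOutput:
    """Hypothetical output holder with one field per proposed format."""
    text: str = ""
    markdown: str = ""
    csv: str = ""
    html: str = ""


class ExamplePdfLoader:
    """Placeholder loader: it does NOT parse PDFs, it only fakes output."""

    def load(self, path: str) -> DocumentLoaderOutput:
        title = f"Contents of {path}"
        return DocumentLoaderOutput(
            text=title,
            markdown=f"# {title}",
            csv=f"title\n{title}",
            html=f"<h1>{title}</h1>",
        )


@dataclass(frozen=True)
class LoaderDefinition:
    """Stands in for the proposed attribute fields."""
    label: str
    description: str
    document_loader_type: Tuple[str, ...]
    output_type: Tuple[str, ...]
    factory: Callable[[], object]


class DocumentLoaderManager:
    """Toy plugin manager that also stores the per-type defaults."""

    def __init__(self) -> None:
        self._definitions: Dict[str, LoaderDefinition] = {}
        self._defaults: Dict[str, str] = {}

    def register(self, plugin_id: str, definition: LoaderDefinition) -> None:
        self._definitions[plugin_id] = definition

    def set_default(self, loader_type: str, plugin_id: str) -> None:
        # Only allow a default that declares support for the given type.
        definition = self._definitions[plugin_id]
        if loader_type not in definition.document_loader_type:
            raise ValueError(f"{plugin_id} does not support {loader_type}")
        self._defaults[loader_type] = plugin_id

    def get_default_loader(self, loader_type: str):
        """Helper: instantiate the default loader for a loader type."""
        plugin_id = self._defaults[loader_type]
        return self._definitions[plugin_id].factory()


manager = DocumentLoaderManager()
manager.register("example_pdf", LoaderDefinition(
    label="Example PDF loader",
    description="Fake PDF to Document loader for illustration.",
    document_loader_type=("pdf_to_document",),
    output_type=("text", "markdown", "csv", "html"),
    factory=ExamplePdfLoader,
))
manager.set_default("pdf_to_document", "example_pdf")
output = manager.get_default_loader("pdf_to_document").load("report.pdf")
```

Calling code only asks the manager for the default loader of a type, so a site can change the default on the settings page without any agent or automator being aware of it.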
Active
1.2
AI Core module