Collaboration with existing projects

Created on 7 March 2024, 11 months ago
Updated 8 July 2024, 7 months ago

Problem/Motivation

Project description lists

draws inspiration from two LLM modules that utilize ChatGPT (OpenAI), specifically the OpenAI / ChatGPT Integration and ChatGPT Content Assistant. However, these projects currently lack an abstraction layer for the utilization of other models.

But there are quite a few more modules integrating different LLMs. The full list is here:
#3346258-2: [META] Drupal could be great for building AI tools (like ChatGPT)

The largest module ecosystems are:

There is also a directly similar module, AI models library, which is deprecated in favor of having many separate modules, like these:
https://www.drupal.org/project/ollama
https://www.drupal.org/project/huggingface

Overall, there is the AI initiative: https://www.drupal.org/project/artificial_intelligence_initiative

We also recently discussed integrating many LLMs in these issues:
Add support in Augmentor AI for alternative AI-service(s) (Active)
Add support in AI Interpolator for Hugging Face (Fixed)
Add support for self-hosted AI solutions such as Ollama in OpenAI Assistants (Active)

I hope these links are helpful.

And I approve: your module is really needed. The only problem is that it will be a lot of work to support many LLMs in yet another module ecosystem.

Let's discuss here or in the referenced issues how to integrate a lot of LLMs while also deduplicating work and ensuring a coherent Drupal AI ecosystem.

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Feature request
Status

Fixed

Version

1.0

Component

Code

Created by

🇱🇹Lithuania mindaugasd


Comments & Activities

  • Issue created by @mindaugasd
  • 🇱🇹Lithuania mindaugasd

    Updated description (fixed link)

  • 🇱🇹Lithuania mindaugasd

    Also

    LLM Chat Example submodule enables simple chat interactions with the accessible LLMs

    there is a dedicated module for chat UI https://www.drupal.org/project/aichat

  • 🇬🇧United Kingdom seogow

    After reviewing other modules, I've concluded that the decision to deprecate the AI models library was justified. Consolidating every possible API call implementation into a single module would have been impractical.

    However, I see great alignment between this module and your vision of having numerous modules integrate AI APIs. Its design specifically accommodates various implementations, even allowing for competition among implementations of the same API, while providing a unified interface for AI backend/frontend modules. This approach ensures that these modules remain functional regardless of the API used. I suggest we start with the OpenAi Specification as our foundation.

    I am open to sharing this module's maintenance responsibilities, allowing for the addition and enhancement of the service interface as new integrations necessitate changes or new functionalities.

    I believe the best way forward would be to open this module to the authors of OpenAI / ChatGPT Integration and AI Interpolator, enabling them to provide a fully implemented OpenAI Provider and incorporate the abstraction into their modules. This would then unleash the power of all their submodules to everyone. I would be happy to provide Mistral and Anthropic Providers, so out of the box, we would have a wide range of functionality supported by three powerful LLMs.

  • 🇱🇹Lithuania mindaugasd

    1) Enabling the 'openai' module to swap its LLM (Proposal for Implementing LLM Abstraction in OpenAI Drupal Modules, Active) is a good step.

    2) The second step, as you mentioned, could be creating 'anthropic' and 'mistral' modules. Those modules could follow the 'huggingface' example, but instead of sub-modules (example here), they could provide an 'llm_provider' plugin without the need to enable any sub-module. So enabling the 'anthropic' module would automatically provide a plugin for 'llm_provider', giving a very simple structure and use.

    Completing the first two steps would allow users of the 'openai' module to quickly swap the provider for 'mistral' or 'anthropic'.

    3) The third step can be creating an llm_provider submodule called 'llm_provider_aichat'. It would use aichat for chatting with AI, and AIChat in turn would use llm_provider for integrating with all LLMs.

    4) As the fourth step, the huggingface and ollama modules would create llm_provider plugins within their own directories as well, giving aichat and openai an even wider choice of LLMs!

    I like this project more and more :-) It solves a lot for aichat, and the LLM swap for 'openai' is a great idea too.

    I suggest we start with the OpenAi Specification as our foundation.

    https://github.com/BerriAI/litellm is a Python library which managed to unify all LLMs (and images, and function calling, and everything else) into a single API, so it is a great blueprint for llm_provider as well. I guess it is the same as or very similar to the OpenAI specification.

    AI Interpolator, enabling them to provide a fully implemented OpenAI Provider and incorporate the abstraction into their modules

    Agreed. As with aichat, Interpolator will also benefit from integrating with all similar LLMs at once.

  • 🇱🇹Lithuania mindaugasd

    Here are some links to amazing things one is able to do with AI Interpolator:

    So if we talk about Python, Drupal is competitive in this regard.

  • 🇬🇧United Kingdom seogow

    I hope I will have time to investigate AI Interpolator more next week.

    But I agree with the plan: I shall write Providers for all the big LLMs and let the maintainers of the biggest Drupal AI ecosystems know that there is an interface available which can serve them all. Let's see what happens.

  • 🇬🇧United Kingdom robert castelo

    I think this is a promising approach.

  • 🇬🇧United Kingdom seogow

    I am working on the Big Three (Mistral, Anthropic and OpenAI) now. When that is done (I hope in April 2024), I will provide patches for the most interesting/useful existing 3rd party modules (hopefully at the same time).

  • 🇱🇹Lithuania mindaugasd

    I am looking forward to integrating LLM provider with AI chat user interface (Rebase on AI module's abstraction layer, Active). Recently I created a "default" AI chat user interface backend, which will just work by default (Provide default backend replacing example backend, Fixed), and now I would like to upgrade it to use all the services LLM provider offers.

    So there will be no need to create a submodule, because it can just work by default by enabling the aichat UI.

    But then we have a question: how to configure each service. Options:

    1. Have model configuration inline within the general Conversation entity type settings. This would be super convenient, because it would allow fine-tuning configuration inline for each type without needing to open other windows.
    2. Separate window configuration. This approach would be the same as in the Augmentor module. I explored integrating with Augmentor initially, as noted in this issue (Provide method to override Augmentor settings, Needs review). Augmentor was not usable for chat before; lately they upgraded the API, making this possible, but it is not optimal.
  • 🇬🇧United Kingdom seogow

    I believe the 'LLM Provider Service' module needs to provide just an interface, say a class to extend, for each LLM functionality, from chat to video generation.

    Huggingface still duplicates code (tokens/credentials storage).

    I envisage the workflow as follows:

    • An LLM Provider module (e.g. Drupal LLM Provider for the LM Studio API) extends the functionality-specific LLMProvider class(es) provided by the 'LLM Provider Service' module. Each class consists of compulsory API settings (access means such as credentials, token or URL), compulsory functional settings (e.g. text input, binary image data) and optional functional settings (everything else, along with default values and descriptions).
    • Any frontend which consumes the client via LLM Provider obtains this information from a Drupal service call, according to a selected model (above) and fills it in order to use the service. Exception is thrown if some compulsory information is missing or empty.
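
    The two steps above can be sketched in plain PHP. This is only an illustration under stated assumptions: the class names, method names and the example URL are hypothetical and are not taken from the llm_provider module. It shows a provider class declaring compulsory and optional settings, and a check that throws when compulsory information is missing or empty.

```php
<?php

declare(strict_types=1);

// Hypothetical names throughout: nothing here is the module's real API.

/** Thrown when a compulsory setting is missing or empty. */
final class MissingSettingException extends InvalidArgumentException {}

abstract class ChatProviderBase {

  /** Compulsory API settings (access means), e.g. ['api_key'] or ['url']. */
  abstract public function apiSettings(): array;

  /** Compulsory functional settings, e.g. ['prompt']. */
  abstract public function requiredInput(): array;

  /** Optional functional settings mapped to their default values. */
  abstract public function optionalSettings(): array;

  /** Validates compulsory keys, then fills in defaults for optional ones. */
  public function prepare(array $values): array {
    foreach (array_merge($this->apiSettings(), $this->requiredInput()) as $key) {
      if (!isset($values[$key]) || $values[$key] === '') {
        throw new MissingSettingException("Missing compulsory setting: $key");
      }
    }
    // Array union keeps caller-supplied values and adds missing defaults.
    return $values + $this->optionalSettings();
  }

}

/** Example provider for a locally running server (assumed settings). */
final class LocalServerChatProvider extends ChatProviderBase {

  public function apiSettings(): array {
    return ['url'];
  }

  public function requiredInput(): array {
    return ['prompt'];
  }

  public function optionalSettings(): array {
    return ['temperature' => 0.7];
  }

}

$provider = new LocalServerChatProvider();
$request = $provider->prepare(['url' => 'http://localhost:1234', 'prompt' => 'Hello']);
```

    A frontend would call prepare() with whatever the user filled in; leaving out 'prompt' or 'url' raises MissingSettingException before any request is made.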

    So to answer your question, IMHO having model configuration inline would be the best approach. The 'LLM Provider Service' module should not care if you use a wrapper entity or any other means of data management at the client side, but I like the usage of the Key module for credentials/keys because that way you can share these between models (e.g. different models in the OpenAI Stack).

    The reasoning for the above is that at the end of the day we want to enable users to use different credentials on the same site (say a user with the role 'OpenAI consumer' can use their own credentials to access the service via enabled frontend and store these via Key), and that is only possible with an inline approach.

    I hope the above makes sense.

  • 🇱🇹Lithuania mindaugasd

    That is a good and simple design.

    Maybe going forward, llm_provider could provide a standard form element, something like:

    // 'llm_provider' is the proposed (not yet existing) form element type.
    $form['configure_llm'] = [
      '#type' => 'llm_provider',
      '#title' => $this->t('AI model configuration form'),
      // ...configuration of the element...
    ];
    

    a little similar to https://www.drupal.org/project/inline_entity_form
    or similar to https://www.drupal.org/project/addressfield
    or, as the simplest example, https://www.drupal.org/project/multivalue_form_element

    So when a different model is chosen, ajax could load the configuration for that model.
    For example, https://platform.openai.com/docs/api-reference/chat/create has a great deal of configuration (which I never use :)
    It is a bit similar to AI chat UI backends, where each backend plugin has a customized config form inline (which has a little bug at the moment: 🐛 After choosing a different backend, api key options disappear, Active).
    Or a bit similar to addressfield, where choosing a different country gives a different address structure.
    (Depending on how they managed to accomplish standardization in https://github.com/BerriAI/litellm)

    And for even more abstraction later on, a field storage could be created, so that data storage could also be taken care of if needed,
    structured in a way a bit similar to https://www.drupal.org/project/datafield
    or saved as JSON text.

    It could be a 3 step development, each step increasing feature set:
    1) A class
    2) Form element
    3) Field storage

    Maybe the Key module could be made a requirement for using llm_provider, for standardization, but the openai module doesn't use it, so maybe not.

  • 🇱🇹Lithuania mindaugasd

    Maybe each module's (mistral, anthropic, huggingface...) global configuration form could be the same as the inline configuration form (sharing the same code).
    So there would be a global model configuration page, and overridden configuration within a specific context by using a form element.

  • 🇬🇧United Kingdom seogow

    I believe your suggestion about field storage is just the thing we need. I am going to update the code with a working example of how it can work.

  • 🇱🇹Lithuania mindaugasd

    But I think the form element and the field storage should be separate/independent, because I would like to implement llm_provider without creating a hard dependency on its data layer: without being required to implement its storage (adding its base field) and without making the modules (aichat and llm_provider) inseparable.

    For example, inline_entity_form is only a widget without its own storage and uses a generic reference field for storage. It can also be used as a form element without being a field widget.

    So the form could be implementable in a flexible way, like inline_entity_form is:

    1. as form element;
    2. as field widget (selectable for generic text field and storing data as json);
    3. maybe special field storage.
  • 🇱🇹Lithuania mindaugasd

    I just consulted ChatGPT: one can also create a soft dependency on field storage like this

        // Requires: use Drupal\Core\Field\BaseFieldDefinition;
        // Soft dependency: only add the base field when the contrib module is
        // enabled and the field type it provides is actually registered.
        $moduleHandler = \Drupal::service('module_handler');
        $fieldTypeManager = \Drupal::service('plugin.manager.field.field_type');
        if ($moduleHandler->moduleExists('name_of_contrib_module') && $fieldTypeManager->hasDefinition('contrib_field_type')) {
          // Add a custom field provided by a contrib module.
          $fields['your_custom_field'] = BaseFieldDefinition::create('contrib_field_type')
            ->setLabel(t('Your Contrib Field'))
            ->setDescription(t('A descriptive text for your contrib field.'))
            ->setSettings([
              // Specify any settings required by the contrib field type.
            ])
            ->setDisplayOptions('view', [
              'label' => 'hidden',
              'type' => 'default',
              'weight' => -4,
            ])
            ->setDisplayOptions('form', [
              'type' => 'default',
              'weight' => -4,
            ])
            ->setRequired(FALSE);
        }
    

    But removing such a dependency later on, once there is already data, might be complicated.

  • 🇱🇹Lithuania mindaugasd

    Also, configuration entities (bundle configurations) are not fieldable to begin with (for example, the "conversation type" itself), so it's only possible to store the data as JSON.
    So skipping custom storage and storing the data as JSON is the way to go.
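
    A minimal sketch of that JSON approach in plain PHP follows. The function names, array keys and the model name are hypothetical, chosen only for illustration; the resulting string could live in any plain text value, such as a property of a configuration entity.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers: names and keys are illustrative assumptions.

/** Serialises per-model settings into a JSON string for storage. */
function encodeModelConfig(string $provider, string $model, array $settings): string {
  return json_encode(
    ['provider' => $provider, 'model' => $model, 'settings' => $settings],
    JSON_THROW_ON_ERROR
  );
}

/** Restores stored settings; throws JsonException on malformed JSON. */
function decodeModelConfig(string $json): array {
  return json_decode($json, true, 512, JSON_THROW_ON_ERROR);
}

$stored = encodeModelConfig('mistral', 'example-model', ['temperature' => 0.2]);
$config = decodeModelConfig($stored);
```

    Because nothing but a string is stored, the consuming module needs no extra field type and keeps no hard dependency on llm_provider's data layer.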

  • 🇬🇧United Kingdom seogow

    I will introduce an API Configuration Entity.

    The plan is to walk away from any hardcoded API implementations completely, allowing users to define an API specification using the Entity API. It will still be possible to write a module with Entity configuration to get an instant API definition (e.g. manually or via configuration import), but it will not be necessary. The LLM Provider Service will itself provide default API Configuration Entity bundles and Entity instances for the most common APIs.

    The powers are:

    1. You, as a service subscriber, can use the service provided by this module with just ID of any API Configuration Entity available to you, which has bundle functionality compatible with your module requirements. It will just work out of the box.
    2. You are completely free to create a new API Configuration Entity or clone any existing API Configuration Entity available to you. All you need to make a successful API call is to provide the ID of the new entity you want to use.
    3. As Drupal provides all the tools for entities out of the box, you can:
      1. Use Views to organise and show API Configuration Entities.
      2. Use API Configuration Entities as the source of any select list.
      3. Perform CRUD operations on API Configuration Entities (according to your permissions).

    That way we allow users to use any available LLM API, even the ones which nobody hardcoded, via simply creating an API Configuration Entity with correct bundle, which would be defined by its functionality and compulsory API message input type(s) and API response types:

    1. Chat bundle: text | text
    2. Image generation bundle: text | binary data
    3. Image description bundle: binary data | text
    4. Embedding bundle: text | text
    5. Tokenizer bundle: text | text
    6. ...
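
    The bundle list above can be sketched as PHP 8.1 enums. This is a hedged illustration: the enum names (IoType, ApiBundle) and the helper compatibleBundles() are assumptions made up for the example, and only the five bundles listed so far are modelled.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: cases mirror the list above; all names are assumed.

enum IoType: string {
  case Text = 'text';
  case Binary = 'binary data';
}

enum ApiBundle {
  case Chat;
  case ImageGeneration;
  case ImageDescription;
  case Embedding;
  case Tokenizer;

  /** Compulsory API message input type for this bundle. */
  public function input(): IoType {
    return match ($this) {
      self::ImageDescription => IoType::Binary,
      default => IoType::Text,
    };
  }

  /** API response type for this bundle. */
  public function output(): IoType {
    return match ($this) {
      self::ImageGeneration => IoType::Binary,
      default => IoType::Text,
    };
  }
}

/** Returns the bundles whose input/output types match a consumer's needs. */
function compatibleBundles(IoType $in, IoType $out): array {
  return array_values(array_filter(
    ApiBundle::cases(),
    fn (ApiBundle $b): bool => $b->input() === $in && $b->output() === $out
  ));
}

$textToImage = compatibleBundles(IoType::Text, IoType::Binary);
```

    A consuming module could use such a lookup to offer only the entities whose bundle matches the input and output it can handle.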

    The beauty of this architecture is that if you want to provide any form/field sanitisation (e.g. input text length, input name, description etc.) in your frontend, you can always load the API Configuration Entity (by ID or name; again, Drupal has it all) and obtain the information from the actual object.

  • 🇬🇧United Kingdom seogow

    I am not thinking about configuration entities, but real entities.

    In fact, I shall employ this https://www.drupal.org/project/eck .

  • 🇱🇹Lithuania mindaugasd

    I don't like this new radical direction change with entities :-)

    Configuration entities:

    1. Configuration entities are in essence .yaml files. They don't have bundles, views or the other things which you have described. They do have some sort of overriding feature with helper modules, but as things have been evolving, I don't know all the current details (and maybe it's not relevant).
    2. We agreed that inline configuration would be great, but it would be gone under your new description.
    3. The result would be the same as what the augmentor module already does https://www.drupal.org/project/augmentor , so why a new module if it's almost the same?
    4. Users should not have to worry about complexity behind the scenes. APIs can be defined in code in a standard way, while users only need to configure the settings. There are some modules which allow configuring APIs in the UI, so one can already implement an API in the UI if preferred. I did not find them now, but they exist [I will update the comment when I find those modules]. I don't use them (yet), because writing code is a simpler way to implement an API.
    5. I am not analysing the other parts of your description, because those terms (bundles, views, fields) don't exist for configuration entities to begin with.

    Content entities:

    1. I don't know many (or any) modules which use ECK as their base (this description about developers might be from 10 years ago, as it also supports Drupal 7). Entities are best described in code without ECK (like I did with the aichat and aiprompt modules). ECK is great for site builders, but I don't think it is a solution to develop modules with.
    2. Content entities are content, while the configuration of an entity structure can be highly complex (hundreds of yaml files). I am not sure how you could keep that structure consistent when defining/sharing/cloning/exporting/importing it in code, keeping in mind that the content is in the database and depends on your structure. For example, after changing some field in code, you have to migrate the content... How would you keep it consistent?

    I suggest a different powerful feature of Drupal: "plugins". Maybe they could be useful for llm_provider. For example, all it takes to create a new backend for AI chat user interface is writing this little file: https://git.drupalcode.org/project/aichat/-/blob/1.0.x/modules/aichat_ba...

    But my plugin system will need to become even simpler as the feature set grows. One solution is to outsource various features to other modules, for example, outsourcing model configuration to llm_provider, to keep complexity low while continuing to grow features at the same time.

  • 🇬🇧United Kingdom seogow

    Oh, I thought you might not love it :) It seems simplistic at the surface, but it really isn't.

    Let me show you the dev implementation first (including one plugin for every LLM module which allows these right now) and then we can either move forward (beta), or deprecate this module in favour of some better LLM Service project/module. I do not want a mess, I want a go-to solution for everyone.

    I am not duplicating the Augmentor approach, nor the Interpolator approach. Not even the Huggingface approach. It is really about offering a standard interface that is agnostic to the backend API and, in this case, even to the backend implementation. The good thing is that both Augmentor and Interpolator accept plugins now. What LLM Provider Service will be good at is automatically providing plugins for everything when any new LLM API is added to it. It will be even better for modules which use its service directly.

    But as I've said - going to work now and ping you soon, when ready :)

  • 🇱🇹Lithuania mindaugasd

    deprecate this module in favour of some better LLM Service project/module

    The original version, up until the entities, is good, so it is not clear why it should be deprecated. Let's see how the prototype will work.

  • 🇬🇧United Kingdom seogow

    I have updated the module. It now allows for inline configuration and its interface is much clearer.

    I have decided not to include the entity-based configuration in the 'Provider Manager Service', but rather to create a 'Flexible LLM Provider' later (with Ollama and Mistral as examples). That way the 'Provider Manager Service' stays lightweight and non-opinionated.

    Feel free to use it now - if you see nothing wrong with it, I will publish 1.0. And I am of course happy to incorporate any changes which would make it better.

    The usage is simple (and shown in detail in the example module 'llm_provider_chat'):

    1. $instance->llmServiceManager = $container->get('llm_provider.manager') - this gives you the service;
    2. $this->llmServiceManager->getModels([Bundles::Chat->name]) - this gives you a list of all the available LLM Chat services in the Manager (without a parameter you simply get all types of LLM available). The example module provides a method which creates options for a Drupal Select from this array.
    3. $this->llmServiceManager->getResponse($provider, $model, $input, $authentication=[], $configuration=[], $normalise_io=TRUE) - here you make the call. As you can see, you can (but do not have to) change any configuration, including credentials.
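
    As an illustration of step 2, here is a sketch of turning a model list into select options. It does not use the module's own helper. The input shape (provider name mapped to a list of model names) is an assumption and may differ from what getModels() really returns; buildModelOptions(), the provider names and the model names are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Builds grouped select options from a provider => model-names map.
 *
 * Assumption: the input shape is invented for this sketch; check what
 * getModels() actually returns before relying on it.
 */
function buildModelOptions(array $models): array {
  $options = [];
  foreach ($models as $provider => $names) {
    foreach ($names as $name) {
      // The key encodes both parts so a submit handler can split it again.
      $options[$provider]["$provider:$name"] = $name;
    }
  }
  return $options;
}

$options = buildModelOptions([
  'lmstudio' => ['local-model'],
  'huggingface' => ['model-a', 'model-b'],
]);
```

    Drupal's select element renders nested option arrays as option groups, so each provider would appear as a group heading. The chosen key can then be split back into the $provider and $model arguments for getResponse().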
  • 🇬🇧United Kingdom seogow

    To test, simply enable Drupal LLM Provider for the LM Studio API along with running LM Studio as a server with any Chat LLM.

  • 🇱🇹Lithuania mindaugasd

    Priority improvements I would like to see:

    1. Works with Drupal out-of-the-box without installing a 3rd party app on the server.
    2. Unified/simplified configuration across different LLMs. For example, temperature to work the same predictable way. This can be accomplished either by:
      • Configuration being the same across LLMs; or
      • Configuration being different, but configuration form being provided out-of-the-box (inline) [this probably makes more sense, because users won't need to develop their own configuration forms each time; and enables configuration form to be adapted to service provider]
  • 🇬🇧United Kingdom seogow
    1. I have added support for Huggingface: just enable the module along with the LLM Provider manager and add your key at /admin/config/huggingface/settings.
    2. The module now has the ability to automatically generate configuration forms for any LLM Provider. The HuggingFace one is already implemented, and when the module is enabled, you can see it at: /admin/config/system/llm-provider-manager-settings.
    3. All the configurations are now managed by Drupal, and are fully editable inline at call time. Common settings like 'temperature' or 'top_k' are shared between models; see the implementation at /admin/config/llm-provider-chat-example.

    Let me know what you think.

  • Assigned to seogow
  • 🇬🇧United Kingdom seogow
  • 🇱🇹Lithuania mindaugasd

    Thanks, looks good to me for now. The next step is to try integrating with aichat (Rebase on AI module's abstraction layer, Active); I don't know when yet.

  • Status changed to Closed: won't fix 7 months ago
  • 🇬🇧United Kingdom seogow

    No further development will be done in this module; it has been deprecated in favour of the AI module.

  • Status changed to Fixed 7 months ago
  • Automatically closed - issue fixed for 2 weeks with no activity.
