Ollama LLM Provider

Created on 10 June 2024, 6 months ago
Updated 22 July 2024, 4 months ago

Problem/Motivation

Add an Ollama provider to the core AI module's LLM abstraction layer, as one of the important LLMs to support out of the box.

Remaining tasks

Discuss

  • 🔲 Should the Ollama provider be in core? If yes, communicate this with @orkut-murat-yılmaz and also maybe get him on board
  • 🔲 Should AI Interpolator rules and Search API AI Plugins be part of core?
  • More? (some issues have been raised, please feel free to add them here)

📌 Task

Status: Fixed
Version: 1.0
Component: Code
Created by: 🇩🇪Germany marcus_johansson

Merge Requests

Comments & Activities

  • Issue created by @marcus_johansson
  • 🇩🇪Germany marcus_johansson

    If we decide that this should be in core, we need to communicate this with @orkut-murat-yılmaz and also maybe get him onboard.

    We also need to decide if the AI Interpolator rules and Search API AI Plugins are part of core as well. I would say it makes sense to have something that does something out of the box.

  • 🇱🇹Lithuania mindaugasd

    Commented on this here: 📌 [meta] Discussion: what LLM providers to include Active, and here: ✨ Create AI ecosystem "add-ons" page Active.

  • 🇬🇧United Kingdom yautja_cetanu

    Features Ollama might need:

    • Some ability to do content moderation or use another LLM: https://www.drupal.org/project/ai/issues/3454452 ✨ [META] Create an AI Security module for custom moderation calls Active
    • Ability to find models that work with Ollama, either in some kind of browser or by copying and pasting IDs from an external site
    • Ability to just download and install a model directly on the server
    • Ability to connect to another model hosted on another server using Ollama
    • Perhaps the ability for the Drupal site to have remote control over the other Ollama server, so it can download and set up models directly on it (see the sketch below)
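
    For the remote-control idea, a minimal sketch of what a Drupal site (or any HTTP client) could send to a remote Ollama server; the endpoints and payloads follow the public Ollama REST API, but the hostname is only a placeholder:

    # List the models the remote Ollama server already has
    curl http://ollama.example.com:11434/api/tags

    # Ask the remote server to download a model into its own library
    # (progress is streamed back as JSON lines)
    curl http://ollama.example.com:11434/api/pull -d '{"name": "llama3"}'

    # Remove a model from the remote server again
    curl -X DELETE http://ollama.example.com:11434/api/delete -d '{"name": "llama3"}'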
  • 🇱🇹Lithuania mindaugasd

    A few more tasks:

    • Prepare information on what kind of hardware is needed to run Ollama, and how much it costs
    • Document how to, or code a feature to, shut down the GPU instance when Ollama is not in use

    By completing these tasks, we could figure out how many people can afford this in practice, and how cost-effective it can be to run.

    One real use case of this module: installing it locally for people who have a decent GPU on their local computer.

    Because of these constraints, it should probably be outside of the AI module (not included), unless we find out that it can be practical for most people.

  • 🇬🇧United Kingdom yautja_cetanu

    That is worth doing, but a couple of things:

    • Ollama can run a lot of open-source models. Some are very tiny and can run on any server without a GPU
    • This provider here will be a better fit for what you brought up. For people installing it locally with a decent GPU, LM Studio's UI is just so much easier than Ollama. https://www.drupal.org/project/ai/issues/3453592 📌 LM Studio LLM Provider Active
    • I think Ollama will eventually be for organisations that really want this for privacy reasons. Documenting pricing, especially on your AI Initiative page, will be very helpful. Knowing what people have now is probably not helpful, as many clients are talking about wanting self-hosted AI anyway because the privacy is worth whatever cost.
    • I think an Ollama implementation is much more likely to be included in Starshot. Gabor was inspired by: https://hacks.mozilla.org/2024/05/experimenting-with-local-alt-text-gene...
  • 🇱🇹Lithuania mindaugasd

    Some are very tiny

    What can one do with a tiny model? Maybe some specialized automation.

    worth whatever cost

    Does it need to be included in Drupal CMS for everybody then? For clients who have enough resources for this, agencies/developers can set it up for them.

    Gabor was inspired

    In general, people show demand to experiment with local AI. So if there is demand for whatever reason, it could be included.

    included in Starshot

    Another question is how to make it easy enough and accessible to regular Drupal CMS users. How does one actually install it on the server? How much knowledge, investment and experience does it require?
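
    On the "how does one install it on the server" question, the Linux install documented on ollama.com is a one-liner that also registers a systemd service; a rough sketch (the model name is only an example):

    # Install Ollama on a Linux server (sets up the ollama systemd service)
    curl -fsSL https://ollama.com/install.sh | sh

    # Check that the service is listening on the default port
    curl http://127.0.0.1:11434

    # Download a model so there is something to serve
    ollama pull llama3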

  • Status changed to Needs review 5 months ago
  • 🇩🇪Germany marcus_johansson

    So an initial version of this provider is done and can be tested on DEV.

    The first version has no controls for Ollama itself, like pulling/deleting models. That has to be done via the command line at the moment. But as soon as that is done, it's usable.

    Someone should test it with the explorers.

    It supports chat and embed for now. Text completion will come when I get the time to add it.
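
    Until the provider can manage models itself, a sketch of the standard Ollama CLI commands for that (the model name is only an example):

    # Download (pull) a model into the local Ollama library
    ollama pull llama3

    # See which models are installed
    ollama list

    # Remove a model again
    ollama rm llama3

    # The provider reads the same list over the HTTP API
    curl http://127.0.0.1:11434/api/tags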

  • 🇩🇪Germany marcus_johansson

    Regarding "document how to, or code a feature to, shut down the GPU instance when Ollama is not in use":

    This should be done in a host solution like a Runpod module or something similar. I have been thinking about building such a solution. Anyway, it should not be in the AI module; the external modules can talk to the AI module for events.

  • Status changed to Needs work 5 months ago
  • 🇩🇰Denmark ressa Copenhagen

    Thanks @Marcus_Johansson, I tried the module, and almost got it working ...

    System

    • Debian 12
    • DDEV 1.23.2

    Modules

    I enabled these modules:

    • AI Core
    • Key
    • AI API Explorer
    • Ollama Provider

    drush in ai key ai_api_explorer provider_ollama

    I see that the Key module is a requirement of the ai module, but Ollama has no need for a key ... maybe the requirement should be set on the individual providers instead?

    $ grep -iR -A 2 "dependencies:" .
    ./ai.info.yml:dependencies:
    ./ai.info.yml-  - key:key
    

    Setup Ollama Authentication

    I entered:

    • Host Name: http://127.0.0.1
    • Port: 11434

    Ollama

    Ollama is available:

    $ curl 127.0.0.1:11434
    Ollama is running
    

    Two Ollama models are installed:

    $ ollama list
    NAME                 	ID          	SIZE  	MODIFIED    
    dolphin-llama3:latest	613f068e29f8	4.7 GB	2 weeks ago	
    llama3:latest        	365c0bd3c000	4.7 GB	3 weeks ago	
    

    Two Ollama models are available via command line:

    $ curl 127.0.0.1:11434/api/tags
    {"models":[{"name":"dolphin-llama3:latest","model":"dolphin-llama3:latest","modified_at":"2024-06-06T22:40:03.181858319+02:00","size":4661235994,"digest":"613f068e29f863bb900e568f920401b42678efca873d7a7c87b0d6ef4945fadd","details":{"parent_model":"","format":"gguf","family":"llama","families":["llama"],"parameter_size":"8B","quantization_level":"Q4_0"}},{"name":"llama3:latest","model":"llama3:latest","modified_at":"2024-05-27T12:53:40.033272983+02:00","size":4661224676,"digest":"365c0bd3c000a25d28ddbf732fe1c6add414de7275464c4e4d1c3b5fcb5d8ad1","details":{"parent_model":"","format":"gguf","family":"llama","families":["llama"],"parameter_size":"8.0B","quantization_level":"Q4_0"}}]}

    AI Chat Explorer

    When I select Ollama from the dropdown, "Provider Configuration" pops up but is empty, and I get the error "Error message -- Oops, something went wrong. Check your browser's developer console for more details.", where I guess the two models should have been presented?

    From the console:

    XHRPOST
    https://drupal10.ddev.site/admin/config/ai/development/chat-generation?ajax_form=1&_wrapper_format=drupal_ajax
    [HTTP/2 500 Internal Server Error 55ms]
      
    POST
      https://drupal10.ddev.site/admin/config/ai/development/chat-generation?ajax_form=1&_wrapper_format=drupal_ajax
    Status
    500
    Internal Server Error
    VersionHTTP/2
    Transferred4.69 kB (4.32 kB size)
    Referrer Policystrict-origin-when-cross-origin
    Request PriorityHighest
    
    Uncaught 
    Object { message: "\nAn AJAX HTTP error occurred.\nHTTP Result Code: 500\nDebugging information follows.\nPath: /admin/config/ai/development/chat-generation?ajax_form=1\nStatusText: Internal Server Error\nResponseText: The website encountered an unexpected error. Try again later.GuzzleHttp\\Exception\\ConnectException: cURL error 7: Failed to connect to 127.0.0.1 port 11343 after 0 ms: Couldn't connect to server (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for http://127.0.0.1:11343/api/tags in GuzzleHttp\\Handler\\CurlFactory::createRejection() (line 210 of /var/www/html/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php). GuzzleHttp\\Handler\\CurlFactory::finishError(Object, Object, Object) (Line: 110)\nGuzzleHttp\\Handler\\CurlFactory::finish(Object, Object, Object) (Line: 47)\nGuzzleHttp\\Handler\\CurlHandler->__invoke(Object, Array) (Line: 28)\nGuzzleHttp\\Handler\\Proxy::GuzzleHttp\\Handler\\{closure}(Object, Array) (Line: 48)\nGuzzleHttp\\Handler\\Proxy::GuzzleHttp\\Handler\\{closure}(Object, Array) (Line: 35)\nGuzzleHttp\\PrepareBodyMiddleware->__invoke(Object, Array) (Line: 31)\nGuzzleHttp\\Middleware::GuzzleHttp\\{closure}(Object, Array) (Line: 71)\nGuzzleHttp\\RedirectMiddleware->__invoke(Object, Array) (Line: 66)\nGuzzleHttp\\Middleware::GuzzleHttp\\{closure}(Object, Array) (Line: 75)\nGuzzleHttp\\HandlerStack->__invoke(Object, Array) (Line: 333)\nGuzzleHttp\\Client->transfer(Object, Array) (Line: 169)\nGuzzleHttp\\Client->requestAsync('GET', Object, Array) (Line: 189)\nGuzzleHttp\\Client->request('GET', 'http://127.0.0.1:11343/api/tags', Array) (Line: 106)\nDrupal\\provider_ollama\\OllamaControlApi->makeRequest('api/tags', Array, 'GET') (Line: 49)\nDrupal\\provider_ollama\\OllamaControlApi->getModels() (Line: 62)\nDrupal\\provider_ollama\\Plugin\\AiProvider\\OllamaProvider->getConfiguredModels('chat')\nReflectionMethod->invokeArgs(Object, Array) (Line: 106)\nDrupal\\ai\\Plugin\\ProviderProxy->wrapperCall(Object, Array) (Line: 78)\nDrupal\\ai\\Plugin\\ProviderProxy->__call('getConfiguredModels', Array) (Line: 127)\nDrupal\\ai\\Service\\AiProviderFormHelper->generateAiProvidersForm(Array, Object, 'chat', 'chat_', 2, 1003) (Line: 159)\nDrupal\\ai_api_explorer\\Form\\ChatGenerationForm->buildForm(Array, Object)\ncall_user_func_array(Array, Array) (Line: 536)\nDrupal\\Core\\Form\\FormBuilder->retrieveForm('ai_api_chat_generation', Object) (Line: 375)\nDrupal\\Core\\Form\\FormBuilder->rebuildForm('ai_api_chat_generation', Object, Array) (Line: 633)\nDrupal\\Core\\Form\\FormBuilder->processForm('ai_api_chat_generation', Array, Object) (Line: 326)\nDrupal\\Core\\Form\\FormBuilder->buildForm(Object, Object) (Line: 73)\nDrupal\\Core\\Controller\\FormController->getContentResult(Object, Object)\ncall_user_func_array(Array, Array) (Line: 123)\nDrupal\\Core\\EventSubscriber\\EarlyRenderingControllerWrapperSubscriber->Drupal\\Core\\EventSubscriber\\{closure}() (Line: 638)\nDrupal\\Core\\Render\\Renderer->executeInRenderContext(Object, Object) (Line: 121)\nDrupal\\Core\\EventSubscriber\\EarlyRenderingControllerWrapperSubscriber->wrapControllerExecutionInRenderContext(Array, Array) (Line: 97)\nDrupal\\Core\\EventSubscriber\\EarlyRenderingControllerWrapperSubscriber->Drupal\\Core\\EventSubscriber\\{closure}() (Line: 181)\nSymfony\\Component\\HttpKernel\\HttpKernel->handleRaw(Object, 1) (Line: 76)\nSymfony\\Component\\HttpKernel\\HttpKernel->handle(Object, 1, 1) (Line: 53)\nDrupal\\Core\\StackMiddleware\\Session->handle(Object, 1, 1) (Line: 
48)\nDrupal\\Core\\StackMiddleware\\KernelPreHandle->handle(Object, 1, 1) (Line: 28)\nDrupal\\Core\\StackMiddleware\\ContentLength->handle(Object, 1, 1) (Line: 32)\nDrupal\\big_pipe\\StackMiddleware\\ContentLength->handle(Object, 1, 1) (Line: 106)\nDrupal\\page_cache\\StackMiddleware\\PageCache->pass(Object, 1, 1) (Line: 85)\nDrupal\\page_cache\\StackMiddleware\\PageCache->handle(Object, 1, 1) (Line: 48)\nDrupal\\Core\\StackMiddleware\\ReverseProxyMiddleware->handle(Object, 1, 1) (Line: 51)\nDrupal\\Core\\StackMiddleware\\NegotiationMiddleware->handle(Object, 1, 1) (Line: 36)\nDrupal\\Core\\StackMiddleware\\AjaxPageState->handle(Object, 1, 1) (Line: 51)\nDrupal\\Core\\StackMiddleware\\StackedHttpKernel->handle(Object, 1, 1) (Line: 741)\nDrupal\\Core\\DrupalKernel->handle(Object) (Line: 19)\n", name: "AjaxError", stack: "@https://drupal10.ddev.site/sites/default/files/js/js_jQzZ_qeL-aNiRqFwdB8MFdA5vskuvL7sZ7mgMrXNFuQ.js?scope=footer&delta=0&language=en&theme=claro&include=eJx9juEOgjAMhF9obs_gk5BSTpmOdXaF6NsLCiYY46-23-Wux5INdxsphU7HQsnzhxxSzNfqWBSbSGxxwkvYcZaUqFTsYDUyrP44h2qe4eU2Qh_-JDo4E0ktaVino26IuVmvpoKU-_Aejv8V_RmwxXpTwPcyQZcW2b7_elSmguNidm08NyUWhG15AmcicU0:180:2411\n@https://drupal10.ddev.site/sites/default/files/js/js_jQzZ_qeL-aNiRqFwdB8MFdA5vskuvL7sZ7mgMrXNFuQ.js?scope=footer&delta=0&language=en&theme=claro&include=eJx9juEOgjAMhF9obs_gk5BSTpmOdXaF6NsLCiYY46-23-Wux5INdxsphU7HQsnzhxxSzNfqWBSbSGxxwkvYcZaUqFTsYDUyrP44h2qe4eU2Qh_-JDo4E0ktaVino26IuVmvpoKU-_Aejv8V_RmwxXpTwPcyQZcW2b7_elSmguNidm08NyUWhG15AmcicU0:180:19740\n" }
    js_jQzZ_qeL-aNiRqFwdB8MFdA5vskuvL7sZ7mgMrXNFuQ.js:180:2411
    
  • 🇩🇰Denmark ressa Copenhagen

    The challenge is probably to get DDEV connected to an IP on the host machine, I think ...

  • 🇩🇪Germany marcus_johansson

    Yes, either you have to set up Ollama in a DDEV Docker container, or you can try this for the hostname:

    host.docker.internal

    It should work in most cases for connecting to the Docker host.

  • 🇩🇪Germany marcus_johansson

    Also make sure to start Ollama so it listens to the outside world in that case, since it only listens on localhost by default and your DDEV machine is not localhost.

    https://github.com/ollama/ollama/issues/703
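
    If Ollama was installed with the Linux installer and runs as a systemd service, the approach from the Ollama FAQ is to set OLLAMA_HOST on the service itself; a sketch of that (exact paths may differ per install):

    # Open an override file for the ollama service
    sudo systemctl edit ollama.service

    # In the editor, add:
    #   [Service]
    #   Environment="OLLAMA_HOST=0.0.0.0"

    # Reload systemd and restart Ollama so it listens on all interfaces
    sudo systemctl daemon-reload
    sudo systemctl restart ollama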

  • 🇩🇰Denmark ressa Copenhagen

    Thanks for fast answers @Marcus_Johansson!

    Do you have slightly more concrete suggestions? (like "Insert this in the Ollama config file /home/user/.ollama/config.env, add IP: host.docker.internal in the DDEV config file, and restart DDEV")

    My thinking is that, as soon as I get it working with DDEV (which is the officially recommended developer tool), I'll document it in the README and the AI documentation →. But rather than spending a lot of time experimenting with different solutions from the page you link to, more concrete, actionable suggestions would save me a lot of time, which I can spend better on documenting the use of the Drupal AI module and Ollama in DDEV.

  • 🇩🇪Germany marcus_johansson

    Ah, so something like this:

    1. Start Ollama with an environment parameter, something like OLLAMA_HOST="0.0.0.0" ollama serve
    2. When you define the hostname on the page /admin/config/ai/providers/ollama, it should be http://host.docker.internal. This is a special hostname that connects to the Docker parent host.

    That should work on Linux and 99% work on Mac. On Windows it would be two commands and some special Windows sauce to set up Ollama, so something like:

    set OLLAMA_HOST=0.0.0.0
    ollama serve

    ... and then approve the firewall prompt.

    If it doesn't work on Windows let me know and I'll test on my gaming computer.
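
    A quick way to check the DDEV side before touching the Drupal configuration (ddev exec is a standard DDEV command; the expected output assumes Ollama is already serving on the host as described above):

    # From the project directory, call the host's Ollama from inside the web container
    ddev exec curl -s http://host.docker.internal:11434
    # Expected output: Ollama is running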

  • 🇩🇪Germany marcus_johansson

    I might do a video about it, since with Gemma, it would actually be a showcase for something that you can host on a webserver or your own non-GPU laptop with 16 GB of RAM.
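
    For reference, trying Gemma through Ollama is just another pull/run pair; the exact model tag is an assumption and may differ (for example gemma:2b for the smaller variant):

    # Download Gemma into the local Ollama library
    ollama pull gemma

    # Quick smoke test from the command line
    ollama run gemma "Write a one-sentence summary of what Drupal is."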

  • 🇩🇰Denmark ressa Copenhagen

    Beautifully thorough description, thanks! I'll try it later today, and add it to the documentation after verifying. I am on Debian 12, but if you could verify that it works on Windows as well, that would be very nice.

    A video would be fantastic, and perhaps you could include an uncensored LLM such as dolphin-llama3, to cover that aspect of AI as well? Or do that model instead of Gemma (Google), if the steps are identical? Or do both? :)

  • 🇩🇪Germany marcus_johansson

    @ressa - check out here: https://youtu.be/LFFoGfYFMn4

  • 🇩🇪Germany marcus_johansson

    The steps should be identical - I used Gemma so it's a showcase that most modern laptops can run.

  • 🇩🇰Denmark ressa Copenhagen

    Thanks for the video, it's great!

    (It looks like the bottom of the screen is missing, where the commands are shown ... also, perhaps you could consider making the fonts slightly bigger in your videos? But let me emphasize that I very much appreciate your videos, they really are very good. The smallish font is only a minor blemish.)

    I tried to follow your tips, and am getting close, though I can't connect to the models ...

    Firewall

    In Debian 12, I opened port 11434 for Ollama in ufw, using Gufw:

    1. Open "Report"
    2. Select "ollama"
    3. Click "+" to create rule
    4. Select "Policy: Allow" and "Direction: Both"

    ... which created this rule:

    $ sudo iptables -S
    [...]
    -A ufw-user-input -p tcp -m tcp --dport 11434 -j ACCEPT
    -A ufw-user-output -p tcp -m tcp --dport 11434 -j ACCEPT
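
    As a side note, the same rule can be added without Gufw; since ufw allows outgoing traffic by default, opening the inbound side should be enough:

    # Allow incoming connections to Ollama's default port
    sudo ufw allow 11434/tcp

    # Confirm the rule is active
    sudo ufw status verbose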

    Ollama

    Stop Ollama and serve it with 0.0.0.0 as the IP; check beforehand with netstat (install net-tools):

    $ sudo netstat -tunlp | grep 11434
    tcp        0      0 127.0.0.1:11434         0.0.0.0:*               LISTEN      10728/ollama
    $ sudo systemctl stop ollama
    $ sudo netstat -tunlp | grep 11434
    

    The last command gives no result.

    Serve under 0.0.0.0, check with netstat, and check the Ollama IPs:

    $ OLLAMA_HOST=0.0.0.0 ollama serve
    $ sudo netstat -tunlp | grep 11434
    tcp6       0      0 :::11434                :::*                    LISTEN      11027/ollama
    $ curl http://127.0.0.1:11434
    Ollama is running
    $ curl http://0.0.0.0:11434
    Ollama is running
    

    DDEV and host.docker.internal

    Check inside DDEV:

    $ ddev ssh
    $ ping host.docker.internal
    PING host.docker.internal (172.17.0.1) 56(84) bytes of data.
    64 bytes from host.docker.internal (172.17.0.1): icmp_seq=1 ttl=64 time=0.123 ms
    [...]
    $ curl host.docker.internal:11434
    Ollama is running
    

    ... but no models are available:

    $ curl host.docker.internal:11434/api/tags
    {"models":[]}
    

    Ollama Authentication

    Add these values:

    Host Name: http://host.docker.internal
    Port: 11434

    Chat explorer

    When I select Ollama, there are no models in the dropdown ...

    PS. I spent a looong time trying to make it work with port 11343, since that is the port number in the placeholder :-) I'll create an MR to fix this.

  • Status changed to Needs review 5 months ago
  • 🇩🇰Denmark ressa Copenhagen
  • Pipeline finished with Success
    5 months ago
    Total: 150s
    #206064
  • 🇩🇰Denmark ressa Copenhagen

    I found the missing piece of the puzzle. Maybe there's a better way? Anyhow, I needed to also start Ollama, which I assumed ollama serve would take care of, but it looks like it doesn't ... I may absolutely be mistaken and be doing something wrong?

    I tried to write a list of the steps, and created AI > How to set up a provider → .

  • 🇩🇰Denmark ressa Copenhagen

    Update Issue Summary.

  • 🇩🇰Denmark ressa Copenhagen

    Add link.

  • Status changed to Fixed 5 months ago
  • 🇩🇪Germany marcus_johansson

    Merged, and thanks for your work!

    ollama serve should be the main command to make it start listening. You can run ollama without it, via ollama run, so it's a little bit strange.
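
    A minimal illustration of that difference (standard Ollama CLI; whether ollama run can start its own server depends on the platform and install):

    # Start the HTTP API listener that the Drupal provider talks to
    ollama serve

    # In another terminal: an interactive chat with a local model,
    # which goes through the same API
    ollama run llama3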

    I added this follow-up ticket as good-to-have: https://www.youtube.com/watch?v=LFFoGfYFMn4&ab_channel=DrupalAIVideos

  • 🇩🇰Denmark ressa Copenhagen

    Thanks!

    It appears (I am not sure, since the commands in the video are below the screen, see comment #22 📌 Ollama LLM Provider Active at the bottom) that in your video you pull the LLM, which also starts it ...

    If you didn't do that step, it might not be running, and therefore perhaps not available to DDEV?

    Being able to pull and delete models for Ollama in the web UI would be nice, thanks!

  • Automatically closed - issue fixed for 2 weeks with no activity.

  • 🇱🇹Lithuania mindaugasd

    A simple solution for

    how to make it easy enough and accessible to regular Drupal CMS users. How does one actually install it on the server? How much knowledge, investment and experience does it require to do it in the proper way?

    is ✨ WebGPU support Active.
