Add support for language translation, and non-English languages

Created on 6 June 2024, 5 months ago
Updated 26 June 2024, 5 months ago

Problem/Motivation

Would it be possible to add support for non-English languages, both for translation between languages and for text manipulation?

I see that for example DeepL, which is widely recognized as one of the best online translation services, offers a DeepL API.

Or perhaps translation is possible via one of the now 325,000 HuggingFace models with Inference Endpoints? I guess that if there are LLMs trained for a specific language, they could be used for text manipulation, and possibly also for translation?

I am sorry if I am a bit vague ... but I guess I am looking for two things, and these features might be solved by a single third-party service or HuggingFace model ...

  • Translating texts from one language to another
  • Support for text manipulation of non-English languages

Proposed resolutions

Add support for translating, or for text manipulation of, non-English languages, for example to be able to use the great AI Interpolator MediaWiki module with the Danish or Swedish APIs.

Add third-party service support for translation/language support OR document how to set up, for example, HuggingFace to add translation.

If this is possible with HuggingFace models that support Inference Endpoints, document how to do this.

Remaining tasks

User interface changes

API changes

Data model changes

Feature request
Status: Active
Version: 1.0
Component: Code
Created by: 🇩🇰Denmark ressa Copenhagen

Comments & Activities

  • Issue created by @ressa
  • 🇱🇹Lithuania mindaugasd

    Another option is to use a specialized translation module such as https://www.drupal.org/project/tmgmt, which integrates with many services,
    including OpenAI: https://www.drupal.org/project/tmgmt_openai
    Did you try the 'tmgmt' option, and if so, what was your experience?

    I also saw an article about Drupal AI translation: https://theaccidentalcoder.com/ai-translation-not-ready-prime-time but it only mentions ChatGPT rather than GPT-4o. Since you use open source models, you could choose a model and test whether the quality is satisfactory for the specific language and the goal you want to accomplish.

    Also @ressa, for your awareness, there is an interesting new AI project: https://www.drupal.org/project/ai

  • 🇱🇹Lithuania mindaugasd

    There are many other AI modules made for content translation, but all of them are ChatGPT-based.
    As far as I know, https://www.drupal.org/project/ai is meant to enable other service providers.

  • 🇩🇪Germany marcus_johansson

    As @mindaugasd writes, the AI Interpolator is for generating field content in the same language. If you do have a valid use case where you have a field, for instance Description, and then add another field called German Description, that would fit the idea of the AI Interpolator. In that case you can actually use an LLM like OpenAI to do this. I have done that on production websites.

    If you want to first generate all the fields and then have a translation of them, there are modules, as @mindaugasd points out, that have a much better workflow for this than the AI Interpolator. I used tmgmt and it's awesome.

  • 🇩🇪Germany marcus_johansson

    Oh, with that being said: if your workflow is the valid one I described and you actually want to use DeepL, I could look into it. It would be low priority for me, though, because it's an edge case.

    I will soon release a video on how to make AI Interpolator plugins, so anyone wanting to do this could jump in and help :) It is also documented in boring text documentation already :)

  • 🇱🇹Lithuania mindaugasd

    I used tmgmt and it's awesome.

    In that case, translation is a solved problem. And OpenAI's GPT-4o is likely to be quite good at translation (it is part of the tmgmt_openai module).

    But a next step could be to clone tmgmt_openai and create a tmgmt_ai or ai_tmgmt module, which would integrate with https://www.drupal.org/project/llm_provider and, as a bonus, make it possible to use any AI model for translation.

  • 🇩🇰Denmark ressa Copenhagen

    Thank you very much to both of you for the fast answers!

    I see that there is also https://www.drupal.org/project/tmgmt_deepl . It's great to hear that there are several options for AI translation in Drupal, and that most likely there are more on the way.

    I will soon release a video on how to make AI Interpolator plugins, so anyone wanting to do this could jump in and help :) It is also documented in boring text documentation already :)

    Sounds awesome!

    I should have included my current use case, which would have helped, sorry about that.

    It is not so much translation in this case, but text manipulation:

    1. Use AI Interpolator MediaWiki to get some text in Danish from Wikipedia pages via https://da.wikipedia.org/w/api.php
    2. Use AI Interpolator rule "HuggingFace Text generation" to manipulate the resulting Danish text

    If I understand it correctly, and I prefer to use HuggingFace for both steps (with a Pro Account at $9/month for stability), wouldn't a simple method be to find a HuggingFace model that supports Danish for step 2? So at 14:20 in the video "Use Huggingface models with Drupal in 10 minutes", I would instead add a Danish HuggingFace model with Inference Endpoint support?
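
The two steps above can be sketched outside Drupal to show what each service is asked for. This is a rough illustration only, not how the AI Interpolator modules work internally. It assumes that the Wikipedia API serves plain-text extracts through `prop=extracts`, and that the chosen HuggingFace model is reachable at the hosted Inference API URL shown in the code. The function names, the model ID and the token are hypothetical placeholders. The code only builds the requests and does not send them.

```python
import json
import urllib.parse
import urllib.request

WIKI_API = "https://da.wikipedia.org/w/api.php"  # Danish Wikipedia API from the thread


def build_wiki_extract_url(title: str) -> str:
    """Step 1: build a query URL asking for a plain-text extract of one page."""
    params = {
        "action": "query",
        "prop": "extracts",   # assumption: the extracts property is available
        "explaintext": "1",   # plain text instead of HTML
        "format": "json",
        "titles": title,
    }
    return WIKI_API + "?" + urllib.parse.urlencode(params)


def build_hf_request(model_id: str, token: str, prompt: str) -> urllib.request.Request:
    """Step 2: prepare (but do not send) a text-generation request."""
    # Assumption: the model is served by the hosted Inference API at this URL.
    url = "https://api-inference.huggingface.co/models/" + model_id
    body = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A Danish-capable model would be passed as `model_id`. Whether a given model actually works with the Inference API still has to be checked per model.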

  • 🇩🇪Germany marcus_johansson

    It would work for sure, if you also want to create Danish texts.

    If it's English you want, you could add a step in between, perhaps with this model: https://huggingface.co/kaitchup/Llama-2-7b-mt-Danish-to-English I don't know if it's any good, though, or if it works with the Inference API.

    We could, though, create a DeepL text-to-text integration, since it's free.

  • 🇩🇰Denmark ressa Copenhagen

    Thanks for the fast confirmation @Marcus_Johansson. It is Danish all the way, so no translation is needed. That model could be useful, so I'll try it and see if it works.

    I totally missed that DeepL API can be used for free ... an AI Interpolator DeepL text-to-text module would be fantastic!

    DeepL API Free

    • Access to all features
    • Access to the DeepL REST API
    • 500,000 character limit / month
    • 1,000 glossaries (for specific languages)

    From https://www.deepl.com/pro -> "DeepL Free"
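
For orientation, a text-to-text call against the free tier could look roughly like the sketch below. It is not the proposed plugin. It assumes the free-tier endpoint `https://api-free.deepl.com/v2/translate` and the `DeepL-Auth-Key` authorization scheme. The function names and the key are hypothetical placeholders. The quota helper uses Python's `len()`, which only approximates how DeepL counts characters. The request is built but not sent.

```python
import json
import urllib.request

FREE_MONTHLY_CHARS = 500_000  # monthly limit quoted above for DeepL API Free
DEEPL_FREE_URL = "https://api-free.deepl.com/v2/translate"  # assumed free-tier endpoint


def fits_free_quota(texts, used_this_month=0):
    """Rough check that a batch stays within the free monthly character limit."""
    return used_this_month + sum(len(t) for t in texts) <= FREE_MONTHLY_CHARS


def build_deepl_request(auth_key, texts, target_lang, source_lang=None):
    """Prepare (but do not send) a translation request for a list of texts."""
    payload = {"text": list(texts), "target_lang": target_lang}
    if source_lang:
        payload["source_lang"] = source_lang
    return urllib.request.Request(
        DEEPL_FREE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "DeepL-Auth-Key " + auth_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Checking the quota before building the request keeps a batch job from running into the monthly limit halfway through a set of fields.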

  • 🇩🇪Germany marcus_johansson

    Ressa, actually I might make the plugin video from this. Making new LLM engines will soon be very easy if we get the LLM abstraction layer working in the https://www.drupal.org/project/ai module. This is a completely different beast, as it will need to touch a lot of the different layers of the AI Interpolator.

    Will see if I have time during the weekend to live code that.

  • 🇱🇹Lithuania mindaugasd

    if we get the LLM abstraction layer working in the https://www.drupal.org/project/ai module

    Since work is moving to the AI module, I created a new issue there, "Create LLM abstraction layer", to lay out my requirements.

  • 🇩🇰Denmark ressa Copenhagen

    That sounds awesome @Marcus_Johansson, thanks! I really like live coding videos, since they give great insight into the thought processes, debugging techniques, and coding tools in Drupal. So it will be really useful as an introduction for new Drupal users.

    Thanks for creating that issue @mindaugasd, it's so great to see the AI collaboration in Drupal taking shape.
