Add support for AI submodule "ai_translate" FieldTextExtractor plugin

Created on 12 June 2025, about 2 months ago

Problem/Motivation

The AI module's submodule "ai_translate" provides a content translation service through AI.
It provides a plugin called FieldTextExtractor, which allows it to extract textual data from different kinds of field types, in order to pass those to an AI service for translation.

Textual subfields provided by the custom_field module do not get translated through that mechanism, because there is no support for it in this module.

Steps to reproduce

- Install custom_field, ai & ai_translate modules
- Set up a custom field with textual data
- Try to translate an entity containing this field, via the AI translation mechanism

📌 Task
Status

Active

Version

3.1

Component

Code

Created by

🇧🇪 Belgium svendecabooter Gent

Merge Requests

Comments & Activities

  • Issue created by @svendecabooter
  • 🇧🇪 Belgium svendecabooter Gent

    Linking an issue in the AI module that complicates this functionality.

  • 🇺🇸 United States apmsooner

    I assume we would need to create our own plugin for field type 'custom' similar to these: https://git.drupalcode.org/project/ai/-/tree/1.2.x/modules/ai_translate/....

    We would have to loop over the subfields and check each subfield's translatable setting, max length, etc. This is all doable, but I don't know a lot about the AI module yet or how exactly this works, so if you want to take a stab at a patch, I'd be happy to review it.

  • 🇧🇪 Belgium svendecabooter Gent

    Thanks for the feedback. I'm working on a fix. Might take me a few days to push some code though...

  • 🇧🇪 Belgium svendecabooter Gent

    OK, I think this should be doable, but it would require the patch in ✨ Don't hardcode 'value' key for textual field translation Active to be committed first.
    Not sure how fast that can happen, so this extra plugin probably won't be committed into custom_field before that issue is resolved.
    I'll try to get the MR here working in combination with that issue, and hopefully keep it as simple as possible.

    Currently there is some logic that checks if a custom_field subfield / property is translatable or not.
    But ideally that should check if that subfield is textual or not. Is there some mechanism in place to do that?
    I guess just doing checks on CustomFieldType plugin instances? I.e. StringType & StringLongType?

  • 🇺🇸 United States apmsooner

    That's probably the simplest way. There are technically other subfield types, like LinkType, which has a title property, if you wanted to account for that, and more complex types (MapType, MapStringType), but the widgets for those are a bit more complex, so you may want to just keep it simple for now.

  • 🇺🇸 United States apmsooner

    Oh, and the ImageType has title/alt fields. You could also loop over the property definitions (skipping the computed ones) and account for the DataDefinition class of string. The only field that differs there is the StringLongType, as it has a custom class to make it work with jsonapi and such. You would still want to check that the subfield is translatable along with these checks. Otherwise the subfields won't be exposed and will inherit the default language value.
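
The selection rule described in this comment (translatable subfields only, computed properties skipped, string-typed properties kept) can be sketched roughly as below. This is a language-agnostic illustration in Python, not the plugin from the merge request: `PropertyDef`, `Subfield`, `TEXT_TYPES` and `text_properties` are hypothetical stand-ins, and the `"string_long"` type name is an assumption standing in for the custom class that StringLongType uses.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for property definitions; these are NOT the
# custom_field or ai_translate APIs.
@dataclass
class PropertyDef:
    name: str
    data_type: str          # e.g. "string", "uri", "integer"
    computed: bool = False


@dataclass
class Subfield:
    name: str
    translatable: bool
    properties: list = field(default_factory=list)


# Assumed set of textual data types ("string_long" is an illustrative name).
TEXT_TYPES = {"string", "string_long"}


def text_properties(subfields):
    """Return (subfield, property) name pairs that would go to translation."""
    selected = []
    for sub in subfields:
        if not sub.translatable:
            continue  # non-translatable subfields keep the default language value
        for prop in sub.properties:
            if prop.computed:
                continue  # computed properties are derived, not stored text
            if prop.data_type in TEXT_TYPES:
                selected.append((sub.name, prop.name))
    return selected


subfields = [
    Subfield("headline", True, [PropertyDef("value", "string")]),
    Subfield("link", True, [PropertyDef("uri", "uri"),
                            PropertyDef("title", "string")]),
    Subfield("internal_note", False, [PropertyDef("value", "string")]),
    Subfield("body", True, [PropertyDef("value", "string_long"),
                            PropertyDef("processed", "string", computed=True)]),
]
print(text_properties(subfields))
```

In this toy data the link's `uri` property and the non-translatable `internal_note` subfield are left out, while the link `title` is kept.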

  • Merge request !137 Support for AI "ai_translate" module → (Merged) created by svendecabooter
  • Pipeline finished with Success
    about 1 month ago
    Total: 444s
    #524182
  • Pipeline finished with Success
    about 1 month ago
    #524288
  • Pipeline finished with Success
    about 1 month ago
    Total: 292s
    #525341
  • 🇧🇪 Belgium svendecabooter Gent

    The logic in the MR now seems to work for most of the custom_field property types I have tested.
    Still WIP:

    • Link field is a challenge. The link itself gets translated currently, since it's a string. However source link https://google.com gets translated in my OpenAI instance to \nhttps://google.com\n, where the extra newline characters mess up the data. Ideally the link itself shouldn't go to translation...
    • Image alt / title texts do not get translated yet, because their property is not a string DataType, but rather an entity reference...
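
One possible guard against the link problem above, sketched in Python for illustration only: leave URL-like values untouched and trim stray whitespace from whatever the service returns. This is not the approach taken in the merge request (which later targets specific string properties instead); `looks_like_url`, `safe_translate` and `fake_translate` are hypothetical names, and `fake_translate` merely imitates a service that wraps its answer in newlines.

```python
from urllib.parse import urlparse


def looks_like_url(text):
    """True when the whole value parses as an http(s) URL with a host."""
    parts = urlparse(text.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def safe_translate(text, translate):
    """Skip URL-like values; strip surrounding whitespace from translations."""
    if looks_like_url(text):
        return text
    return translate(text).strip()


def fake_translate(text):
    # Hypothetical translator that pads its output with newlines.
    return "\n" + text.upper() + "\n"


print(repr(safe_translate("https://google.com", fake_translate)))
print(repr(safe_translate("Read more", fake_translate)))
```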
  • Pipeline finished with Success
    about 1 month ago
    Total: 427s
    #525401
  • 🇺🇸 United States apmsooner

    I'll pull your branch down and have a play with it. Perhaps we can approach this a different way, by targeting the specific string properties on the subfields. For example, link & image have specific properties that you'd want to target vs. the actual url or target id. I tried to depict that in the storage outline schema column here: https://www.drupal.org/docs/extending-drupal/contributed-modules/contrib... →

  • 🇺🇸 United States apmsooner

    Well I guess I need to figure out how all this works. There's a ton of configuration that makes zero sense to me. I was assuming there was just some button on the node labeled "Translate" or perhaps at the field level. This is a really confusing setup to say the least.

  • 🇺🇸 United States apmsooner

    Okay, I finally got the translation to work in basic form using OpenAI. I was trying with the Groq provider and that was apparently a mistake. I'll pull down your branch for this now and see if I can figure out the remaining issues.

  • 🇺🇸 United States apmsooner

    Just pushed a commit that simplifies things a bit and fixes the link translation issue. The image title/alt is not getting translated now; I assume something is going on in that loop on line 77, but I didn't have time to sort that out yet. You might be able to take it from here and finish it up based on the revisions I've made so far. If not, I can get back into it later.

  • 🇺🇸 United States apmsooner

    FYI, we can probably later account for entity_reference subfields in the same extractor, similar to the ReferenceFieldExtractor. Does the translation occur in the extract() or setValue() methods? I'm still unsure whether it's best to identify the values in extract(), or to pass the whole subfield and do it in setValue(). It would be better to handle it in setValue() IMO, but again... I just don't completely understand yet what's happening within these methods.

  • 🇺🇸 United States apmsooner

    This last commit seems to be making everything work correctly for me now. Feel free to test it out and see if your results are also good.

  • 🇺🇸 United States apmsooner
  • 🇧🇪 Belgium svendecabooter Gent

    Tested with the changes that got committed into the ai_translate module.
    Everything seems to work now!

  • Pipeline finished with Skipped
    about 1 month ago
    #532352
  • 🇺🇸 United States apmsooner
    • apmsooner → committed fd3667bc on 3.1.x
      Revert "Issue #3529794 by svendecabooter, apmsooner: Add support for AI...
  • 🇺🇸 United States apmsooner

    Reverted this issue in favor of https://www.drupal.org/project/custom_field/issues/3533534 📌 AI translate update extractor Active that will move the extractor into a new custom_field_ai submodule.