Add text extractor plugins for image and link field types

Created on 23 June 2025, 29 days ago

Problem/Motivation

The text extractor plugin operates on string field types (string, string_long, etc.) and assumes that translatable text is always in the "value" column.

This is not the case for composite field types such as image (which has translatable "alt" and "title" columns) and link (which has a translatable "title" column).

Proposed resolution

Add text extractor plugins for image and link field types
Check core field types to find other potentially translatable field types.
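
As a rough illustration of the problem, the sketch below maps field types to the columns that hold translatable text. The image and link columns come from the summary above; the constant, the function name and the plain-array item format are assumptions made for this example and are not taken from ai_translate.

```php
<?php

declare(strict_types=1);

// Hypothetical lookup table: field type => columns holding translatable text.
// Only the image and link columns are stated in the issue summary.
const TRANSLATABLE_COLUMNS = [
  'string' => ['value'],
  'string_long' => ['value'],
  'image' => ['alt', 'title'],
  'link' => ['title'],
];

/**
 * Returns the translatable text of one field item, keyed by column.
 *
 * @param string $fieldType
 *   Field type ID, e.g. "image".
 * @param array<string, mixed> $item
 *   Raw column values of a single field item (an assumed format).
 */
function extract_translatable_text(string $fieldType, array $item): array {
  $texts = [];
  foreach (TRANSLATABLE_COLUMNS[$fieldType] ?? [] as $column) {
    // Skip columns that are missing or empty: there is nothing to translate.
    if (isset($item[$column]) && is_string($item[$column]) && $item[$column] !== '') {
      $texts[$column] = $item[$column];
    }
  }
  return $texts;
}
```

An extractor that only looks at "value" would return nothing for an image item, while this lookup picks up its "alt" text and ignores non-text columns such as the target ID.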

✨ Feature request
Status

Active

Version

1.2

Component

AI Translate

Created by

🇧🇬 Bulgaria valthebald Sofia

Merge Requests

Comments & Activities

  • Issue created by @valthebald
  • 🇧🇪 Belgium svendecabooter Gent

    Seems related to https://www.drupal.org/project/ai/issues/3529669 ✨ Handle link field translation as well Active

  • 🇺🇸 United States apmsooner

    The 'file' field type is probably worth mentioning too, since it has a description property.

  • 🇺🇸 United States apmsooner
  • 🇧🇬 Bulgaria valthebald Sofia

    @svendecabooter definitely related! Thanks for pointing that out.

  • Merge request !703 Support additional fields. → (Merged) created by apmsooner
  • 🇺🇸 United States apmsooner
  • Pipeline finished with Failed
    25 days ago
    Total: 464s
    #533358
  • Pipeline finished with Success
    25 days ago
    Total: 295s
    #533369
  • 🇺🇸 United States apmsooner

    The link->title & text_with_summary->summary field types are now being translated with this patch. For some reason the file and image field types are failing with this error:

    InvalidArgumentException: Invalid translation language (und) specified. in Drupal\Core\Entity\ContentEntityBase->getTranslation() (line 903 of /var/www/html/web/core/lib/Drupal/Core/Entity/ContentEntityBase.php).
    

    Anyone that can help test and find a solution would be appreciated.

  • 🇺🇸 United States apmsooner
  • 🇺🇸 United States apmsooner

    Noting here that the patch in https://www.drupal.org/project/drupal/issues/3386915 🐛 InvalidArgumentException: Invalid translation language (en) specified Active makes file & image fields work.

  • 🇺🇸 United States apmsooner
  • 🇺🇸 United States apmsooner

    Everything should work in this branch as far as AI Translate's role goes. The one caveat is file/image properties, which fail because of a core bug: https://www.drupal.org/project/drupal/issues/3386915#comment-15228372 🐛 InvalidArgumentException: Invalid translation language (en) specified Active. I don't know whether we would want to mention that the patch is needed to get those field types working, or put this on hold.

  • 🇧🇪 Belgium svendecabooter Gent

    Tried testing the MR, but the changes to modules/ai_translate/src/FieldTextExtractorInterface.php break contrib FieldTextExtractor plugin implementations, such as those in the "ai_translate_lb_asymmetric" and "custom_field" modules that I'm using.
    Is there no way to make this work without breaking backwards compatibility?

  • 🇺🇸 United States apmsooner

    I can easily patch custom_field to work with the changes, and I'll take a look at the Layout Builder issue, though I had already addressed that one. The idea with the MR was to instantiate the plugin with the entity object as configuration instead of having to pass it to all these functions, where it's not even necessary in most cases. Generally, with this MR, any field type can be supported just by listing its known translatable properties in the getColumns() method, and that's it. The rest of the logic should just work from the base class, so we no longer have to duplicate code everywhere.
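
For anyone trying to picture the approach described above, here is a minimal, self-contained sketch. Only the method name getColumns() comes from this thread; the class names, the signatures of extract() and setValue(), and the plain-array stand-in for an entity are assumptions made for illustration and will differ from the real MR.

```php
<?php

declare(strict_types=1);

// Illustrative only: these are NOT the ai_translate classes or signatures.
abstract class SketchExtractorBase {

  /**
   * @param array<string, array<int, array<string, mixed>>> $entity
   *   Stand-in for an entity: field name => delta => column values.
   */
  public function __construct(protected array $entity) {}

  /** Columns of this field type that hold translatable text. */
  abstract public function getColumns(): array;

  /** Collects translatable text per delta and column for one field. */
  public function extract(string $fieldName): array {
    $result = [];
    foreach ($this->entity[$fieldName] ?? [] as $delta => $item) {
      foreach ($this->getColumns() as $column) {
        if (($item[$column] ?? '') !== '') {
          $result[$delta][$column] = $item[$column];
        }
      }
    }
    return $result;
  }

  /** Writes translated text back, leaving all other columns untouched. */
  public function setValue(string $fieldName, array $translated): array {
    foreach ($translated as $delta => $columns) {
      foreach ($columns as $column => $text) {
        if (isset($this->entity[$fieldName][$delta]) && in_array($column, $this->getColumns(), true)) {
          $this->entity[$fieldName][$delta][$column] = $text;
        }
      }
    }
    return $this->entity;
  }

}

// Supporting a new field type then only means naming its columns.
final class SketchLinkExtractor extends SketchExtractorBase {

  public function getColumns(): array {
    return ['title'];
  }

}
```

In this sketch the subclass carries no logic of its own, which is the duplication-saving argument made in the comment above.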

  • Pipeline finished with Failed
    21 days ago
    Total: 186s
    #536560
  • Pipeline finished with Success
    21 days ago
    Total: 342s
    #536573
  • 🇺🇸 United States apmsooner

    @svendecabooter,

    I created a patch for custom_field related to this issue. I didn't catch the other contrib module, ai_translate_lb_asymmetric... I thought you were originally just talking about the LbFieldExtractor, which works now. That contrib module would now need to be updated too, I guess, but I don't particularly understand why it's a separate module. Can we not solve for it in ai_translate? There are too many moving parts around all these fields, and I've finally got everything working as it should after some refactoring. The reference extractor, for example, had some hardcoded field types to translate, which would never account for custom fields, image, file, etc., so I fixed that.

  • 🇺🇸 United States apmsooner
  • 🇺🇸 United States apmsooner

    Re-rolled the core patch for 10.5.x here: #25 https://www.drupal.org/project/drupal/issues/3386915#comment-16174131 🐛 InvalidArgumentException: Invalid translation language (en) specified Active, to get file/image fields working.

  • 🇺🇸 United States apmsooner
  • 🇺🇸 United States apmsooner
  • 🇧🇪 Belgium svendecabooter Gent

    @apmsooner:

    Yeah, I reckoned the module would need to be updated. I was just checking whether there might be ways to keep backwards compatibility that I hadn't thought about. If this refactor goes into the 1.2.x branch, I guess modules that extend this functionality would need to create a separate branch for AI 1.2.x compatibility. There is also https://www.drupal.org/project/ai_translate_paragraph_asymetric → that provides a similar plugin.

    The custom FieldTextExtractor plugins for the asymmetric modules could probably also be included in the ai_translate module, rather than in a separate module, but then the FieldTextExtractor plugin might need some extra logic to decide whether the plugin needs to kick in or not. E.g.:
    - for asymmetric LB translation --> check if layout_builder_at module is enabled
    - for asymmetric Paragraphs translation --> check if paragraphs_asymmetric_translation_widgets module is enabled
    Or could that logic go into shouldExtract()? Guess not, since that's more about logic at field level than general condition checking...
    Currently this logic is enforced by those separate ai_translate_* modules having the dependencies set in their info.yml file. If they got included in ai_translate itself, they might be activated where they shouldn't be (i.e. if the translation logic is symmetric instead).
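
A minimal sketch of the "check whether the module is enabled" idea from this comment. The two module names are the ones listed above; the constant, the function name and the callable parameter are hypothetical. In Drupal the callable could presumably wrap the module handler's moduleExists() check, but it is passed in here so the example runs on its own.

```php
<?php

declare(strict_types=1);

// Feature => module whose presence signals asymmetric translation.
// The array keys are labels invented for this sketch.
const ASYMMETRIC_PROVIDERS = [
  'layout_builder' => 'layout_builder_at',
  'paragraphs' => 'paragraphs_asymmetric_translation_widgets',
];

/**
 * Picks a translation strategy for a feature.
 *
 * @param string $feature
 *   One of the keys of ASYMMETRIC_PROVIDERS.
 * @param callable $moduleExists
 *   Takes a module name and returns TRUE if that module is enabled.
 */
function pick_translation_strategy(string $feature, callable $moduleExists): string {
  $provider = ASYMMETRIC_PROVIDERS[$feature] ?? null;
  return ($provider !== null && $moduleExists($provider)) ? 'asymmetric' : 'symmetric';
}

// Usage with a fake list of enabled modules.
$enabled = ['layout_builder_at'];
$moduleExists = fn(string $name): bool => in_array($name, $enabled, true);

$lbStrategy = pick_translation_strategy('layout_builder', $moduleExists);
$paragraphStrategy = pick_translation_strategy('paragraphs', $moduleExists);
```

With a check like this inside a single plugin, the symmetric path stays the default and the asymmetric path only applies when the corresponding module is present, which addresses the concern about plugins activating where they shouldn't.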

  • 🇺🇸 United States apmsooner

    @svendecabooter,

    The custom FieldTextExtractor plugins for the asymmetric modules could probably also be included in the ai_translate module, rather than in a separate module, but then the FieldTextExtractor plugin might need some extra logic to decide whether the plugin needs to kick in or not.

    Maybe this is handled as a third-party setting, or simply in the AI config settings. I think the existing LB plugin could then first check for that setting and call alternative functions or something?

  • 🇺🇸 United States apmsooner
  • 🇺🇸 United States apmsooner
  • Pipeline finished with Success
    18 days ago
    Total: 183s
    #539130
  • 🇧🇪 Belgium svendecabooter Gent

    Thanks for your work on this refactor / improvement @apmsooner.
    I've been testing it out, and it works very well.

    Tested with simple fields on a node, with a custom_field setup (with mentioned custom_field branch checkout), and a complex entity reference setup multiple levels deep.

    I have added an additional commit for ReferenceFieldExtractor, that implements the logic I described in the MR comments.
    It removes the need for translatable_properties annotations, since it infers whether properties are translatable from the relevant (recursive) FieldTextExtractor plugin that gets called for that field.

    Tested the same scenarios as described above after this change, and functionality kept working.
    Let me know if you think this isn't the correct approach; the commit can then be reverted in the MR, I guess.

  • Pipeline finished with Failed
    18 days ago
    #539184
  • 🇧🇪 Belgium svendecabooter Gent

    Based on feedback above, I have also incorporated the logic of the ai_translate_lb_asymmetric module into the LbFieldExtractor.
    This provides a single FieldTextExtractor for the Layout Builder field type, reducing conflicts between modules / plugins.
    Based on the selected Layout Builder strategy for a given installation, it takes care of the proper AI translation.

    This eliminates the use of the separate ai_translate_lb_asymmetric module. When this improvement gets committed, it seems fair to credit author(s) of that module as well.

  • 🇧🇪 Belgium svendecabooter Gent

    FYI: I have not tested the latest version of the MR yet with a regular Layout Builder setup (without layout_builder_at module installed).

  • 🇺🇸 United States apmsooner

    @svendecabooter,

    I pulled in your changes and I'm getting errors for all the extractors that get called from the ReferenceFieldExtractor. They are all similar to the following.

    Warning: Undefined array key 0 in Drupal\ai_translate\Plugin\FieldTextExtractor\TextFieldExtractor->setValue() (line 43 of /var/www/html/web/modules/contrib/ai-3531717/modules/ai_translate/src/Plugin/FieldTextExtractor/TextFieldExtractor.php)
    #0 /var/www/html/web/core/includes/bootstrap.inc(166): _drupal_error_handler_real(2, 'Undefined array...', '/var/www/html/w...', 43)
    #1 /var/www/html/web/modules/contrib/ai-3531717/modules/ai_translate/src/Plugin/FieldTextExtractor/TextFieldExtractor.php(43): _drupal_error_handler(2, 'Undefined array...', '/var/www/html/w...', 43)
    #2 /var/www/html/web/modules/contrib/ai-3531717/modules/ai_translate/src/Plugin/FieldTextExtractor/ReferenceFieldExtractor.php(220): Drupal\ai_translate\Plugin\FieldTextExtractor\TextFieldExtractor->setValue('title', Array)
  • 🇺🇸 United States apmsooner

    I tried messing with it to get past the errors, but it seems like it's no longer even creating the translation for the referenced entity at all. It seems you may have stripped out a lot of logic that is needed, so unless you can revise it to get it working, I'd say revert.

  • 🇧🇬 Bulgaria valthebald Sofia

    I feel that we're somewhat stuck. Would it make things easier to split this issue into multiple issues (by field type)?

  • 🇺🇸 United States apmsooner

    No, it was working fine prior to the commit mentioned in #24. Waiting for @svendecabooter to either revert or come up with an alternative solution. Personally, I think where I left it was adequate, and he can build the additional logic for the asymmetric LB module on top of that, but I'm giving him time to address it first.

  • 🇧🇬 Bulgaria valthebald Sofia

    @apmsooner: good, thanks for the clarification!

  • Pipeline finished with Failed
    15 days ago
    Total: 208s
    #540730
  • Pipeline finished with Failed
    15 days ago
    Total: 257s
    #540742
  • 🇧🇪 Belgium svendecabooter Gent

    @apmsooner

    I have tested with the following setup:

    - Drupal 10.5.1
    - AI module with this MR branch checked out
    - Core patch applied: https://www.drupal.org/project/drupal/issues/3386915#comment-16174131 🐛 InvalidArgumentException: Invalid translation language (en) specified Active
    - Configured core content_translation and language modules.
    - Enabled AI Translate module + set up "Entity reference translation" settings to include Content block / File / Content / Paragraph / Taxonomy term - max reference depth set to 5.
    - Created content type with a bunch of fields (plain text, body, image, term reference) + Paragraphs module entity reference (paragraph had same type of fields added: plain text, body, image, term reference).
    - Created a 2nd content type with a bunch of fields (plain text, body, image, term reference) + enabled Layout Builder. Created a block type with plain text field, body field, image field, term reference field

    I encountered 2 issues:
    - "Warning: Undefined array key 0" PHP warning - although functionality kept working. Fixed now
    - Symmetric layout builder translation didn't work fully yet (since it was untested as stated above). Added fix for that as well now.

    I'm unsure how to reproduce your behaviour that translation doesn't work at all.
    Maybe there is some specific configuration that is interfering, since it requires a bit of setup to get all pieces in place...
    Perhaps you can share a config export of your installation via Slack, so I can try to reproduce the unwanted behavior.

    I have tested now with 2 different setups (D10.5 with Paragraphs & symmetric LB + D11.2 with asymmetric LB + custom_field) and can get everything translated as expected...

  • 🇺🇸 United States apmsooner

    Your changes are making it work for me now. I did push a change to the custom_field patch to prevent the same warning for undefined array key 0. My setup is a basic page with an entity reference field to an article node. The article node has file, image and custom fields. All of them get translated when the basic page gets translated; the only weird thing is that the custom field has image and file subfields. The image alt text gets translated, but the target ID value doesn't get set for some reason when the translation is driven by the reference extractor. When I translate the article directly, the custom field image and file target IDs get set like they should. I can't remember whether they were getting set before your update. It seems the original value isn't getting copied over in that case for some reason. Perhaps you can try this same setup on your end with the updated custom_field patch?

  • 🇺🇸 United States apmsooner

    I figured out the issue I was having and pushed a commit. The original value wasn't being set on the translated entity before it was passed to the extractor. Everything now seems to be working for me with that fix!
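
To illustrate why the missing original value matters, here is a hedged sketch using plain arrays. The helper name and item format are hypothetical, and this is not the code from the commit mentioned above. It only shows the general idea of starting the translated item from the source item, so that non-text columns such as the target ID carry over.

```php
<?php

declare(strict_types=1);

/**
 * Builds a translated field item from the source item.
 *
 * Hypothetical helper for illustration; not taken from the merge request.
 *
 * @param array<string, mixed> $sourceItem
 *   Column values of the original-language item.
 * @param array<string, string> $translatedText
 *   Translated text keyed by column.
 */
function build_translated_item(array $sourceItem, array $translatedText): array {
  // Start from the source so columns like target_id survive, then overlay
  // translated text only for columns that already exist on the source item.
  return array_replace($sourceItem, array_intersect_key($translatedText, $sourceItem));
}

$source = ['target_id' => 12, 'alt' => 'A cat'];
$translated = build_translated_item($source, ['alt' => 'Een kat']);
```

If the translated item were built from the translated text alone, it would contain only the alt text and no target ID, which matches the symptom described in the previous comment.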

  • Pipeline finished with Failed
    15 days ago
    Total: 364s
    #540791
  • 🇺🇸 United States apmsooner

    @valthebald,

    Everything is working properly for me after the recent changes from @svendecabooter, so I'm putting this back to needs review for others. We probably need to document the need for the core patch somewhere if properties on file/image fields should be translatable, but otherwise I'm having consistent success with this patch.

  • Pipeline finished with Success
    15 days ago
    Total: 237s
    #540805
  • 🇺🇸 United States apmsooner

    Add extra test step for custom_field to enable sub-module custom_field_ai.

  • 🇧🇬 Bulgaria valthebald Sofia

    @apmsooner: I am back from vacation, and will review this MR ASAP

  • 🇧🇬 Bulgaria valthebald Sofia

    Copying from Slack:
    Having a base class for text extractors sounds good (and it's really easy to add a new plugin!)
    One thing I didn't understand is why change FieldTextExtractorInterface?
    i.e. passing an entity to the extractor constructor instead of to the setValue(), extract(), etc. calls means that we need an extractor instance per field per entity, whereas in the current state it's one extractor instance per field type.
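
The trade-off raised here can be shown with two toy classes. Both are hypothetical and neither matches FieldTextExtractorInterface; entities are plain arrays so the example is self-contained.

```php
<?php

declare(strict_types=1);

/** Variant A: the entity is passed per call, so one instance serves many entities. */
final class StatelessTitleExtractor {

  public function extract(array $entity, string $fieldName): array {
    return array_column($entity[$fieldName] ?? [], 'title');
  }

}

/** Variant B: the entity is bound in the constructor, so each entity needs its own instance. */
final class BoundTitleExtractor {

  public function __construct(private array $entity) {}

  public function extract(string $fieldName): array {
    return array_column($this->entity[$fieldName] ?? [], 'title');
  }

}

$entities = [
  ['field_link' => [['title' => 'Home']]],
  ['field_link' => [['title' => 'About'], ['title' => 'Contact']]],
];

// Variant A: a single shared instance.
$shared = new StatelessTitleExtractor();
$fromShared = array_map(
  fn(array $entity): array => $shared->extract($entity, 'field_link'),
  $entities
);

// Variant B: a new instance for every entity.
$fromBound = array_map(
  fn(array $entity): array => (new BoundTitleExtractor($entity))->extract('field_link'),
  $entities
);
```

Both variants return the same data. The difference is only in how many instances get created and in whether callers have to pass the entity along on every call.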

  • Pipeline finished with Failed
    7 days ago
    Total: 211s
    #548078
  • Pipeline finished with Failed
    7 days ago
    Total: 341s
    #548107
  • Pipeline finished with Success
    7 days ago
    Total: 256s
    #548119
  • 🇧🇬 Bulgaria valthebald Sofia

    Committed to 1.2.x only, because of the new method getColumns() in FieldTextExtractorInterface.
    Thanks everyone!

  • 🇧🇪 Belgium svendecabooter Gent

    @valthebald
    Can you also credit user "arwillame", because a chunk of code for Layout Builder translation support got taken from https://www.drupal.org/project/ai_translate_lb_asymmetric → ?

  • 🇧🇪 Belgium arwillame Belgium 🇧🇪
  • 🇧🇬 Bulgaria valthebald Sofia

    @svendecabooter: done! Thanks for pointing this out

  • 🇧🇪 Belgium svendecabooter Gent

    Do we also need to document that file / image extraction will fail without the core patch 🐛 InvalidArgumentException: Invalid translation language (en) specified Active ?

  • 🇧🇬 Bulgaria valthebald Sofia

    @svendecabooter I haven't experienced this, will check

  • 🇺🇸 United States apmsooner

    @valthebald,

    Add an image or file field to your entity and make it translatable. Without the patch, you should get an error during translation extraction. Unfortunately, there's a test failure in the patch that is blocking it from getting merged, so yes, we should probably document it somewhere, just so users don't think there's something wrong with ai_translate.
