Reduce redundant batch runner

Created on 16 September 2023, 9 months ago
Updated 25 January 2024, 5 months ago

Problem/Motivation

With the batch processor, I found the batch always runs twice.

Steps to reproduce

Create new translation job and run the translation with OpenAI translator.

🐛 Bug report
Status

Needs review

Version

1.0

Component

Code

Created by

🇹🇼Taiwan amourow

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @amourow
  • Merge request !5Fix redundant batch process → (Open) created by amourow
  • Status changed to Needs review 9 months ago
  • 🇹🇭Thailand AlfTheCat

    Hi @amourow, I tried to apply the patch but (and this is perhaps caused by me applying the diff patch of Make the translator able to translate longer HTML content Fixed ) my terminal reports that the patch is already applied.

    When I reinstall the module and try the patch it fails:

    sudo -u www-data patch -p1 < 5.diff
    patching file src/Plugin/tmgmt/Translator/OpenAiTranslator.php
    Hunk #1 FAILED at 351.
    1 out of 1 hunk FAILED -- saving rejects to file src/Plugin/tmgmt/Translator/OpenAiTranslator.php.rej
    patching file tmgmt_openai.install

    The batch process does indeed seem to run in a funny way. I attached a brief recording of it.

    I suppose it will be easier when we have the dev branch available in composer 💬 Can we make 1.0.x-dev available for composer Needs work , thanks for opening that issue.

  • 🇹🇼Taiwan amourow

    @AlfTheCat

    The screen recording is funny.

    Did you make one translation request or you did multiple at once?

    Now the 1.0.x-dev can be installed via composer, would you try only include the MR patch from this issue?

  • 🇹🇭Thailand AlfTheCat

    @aumoro yeah it sure is! I requested translations for a single language but for multiple menu links in that video. I've configured the number of jobs to run at cron to 3, in the openai provider settings. I think that's where the "1 of 3" comes from but what exactly is happening under the hood is hard to tell.

    I'll try this again in a different environment and using the dev branch now that it's available through composer, which is great news!

  • 🇹🇭Thailand AlfTheCat

    I now used the dev branch and then applied this patch, which worked this time. However the batch still seems to run strangely.

    It seem like, if I submit 10 nodes that have 7 translations and 5 translatable fields, the batch shows the progress of translating the fields. So it says 1/5 instead of 1/10 nodes. I think this is why it starts over again before the progress bar completes. I don't know if it's also double-submitting the field values.

    Hope this helps.

  • 🇹🇼Taiwan amourow

    The scope of batch process in this module is for one translation and one node.

    In your case, the batch progress shows you 1/5 is correct to me.
    If you have a long text field with long text in it, the batch process may have more steps, because it will split the text into chunks and call the OpenAI API in separated steps.

    TMGMT job has it's own batch process for multiple translations and nodes, I believe.
    Let's focus on one translation one node here.

  • 🇹🇭Thailand AlfTheCat

    OK thanks, that clarifies things. The batch running the way of my video is a bit counter-intuitive and confusing but I understand now we're not there yet :). I tested this now against the latest dev but I run into 🐛 Drupal\Core\Entity\EntityStorageException: Update existing 'tmgmt_job_item' entity while changing the ID is not supported Active , which also happens without applying the patch so that must be related to the dev version. I rolled back to RC2 and it works again.

  • 🇹🇭Thailand AlfTheCat

    Interesting observation, in my case actually the duplicate jobs are only created if I request translations via the main TMGMT dashboard at admin/tmgmt/sources, using RC2.

    If I request them from the node translation tab, no duplicate jobs are created.

  • 🇹🇭Thailand AlfTheCat

    I did a fresh install and using the latest dev, and the patch from this issue. The duplication no longer occurs when translating either single entities to a single language, or to multiple languages and multiple entities from the main TMGMT Sources UI. It works well from the "Translate" tab on entities too.

    The batch process is a bit confusing when translating multiple entities, and perhaps also a bit when translating a single entity because the progress bar will run all the way back to "Completed 1/*." after first completing all but the final iteration. When it runs again, the job is already marked as finished in the TMGMT Jobs overview, and temporarily in the Sources overview the entities show up as untranslated (not in progress or any other status) and translations do not yet appear until the process finishes the second run.

    I suppose the batch api has its limitations outside the scope of this issue, so the issue itself seems to be resolved :)

  • 🇹🇼Taiwan amourow

    @AlfTheCat, thanks for the updates, good to know it resolved the redundant batch. I finally have time to review the issues now.

    I hope someone can also have a look here and let the maintainer know this is RTBC.

  • 🇹🇭Thailand AlfTheCat

    Hi @amourow, awesome! Also looking forward to additional testers :)

  • 🇫🇷France NicolasGraph Strasbourg

    Hi @amourow, I think the queue worker and the typo fix in the .install should not be in the current merge request as it is not related to the issue. There was an interesting talk from @xjm on code review and good contributing practices in DrupalCon Lille; slides are available here : https://drive.google.com/file/d/1o6wIX75wXI_XE80NFj_QRJwkM578sFnS/view.
    I also suggest you take a look at the tmgmt_deepl module and the way the queue worker is handled as for now you do not create queue items at all as far as I can see. And the module also use the batch API.

  • 🇫🇷France NicolasGraph Strasbourg

    @amourow, I can't find batch processes running twice on 1.0.x-dev.
    Can you give it a try?

Production build 0.69.0 2024