Is there a way to trigger a batch directly from code?

Created on 29 August 2024, 3 months ago
Updated 30 August 2024, 3 months ago

Love this module, great job! Custom batch ops are finally easy. I do have a couple of questions/feature requests though.

Triggering a batch directly from custom code

Maybe this is already possible, but I can't find anything about it in the docs. The only way to do it in code seems to be from the update or deploy hooks, which are scenarios I rarely need. Cron triggers are great, but a way to manually fire a batch off from my own code would be nice. The run() command used in the hooks seems to require parameters that may not be available or necessary.

Breaking a large task into multiple batches

The core Batch API supports breaking a single batch call into multiple batches. For example, when processing 1000 nodes, a single trigger of the batch operation can break the task into multiple batch operations run in succession, say 50 nodes at a time. That is more efficient than making each node a single batch operation. But as near as I can tell, that's what your module does. You load an array of IDs and the batch processes them one at a time, but I'm guessing each node is a distinct batch operation, which is fine for simplicity but not very efficient. Is this the case? Is there a way to process them in groups? Is it already doing that? If so, is there a way to control the number of items in each group?

✨ Feature request
Status

Fixed

Version

1.0

Component

Miscellaneous

Created by

🇺🇸 United States cdesautels


Comments & Activities

  • Issue created by @cdesautels
  • 🇺🇸 United States swirt Florida

    Hi cdesautels,
    Thank you for taking the time to raise these very good questions.

    Triggering from code

    Yes I need to make a good example of how to do this. It would look something like this:

    // Establish a sandbox to keep state across multiple runs.
    $sandbox = [];
    // Initiate the Finished state as not even started.
    $sandbox['#finished'] = 0;
    // If this is custom you likely want it to fail gracefully on errors.
    $allow_skip = TRUE;
    $script = \Drupal::classResolver('\Drupal\MY_MODULE_NAME\cbo_scripts\SCRIPT_NAME');
    do {
      $script->run($sandbox, 'MY CUSTOM NAME', $allow_skip);
    } while ($sandbox['#finished'] < 1);
    
    

    However, this is not a true batch, as it will not invoke the Batch API and its progress bar with repeated AJAX calls. That means it is subject to timeouts, since it is a single process that runs until it either finishes or times out. So I would shy away from this for anything processing more than a few hundred items. However, even if it did time out, the next run would pick up where it left off, so it might work for your specific use case.

    I have it on the roadmap (a feature request exists) to add an option for putting items into a queue, which would be better suited to processing large quantities from custom code.

  • 🇺🇸 United States swirt Florida

    Breaking a large task into multiple batches

    There are a couple of ways around this to try to make it more performant.

    A:
    You don't have to pass node IDs in the list of items from gatherItemsToProcess(). You could actually perform a node load multiple and pass all of the loaded nodes as the array to process. If, however, you had a large number of items to load, this would not be recommended, since you may not be able to load that many items into memory at one time.

    B:

    You could decide on a batch size and then do an array_slice on $sandbox['items_to_process'] inside of process() and loop through what you sliced. It requires a bit more active coding though, and it could get a little sketchy in terms of making sure each item in a batch gets logged correctly.

    C:
    Much of the wiring is in place for this but not all of it yet https://git.drupalcode.org/project/codit_batch_operations/-/blob/1.0.x/s...
    Wait for me to figure out the rest or help me figure out the rest (contribute). The main hangup is that even though I know the batch size, I cannot standardize the loading of the items, because each use case might be different: some are loading nodes, some terms, some something else. So even if I know how many to load per batch, I don't know how to load them in a way that would work as a load multiple. I likely need to extract the loading from the processing, but then things become harder to explain to people wanting to use it. So for now it defaults to loading and processing one item at a time. It may be a little slower, but it is more reliable and more explainable.
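
    Option B above could be sketched roughly as follows. This is plain PHP, not the module's actual code: process_chunk() and its parameters are hypothetical stand-ins for a process() step, and the only names taken from the thread are $sandbox['items_to_process'] and $sandbox['#finished']. The $sandbox['total'] key is an assumption added here to compute progress.

    ```php
    <?php
    // Hypothetical stand-in for a process() step; not the module's real code.
    // Each call takes up to $batch_size items off the front of the list.
    function process_chunk(array &$sandbox, int $batch_size, callable $process_one): void {
      // Remember the starting count once so progress can be computed later.
      $sandbox['total'] ??= count($sandbox['items_to_process']);
      // preserve_keys = TRUE keeps the original keys (for example entity IDs).
      $chunk = array_slice($sandbox['items_to_process'], 0, $batch_size, TRUE);
      foreach ($chunk as $key => $item) {
        $process_one($key, $item);
        // Drop each item once handled so a rerun resumes where it stopped.
        unset($sandbox['items_to_process'][$key]);
      }
      $remaining = count($sandbox['items_to_process']);
      $sandbox['#finished'] = $sandbox['total'] > 0
        ? ($sandbox['total'] - $remaining) / $sandbox['total']
        : 1;
    }

    // Usage: five made-up items, processed two at a time.
    $sandbox = [
      'items_to_process' => [11 => 'a', 12 => 'b', 13 => 'c', 14 => 'd', 15 => 'e'],
      '#finished' => 0,
    ];
    $processed = [];
    $calls = 0;
    do {
      process_chunk($sandbox, 2, function ($key, $item) use (&$processed) {
        $processed[] = $key;
      });
      $calls++;
    } while ($sandbox['#finished'] < 1);
    ```

    Recording each key as it is handled, as the callback does here, is one way to keep per-item logging intact even though the items are processed in groups.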

  • 🇺🇸 United States swirt Florida

    I am going to add a little code to basically wrap up the example code I gave you to make that easier to call with just 2 lines. I should have it for you tonight. :)

  • 🇺🇸 United States swirt Florida
    • swirt → committed 01c58426 on 1.0.x
      Issue #3471007 by swirt, cdesautels: Add method to run BatchOperation...
  • Status changed to Fixed 3 months ago
  • 🇺🇸 United States swirt Florida

    The ability to more cleanly call the BatchOperation from custom code has been added in release 1.0.4.

    Performing the following in your custom code should run it.

          $script = \Drupal::classResolver('\Drupal\MY_MODULE_NAME\cbo_scripts\SCRIPT_NAME');
          $script->runByCustomCode('CUSTOM EXECUTOR IDENTIFIER', $allow_skip = TRUE);
    
  • 🇺🇸 United States cdesautels

    Thanks, you really jumped on that. I tested your change in 1.0.4 and it works fine. Just for clarity though, this suffers from the same problem your more verbose code does? That is, it's not a true batch?

    As for the other question, I think another approach is to just return the data from gatherItemsToProcess() as a two-dimensional array and write the callback in processOne() to loop through each entity in each nested array, which processOne() sees as a single item. I suspect there'll still be issues with logging though.
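
    A minimal sketch of that two-dimensional idea, in plain PHP. The function names gather_grouped_items() and process_one_group() are hypothetical stand-ins for gatherItemsToProcess() and processOne(), the loader is a fake, and returning a message string per group is an assumption made here to address the logging concern. On a real site, a bulk loader such as Node::loadMultiple() could play the loader's role for nodes.

    ```php
    <?php
    // Hypothetical stand-in for gatherItemsToProcess(): returns groups of IDs.
    function gather_grouped_items(array $ids, int $group_size): array {
      // array_chunk() turns a flat list into a list of lists.
      return array_chunk($ids, $group_size);
    }

    // Hypothetical stand-in for processOne(): one "item" is a whole group.
    function process_one_group(array $group, callable $load_multiple): string {
      // One bulk load per group instead of one load per entity.
      $entities = $load_multiple($group);
      $done = [];
      foreach ($entities as $id => $entity) {
        // Real work on $entity would go here.
        $done[] = $id;
      }
      // List every ID in the message so individual items stay traceable.
      return 'Processed IDs: ' . implode(', ', $done);
    }

    // Usage with a fake loader that maps each ID to a placeholder string.
    $fake_loader = fn(array $ids) => array_combine(
      $ids,
      array_map(fn($id) => "entity $id", $ids)
    );
    $groups = gather_grouped_items([1, 2, 3, 4, 5], 2);
    $messages = [];
    foreach ($groups as $group) {
      $messages[] = process_one_group($group, $fake_loader);
    }
    ```

    The trade-off is that progress and skip handling then operate per group rather than per entity, so a failure in one entity may need extra handling inside the loop.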

  • 🇺🇸 United States swirt Florida

    Yes, this new addition suffers from the same problem as the verbose code, because it is the same thing. It is not a true batch, so it can time out if the batch or the process is too big or too intensive.

    For the second one... yah I think I am getting closer to a way to pull off sub-batches, so that loading multiple can make it a bit more performant. Thank you for getting me contemplating that some more.

  • Status changed to Fixed 3 months ago
  • 🇺🇸 United States swirt Florida

    I am going to close this issue. If you have other feature requests or questions, feel free to open new issues.
