Limit the number of nodes selected when running cron

Created on 7 September 2017, about 7 years ago
Updated 2 February 2024, 10 months ago

Problem/Motivation

When Scheduler has a large number of nodes to process, such as 200k, cron can run out of memory, depending on the server configuration. This occurs when Scheduler merges the results returned from the query and extracts the unique values.

// Allow other modules to add to the list of nodes to be published.
$nids = array_unique(array_merge($nids, _scheduler_scheduler_nid_list($action)));

Proposed resolution

Add a configuration option that limits the number of nodes processed on each cron run.
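
As a rough illustration of the proposal, the sketch below merges two ID lists and stops as soon as a limit is reached, instead of building the full merged array first. The function name `scheduler_limit_nids()` and its parameters are hypothetical and not part of Scheduler; a limit of 0 is assumed to mean "no limit". In a real implementation it would likely be better to apply the limit in the query itself (for example with the entity query `range()` method) so that the large result set is never loaded.

```php
<?php
/**
 * Hypothetical helper: returns at most $limit unique node IDs.
 *
 * IDs are collected as array keys, which de-duplicates them without
 * calling array_unique() on a fully merged list.
 */
function scheduler_limit_nids(array $nids, array $extra_nids, int $limit = 0): array {
  $unique = [];
  foreach ([$nids, $extra_nids] as $list) {
    foreach ($list as $nid) {
      // Assumption: a limit of 0 means "no limit".
      if ($limit > 0 && count($unique) >= $limit) {
        return array_keys($unique);
      }
      $unique[$nid] = TRUE;
    }
  }
  return array_keys($unique);
}
```

For example, `scheduler_limit_nids([1, 2, 3], [3, 4, 5], 4)` returns the first four unique IDs, and the remaining node would be picked up on a later cron run.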

User interface changes

Add a field to the Scheduler settings form to set this limit.

Feature request
Status

Needs work

Version

2.0

Component

Code

Created by

🇧🇷Brazil chgasparoto

Merge Requests

Comments & Activities

  • First commit to issue fork.
  • Merge request !1132907382: Add limit to cron processing → (Open) created by geoffreyr
  • Core: 10.2.x + Environment: PHP 8.1 & MySQL 8
    last update 10 months ago
    227 pass
  • 🇦🇺Australia geoffreyr

    I had a go at porting this to D9/10. I'm not sure we should still call the parameter max_nodes_per_cron; maybe it should be renamed to refer to entities in general. The publish and unpublish methods take the limit as a parameter, but it defaults to 0, so the original invocation should continue to work (albeit with no limit).
    Will iterate on this when I have the opportunity.

  • Pipeline finished with Success
    10 months ago
    Total: 580s
    #85747
  • 🇬🇧United Kingdom jonathan1055

    Thanks geoffreyr for opening the MR. Looks good so far, but I wonder if we can simplify it? The SchedulerManager publish() and unpublish() functions are never going to be called with varying values; the limit will always be the value set in the config options. So instead of adding a parameter, could you just get the setting at the start of the function?

    Also, we could introduce a default of, say, 1,000 (?) rather than 'everything', so that we automatically prevent problems if no limit has been set. Then the processing could start at $limit, $remaining_limit would get decremented, and the processing would stop when it is <= 0. Just an idea, I have not actually proved this is right, but it is worth a try.

    We will need test coverage too, but don't let that hinder your progress.
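
A minimal sketch of the countdown idea from the comment above, written as plain PHP rather than actual SchedulerManager code. The function `process_with_limit()` is hypothetical, the 1,000 default is only the figure floated in the comment, and collecting the ID stands in for the real publish or unpublish work.

```php
<?php
/**
 * Hypothetical illustration: process IDs until the remaining limit runs out.
 */
function process_with_limit(array $ids, ?int $configured_limit = NULL): array {
  // Assumption: fall back to 1,000 when no limit has been configured.
  $remaining_limit = $configured_limit ?? 1000;
  $processed = [];
  foreach ($ids as $id) {
    // Stop as soon as the budget for this cron run is used up.
    if ($remaining_limit <= 0) {
      break;
    }
    // Placeholder for the real publish/unpublish work.
    $processed[] = $id;
    $remaining_limit--;
  }
  return $processed;
}
```

With this shape, `process_with_limit(range(1, 5), 3)` handles only the first three IDs, and calling it without a limit caps the run at the assumed default.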

  • Core: 10.2.x + Environment: PHP 8.1 & MySQL 8
    last update 8 months ago
    227 pass