"Your queue exceeded 100 000 items! Purge shut down"

Created on 30 April 2020, over 4 years ago
Updated 30 August 2024, 2 months ago

Hi,

I have the latest Drupal 8, with Purge module & Cloudflare module.

Suddenly I get this error. I tried everything I could find online to fix it, but with no result:

Purge: Queue size
157998
Your queue exceeded 100 000 items! This volume is extremely high and not sustainable at all, so Purge has shut down cache invalidation to prevent your servers from actually crashing. This can happen when no processors are clearing your queue, or when queueing outpaces processing. Please first solve the structural nature of the issue by adding processing power or reducing your queue loads. Empty the queue to unblock your system.

How can I fix this, please?

Thanks

💬 Support request
Status

Needs review

Version

3.0

Component

Code

Created by

🇧🇪 Belgium Ananda

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇨🇿 Czech Republic kyberman Czech Rep. 🇨🇿

    Hi everybody!

    For a very specific use case (e.g. a lot of nodes being updated at once), shutting down queue processing can cause more trouble than leaving it running and letting it work through the queue at least partially. In my case, the Cloudflare API allows purging 30 items per request, with up to 2000 requests per day, i.e. 30 × 2000 = 60 000 items daily. If the queue quickly grows past 100 000 items, purging stops immediately, which means around 60 000 items that could have been processed before the Cloudflare API limit is exhausted are left unpurged.

    The idea of this patch is to never stop queue processing; instead, an error is logged once the queue grows past 30 000 items, so there is time to notice and fix the underlying issue. Could you please review and comment on this?

    Alternatively, this could be a setting/config/state/hook that overrides the default limit of 100 000 items.
    Another idea is to enqueue an item only if it is not already in the queue. Any thoughts?

    Thank you
    Vit
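
The two ideas in this comment (keep processing while logging an error past a soft threshold, and enqueue an item only if it is not already queued) can be sketched in a few lines of Python. This is an illustration only, not Purge's code and not the patch itself: the `DedupQueue` class, its method names, and the constructor defaults are assumptions made for the example.

```python
import logging
from collections import OrderedDict

logger = logging.getLogger("purge_sketch")


class DedupQueue:
    """Toy queue: skips duplicates and logs an error instead of shutting down."""

    def __init__(self, soft_limit=30_000):
        self.soft_limit = soft_limit
        # Insertion-ordered mapping used as an ordered set of pending tags.
        self._pending = OrderedDict()

    def add(self, tag):
        """Queue a tag unless it is already pending; return True if it was added."""
        if tag in self._pending:
            return False
        self._pending[tag] = True
        if len(self._pending) > self.soft_limit:
            # Log loudly, but keep accepting and processing items.
            logger.error("Queue holds %d items (soft limit %d)",
                         len(self._pending), self.soft_limit)
        return True

    def claim(self, batch_size=30):
        """Remove and return up to batch_size tags, oldest first."""
        batch = []
        while self._pending and len(batch) < batch_size:
            tag, _ = self._pending.popitem(last=False)
            batch.append(tag)
        return batch

    def __len__(self):
        return len(self._pending)
```

With deduplication, a tag invalidated thousands of times during one bulk update occupies a single slot, so the queue size tracks the number of distinct tags rather than the number of invalidation events.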

  • Status changed to Active over 1 year ago
  • achap 🇦🇺

    Just want to chime in and say this issue has affected me too when running migrations. We often run migrations that can take a few hours or more. I stepped through the code and think I discovered what's happening. In Drupal\purge\Plugin\Purge\Queue\QueueService::add, invalidation tags are not added to the queue straight away but rather to an internal buffer. Then, at the end of the request (for example, a long-running cron job or Drush script), the items in the buffer appear to be finally committed to the queue in Drupal\purge\Plugin\Purge\Queue\QueueService::destruct.

    The problem is that, for the whole time the migration is running, none of the invalidation tags it generates can be processed by any of the purge processors. They are all dumped at once at the end of the migration, which usually pushes the queue over the 100k limit.

    Not sure what the fix is but that seems to be the root cause of the issue at least for us.
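
The behaviour described in this comment can be modelled with a toy buffer. The sketch below is not the real `QueueService`; the `BufferedQueue` class and its `flush_every` knob are hypothetical and exist only to contrast "commit once at the end" with "commit periodically".

```python
class BufferedQueue:
    """Toy model: add() only fills a buffer; commit() makes items visible."""

    def __init__(self, flush_every=None):
        self.buffer = []      # items held in memory during the request
        self.committed = []   # items that processors could actually claim
        self.flush_every = flush_every  # hypothetical knob, not a Purge setting

    def add(self, tag):
        self.buffer.append(tag)
        if self.flush_every and len(self.buffer) >= self.flush_every:
            self.commit()

    def commit(self):
        """Move buffered items to the visible queue (the end-of-request step)."""
        self.committed.extend(self.buffer)
        self.buffer.clear()


def run_migration(queue, rows):
    """Simulate a long job; record the visible queue size after each row."""
    visible = []
    for row in range(rows):
        queue.add(f"node:{row}")
        visible.append(len(queue.committed))
    queue.commit()  # end of request
    return visible
```

Without `flush_every`, the visible queue stays empty for the whole run and then receives every item at once; with it, processors see work arriving in steady increments.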

  • achap 🇦🇺

    Our workaround for the above was to re-architect our migration using the Queue API to process 1 item at a time, and give our queue worker a cron lease time of 1 hour (same as cron run interval). This way the buffer is emptied once per hour at least and it doesn't overwhelm the purge queue. Hope that helps someone.
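
As a rough illustration of that workaround, a worker can process items one at a time until a lease expires and leave the rest for the next run. This is a generic Python sketch, not Drupal's Queue API; `work_queue`, `handle`, and the injectable `clock` are names invented for the example.

```python
import time
from collections import deque


def work_queue(items, handle, lease_seconds, clock=time.monotonic):
    """Process items one at a time until the queue is empty or the lease ends.

    Returns the number of items processed; unprocessed items stay queued.
    """
    deadline = clock() + lease_seconds
    processed = 0
    while items and clock() < deadline:
        handle(items.popleft())
        processed += 1
    return processed
```

Because each run ends within its lease, anything the run buffered gets committed at a predictable interval instead of after a multi-hour job.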

  • 🇨🇦 Canada rbrownell Ottawa, Ontario, Canada

    This error baffles me. I understand that re-architecting is the usual solution, but that is not an option when the business requirements of the project call for timely and rapid updating of a large volume of nodes/pages.

    Please correct me if I am wrong, but it is my understanding that queues are supposed to help prevent servers from crashing by regulating the volume of data being sent to whatever system is receiving it. This would presumably occur in smaller batches instead of all at once. The fact that the queue stops processing after reaching a certain threshold suggests to me that it is not really a proper queue that processes things in smaller batches, but rather a dumping ground whose contents are then sent all at once. There's got to be a better way of handling this than just stopping everything. There are mechanisms that could be added to keep servers from crashing under heavy data volume.

  • Status changed to Needs review about 1 year ago
  • 🇺🇸 United States japerry KVUO

    This error typically occurs when cron is misconfigured (or not configured), or during a migration or other process where lots of invalidations happen at once.

    To counter this edge case, I added a new flag to the state system called purge.dangerous. If you set it in settings or with drush sset purge.dangerous TRUE, you should be able to have the purger run with over 100,000 items in the queue.
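
In outline, an override flag of this kind works as a guard around the size check. The Python below is a minimal sketch of that pattern, not the module's implementation: only the state key `purge.dangerous` and the 100 000 threshold come from this thread, while `queue_blocked` and the dictionary standing in for the state system are assumptions.

```python
HARD_LIMIT = 100_000  # threshold quoted in the error message


def queue_blocked(queue_size, state):
    """Return True when invalidation should be shut down for this queue size."""
    # The override flag disables the shutdown entirely (assumed behaviour).
    if state.get("purge.dangerous", False):
        return False
    return queue_size > HARD_LIMIT
```

For the queue size reported at the top of this issue, `queue_blocked(157998, {})` is true, and it becomes false once the flag is set in the state dictionary.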

  • To clear the purge queue:
    drush p-queue-empty

    To add a processor:
    drush p:processor-add drush_purge_queue_work

  • 🇳🇿 New Zealand xurizaemon Ōtepoti, Aotearoa

    That change mentioned in #19 should be available as of 8.x-3.5. The commit doesn't show in this issue as the commit omits the "Issue #3132524" subject. Looks like the fix was in 70b34944.

  • 🇳🇿 New Zealand xurizaemon Ōtepoti, Aotearoa

    We have a site that is periodically affected by this issue. When we investigate, we find that a single value of data in the purge_queue table accounts for far more than 100K rows, which blocks everything except manual queue flushes from then on.

    select distinct max(item_id) sample_id,
           count(*) as count,
           from_unixtime(min(created)) as min_created,
           from_unixtime(max(created)) as max_created,
           data
    from purge_queue
    group by data
    order by count desc
    limit 10
    
    +-----------+---------+---------------------+---------------------+----------------------------------------------------------------------------------------+
    | sample_id | count   | min_created         | max_created         | data                                                                                   |
    +-----------+---------+---------------------+---------------------+----------------------------------------------------------------------------------------+
    |   9417901 | 1305685 | 2024-05-24 14:15:16 | 2024-06-16 21:46:18 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:31:"config:views.view.media_library";i:3;a:0:{}}    |
    |   9417786 |    4944 | 2024-05-24 14:15:32 | 2024-06-16 21:45:09 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:36:"simple_sitemap:example-org-sitemap";i:3;a:0:{}} |
    |   9417791 |    4944 | 2024-05-24 14:15:32 | 2024-06-16 21:45:09 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:36:"simple_sitemap:example-net-sitemap";i:3;a:0:{}} |
    |   9417326 |    4686 | 2024-05-24 14:15:32 | 2024-06-16 21:40:22 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:34:"simple_sitemap:example-com-sitemap";i:3;a:0:{}} |
    |   9417321 |    4381 | 2024-05-24 14:15:32 | 2024-06-16 21:40:22 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:26:"simple_sitemap:example-sitemap";i:3;a:0:{}}     |
    |   9416566 |    4275 | 2024-05-24 14:15:32 | 2024-06-16 21:35:17 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:31:"simple_sitemap:example2-sitemap";i:3;a:0:{}}    |
    |   9417031 |     188 | 2024-05-26 20:36:28 | 2024-06-16 21:38:15 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:9:"node_list";i:3;a:0:{}}                           |
    |   9404071 |     176 | 2024-05-26 02:01:34 | 2024-06-16 19:45:47 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:19:"config:webform_list";i:3;a:0:{}}                |
    |   9404076 |     176 | 2024-05-26 02:01:34 | 2024-06-16 19:45:47 | a:4:{i:0;s:3:"tag";i:1;a:0:{}i:2;s:23:"webform_submission_list";i:3;a:0:{}}            |
    +-----------+---------+---------------------+---------------------+----------------------------------------------------------------------------------------+

    If others are observing this issue, I'm interested to know whether running the query above on their site reveals a similar profile, i.e. that when grouped by the data column, the entries in purge_queue are heavily dominated by a single value of data.
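
For anyone who wants to try the same kind of profiling outside MySQL, here is a small Python sketch that runs an equivalent GROUP BY against an in-memory SQLite table. The rows are made-up toy data, not the figures above; `top_queue_items` and `dominant_share` are hypothetical helper names, and the MySQL-specific `from_unixtime` columns are left out.

```python
import sqlite3


def top_queue_items(conn, limit=10):
    """Group purge_queue rows by their data column, most frequent first."""
    return conn.execute(
        "SELECT data, COUNT(*) AS n FROM purge_queue "
        "GROUP BY data ORDER BY n DESC, data LIMIT ?",
        (limit,),
    ).fetchall()


def dominant_share(conn):
    """Fraction of all queued rows taken up by the most frequent data value."""
    total = conn.execute("SELECT COUNT(*) FROM purge_queue").fetchone()[0]
    if total == 0:
        return 0.0
    return top_queue_items(conn, 1)[0][1] / total


# Toy in-memory table with made-up rows (not the figures from the post).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purge_queue "
    "(item_id INTEGER PRIMARY KEY, data TEXT, created INTEGER)"
)
rows = [("tag:config:views.view.media_library",)] * 90 + [("tag:node_list",)] * 10
conn.executemany("INSERT INTO purge_queue (data, created) VALUES (?, 0)", rows)
```

A `dominant_share` close to 1 matches the profile described in this comment, where one repeated tag makes up nearly the whole queue.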

  • 🇮🇹 Italy apaderno Brescia, 🇮🇹
  • 🇫🇷 France O'Briat Nantes

    Have a look at the patch in the "Deduplicate Queued Items" issue.
