- 🇳🇿New Zealand RoSk0 Wellington
Thanks a lot for the purge_queues module, Jonathan! That's a real game changer! My queue was growing to millions of items, outpacing the purge cron job that runs every minute. Local tests look great; we will see what it looks like on prod.
I believe the purge_queues module could be a great addition to purge itself.
- 🇳🇿New Zealand ericgsmith
We have been investigating performance issues caused by duplicate items when using purge in combination with the purge_queuer_url module.
We have encountered issues in two areas: 1. duplicates in the buffer and 2. duplicates in the queue.
Duplicate items in the buffer
I can see that when an invalidation is created in the InvalidationsService, it uses an instanceCounter to generate a unique integer ID for the invalidation object. When an invalidation is added to the buffer, the buffer calls has to see if that ID has already been added. Because every new invalidation object gets a fresh ID, this check does not catch two invalidations that carry the same type and expression.
Queuers seem to make some attempt to reduce duplicates, e.g. by filtering out previously requested tags, but certain situations such as config imports can push thousands of duplicates into the buffer, which can lead to high memory consumption.
While I have been looking at this only in the context of the url/path queuer, I wonder if the queuers themselves could set either an ID or another property on the invalidation that can be used to dedupe it. E.g. the url registry maintains a list of urls, so the url ID could be considered unique. Individual cache tags could also be considered unique. Other plugins may have difficulty determining their uniqueness, but allowing an ID to be set, with a fallback to the instance counter, could help plugins where this is problematic (e.g. the url queuer) to be more efficient.
Without having looked through all the code, I would be interested in the maintainers' thoughts, as it appears that getId on the invalidation plugin is (according to my IDE) mainly used by the buffer and tests.
Would there be any reasons against:
- changing the return type of getId in InvalidationInterface to string
- introducing a third optional parameter on InvalidationsService->get to allow an ID to be provided on creation
- introducing fallback behaviour so that a unique ID is generated if none is provided
That would then allow queuers to make changes to provide a unique value when creating an invalidation, and the existing buffer deduping code may not need to change.
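To make the proposal concrete, here is a minimal language-neutral sketch in Python rather than the module's PHP. The Invalidation and Buffer classes are hypothetical stand-ins, not the Purge API; the only things taken from the discussion above are the optional ID, the instance-counter fallback, and a buffer that checks has before adding.

```python
from itertools import count
from typing import Optional


class Invalidation:
    """Hypothetical stand-in for an invalidation object (not the Purge API)."""

    _instance_counter = count(1)

    def __init__(self, type_: str, expression: str, id_: Optional[str] = None):
        self.type = type_
        self.expression = expression
        # Fallback: no ID supplied by the queuer, so use a per-instance one.
        if id_ is not None:
            self.id = id_
        else:
            self.id = f"instance:{next(self._instance_counter)}"


class Buffer:
    """Hypothetical buffer that dedupes purely on the invalidation ID."""

    def __init__(self):
        self._items = {}

    def has(self, invalidation: Invalidation) -> bool:
        return invalidation.id in self._items

    def set(self, invalidation: Invalidation) -> bool:
        # Add the invalidation unless its ID is already buffered.
        if self.has(invalidation):
            return False
        self._items[invalidation.id] = invalidation
        return True

    def __len__(self) -> int:
        return len(self._items)


buffer = Buffer()

# A queuer that knows its items are unique per tag supplies a stable ID,
# so 1000 requests for the same tag collapse into one buffered item.
for _ in range(1000):
    buffer.set(Invalidation("tag", "media_list", id_="tag:media_list"))

# A queuer that supplies no ID keeps today's behaviour: every instance
# gets its own ID, so identical expressions are both buffered.
buffer.set(Invalidation("url", "https://example.com/a"))
buffer.set(Invalidation("url", "https://example.com/a"))

print(len(buffer))
```

In this sketch the buffer code itself never changes; only the queuer decides whether a stable ID is available.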
Duplicate items in the queue
We are using the module @jonhattan provided, but the checks for duplicate items can be problematic for repeated large updates (e.g. in our case it was multiple batch calls that each invalidated the media_list tag).
@RoSk0 raised an idea (offline) of storing a unique identifier for each queued item, so that a database queue can use upsert queries instead of insert queries. We have a proof of concept that does this by hashing the type and expression value of the data, but it would be easier with an enforced / persisted unique ID for an invalidation item. We would be interested in any thoughts on this approach.
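The hashing idea can be sketched roughly as follows, using Python and SQLite purely for illustration. This is not the proof of concept mentioned above: the data_hash column, the item_hash and enqueue helpers, and the stored data format are all assumptions, and SQLite's INSERT OR IGNORE stands in for whatever upsert syntax the real database queue would use.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purge_queue ("
    " item_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " data TEXT NOT NULL,"
    " data_hash TEXT NOT NULL UNIQUE)"  # hypothetical column for deduping
)


def item_hash(type_: str, expression: str) -> str:
    # Hash the type and expression; the NUL separator keeps
    # ("ab", "c") and ("a", "bc") from producing the same input.
    raw = f"{type_}\x00{expression}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def enqueue(type_: str, expression: str) -> None:
    # The UNIQUE constraint makes the database skip rows whose hash is
    # already queued, so no separate lookup query is needed.
    conn.execute(
        "INSERT OR IGNORE INTO purge_queue (data, data_hash) VALUES (?, ?)",
        (f"{type_}:{expression}", item_hash(type_, expression)),
    )


# Several batch calls that each invalidate the same tag.
for _ in range(3):
    enqueue("tag", "media_list")
enqueue("tag", "node:1")
conn.commit()

queued_count = conn.execute("SELECT COUNT(*) FROM purge_queue").fetchone()[0]
print(queued_count)
```

The design point is that uniqueness is enforced by the database in a single statement, rather than by a read followed by a write.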
- 🇫🇷France O'Briat Nantes
I confirm that duplicated invalidations occur when Drupal is importing or regularly updating large volumes of content.
A simple solution could be to delete all items with identical "data" when purging an item?
Or just add a global duplicate deletion at the end of every purger; here's some pseudo code:
SELECT MAX(item_id) AS keep_id, data FROM purge_queue GROUP BY data HAVING COUNT(*) > 1
foreach (keep_id, data) row returned:
  DELETE FROM purge_queue WHERE data = :data AND item_id != :keep_id
- 🇮🇳India Santhoshkumar
We have identified a similar issue when using the purge_queuer_coretags module. There are two issues we identified:
- The same cache tags are inserted into the purge_queue table multiple times.
- Because duplicate tags are inserted into the purge_queue table, we frequently hit the "your queue exceeded 100 000 items! Purge shut down" issue.
To fix the issue we added the patch duplicate_purge_tags.patch. The patch does a DB lookup before inserting into purge_queue and also keeps the tags it has already seen in a static array, to prevent multiple database calls for the same tag.
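The approach described for the patch (lookup before insert, plus an in-memory list of tags already handled) might look roughly like this in Python with SQLite. This is not the contents of duplicate_purge_tags.patch; the TagQueuer class, its attributes, and the simplified table are assumptions made for illustration only.

```python
import sqlite3


class TagQueuer:
    """Hypothetical queuer: checks memory first, then the database."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.seen_tags = set()  # plays the role of the static array
        self.db_lookups = 0     # counted only to show the saving

    def queue_tag(self, tag: str) -> bool:
        # Already handled in this process: skip without touching the DB.
        if tag in self.seen_tags:
            return False
        self.seen_tags.add(tag)

        # First time we see this tag: check whether it is already queued.
        self.db_lookups += 1
        row = self.conn.execute(
            "SELECT 1 FROM purge_queue WHERE data = ? LIMIT 1", (tag,)
        ).fetchone()
        if row is not None:
            return False

        self.conn.execute("INSERT INTO purge_queue (data) VALUES (?)", (tag,))
        return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purge_queue (item_id INTEGER PRIMARY KEY, data TEXT)")

queuer = TagQueuer(conn)
results = [
    queuer.queue_tag(tag)
    for tag in ["media_list", "media_list", "node:1", "media_list"]
]
print(results, queuer.db_lookups)
```

One limitation of this pattern is that the lookup and the insert are separate statements, so two concurrent requests could still both insert the same tag; a unique constraint on the table would close that gap.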
Patch test results (last update 7 months ago): 489 pass, 22 fail