When we have a huge number of nodes, the cron job does not work (504 gateway timeout)

Created on 14 July 2020
Updated 20 June 2023

Repeatable: Always

Steps to repeat:

  • We have more than 1 million nodes
  • /admin/config/system/google-analytics-counter
    • Minimum time to wait before fetching Google Analytics data (in minutes) = 20
    • Number of items to fetch from Google Analytics in one request = 1000
    • Maximum GA API requests per day = 50000
    • Google Analytics query cache (in hours) = 24
    • Queue Time (in seconds) = 600
  • After running cron, I get a 504 gateway timeout error

Expected Results:
Page views are fetched and the counter field is updated.

Actual Results:
Cron does not run successfully and the data is not pulled; a 504 gateway timeout error is returned.

πŸ› Bug report
Status

Fixed

Version

4.0

Component

Code

Created by

🇯🇴 Jordan Ammar Qala


Comments & Activities


  • Status changed to Needs work about 1 year ago
  • 🇸🇰 Slovakia kaszarobert

    The cause of these timeouts is that the module refreshes pageviews for every node. The module processes data as follows (a sketch follows the list):

    1. Cron starts
    2. First, if there are no queue items left, it queries the number of URLs from GA4
    3. Based on the chunk size you have set (default is 1000), it creates that many queue items so that every result is processed in a separate queue item. For example, if there are 2500 URLs in total, it creates 3 queue items for 3 chunks: 1-1000, 1001-2000, 2001-2500
    4. Then comes the currently non-scalable code: it creates a queue item for every published node on the site. This can absolutely lead to a timeout, because building such a huge queue in a single process times out.
    5. Queue processing starts
    6. The 3 queue items created earlier update the URL-pageview counts in the database. We save that information to the google_analytics_counter table.
    7. Next come the queue items that collect the content URL and URL alias for each and every node. These should only run once all 2500 URLs have been collected from GA; otherwise the wrong numbers would be calculated from stale data.
    8. Queue processing ends
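
    A minimal sketch of this flow, assuming hypothetical function and queue names (the module's real identifiers differ):

    <?php

    /**
     * Implements hook_cron(). Illustrative only.
     */
    function mymodule_cron() {
      $fetch_queue = \Drupal::queue('mymodule_fetch');
      if ($fetch_queue->numberOfItems() > 0) {
        // Step 2: let the existing queue drain first.
        return;
      }
      // Hypothetical helper returning the total number of URLs GA4 reports.
      $total = mymodule_get_ga_url_count();
      $chunk = 1000;
      // Step 3: one "fetch" queue item per chunk of GA results.
      for ($start = 0; $start < $total; $start += $chunk) {
        $fetch_queue->createItem(['start' => $start, 'limit' => $chunk]);
      }
      // Step 4, the non-scalable part: one "count" item per published node.
      // With over a million nodes, this loop alone can exceed the timeout.
      $nids = \Drupal::entityQuery('node')
        ->condition('status', 1)
        ->accessCheck(FALSE)
        ->execute();
      $count_queue = \Drupal::queue('mymodule_count');
      foreach ($nids as $nid) {
        $count_queue->createItem(['nid' => $nid]);
      }
    }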

    I pushed one possible solution for the timeouts to branch 4.0.x, with 2 settings that let you limit the size of the queue (sketched after the list):
    - Update pageviews for content created in the last X days
    - Update pageviews for the last X content
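
    A minimal sketch of how those two settings could constrain the node query; the config name and key names here are assumptions for illustration, not the actual setting names:

    <?php

    // Assumed config name and keys, for illustration only.
    $config = \Drupal::config('google_analytics_counter.settings');
    $query = \Drupal::entityQuery('node')
      ->condition('status', 1)
      ->accessCheck(FALSE);
    if ($days = $config->get('node_created_days_limit')) {
      // Only queue nodes created in the last X days.
      $cutoff = \Drupal::time()->getRequestTime() - $days * 86400;
      $query->condition('created', $cutoff, '>=');
    }
    if ($max = $config->get('node_count_limit')) {
      // Only queue the most recent X nodes.
      $query->sort('created', 'DESC')->range(0, $max);
    }
    $nids = $query->execute();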

    This will do as a hotfix, but the proper solution would be a major rewrite. The approaches I can think of (the second is sketched below):
    - Don't build the whole queue with millions of items, but give the user a setting to limit it. This way, the cron process also needs to reliably track how many nodes have been put into the queue and how many still need to be added.
    - Don't use queues at all. On each cron run, save the last processed nid and just process the next X nodes.
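
    A minimal sketch of that second, queue-less approach, with hypothetical names; it keeps a cursor in the State API and advances by a fixed batch per cron run:

    <?php

    /**
     * Implements hook_cron(). Illustrative only.
     */
    function mymodule_cron() {
      $state = \Drupal::state();
      $last_nid = $state->get('mymodule.last_nid', 0);
      $batch_size = 500;

      $nids = \Drupal::entityQuery('node')
        ->condition('status', 1)
        ->condition('nid', $last_nid, '>')
        ->sort('nid')
        ->range(0, $batch_size)
        ->accessCheck(FALSE)
        ->execute();

      foreach ($nids as $nid) {
        // Hypothetical helper that recalculates one node's pageview count.
        mymodule_update_node_count($nid);
        $last_nid = $nid;
      }

      // Start over from the first node once every node has been visited.
      $state->set('mymodule.last_nid', $nids ? $last_nid : 0);
    }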

    I don't know whether we should backport this solution to 8.x-3.x: July 1st, 2023 is so close, and 8.x-3.x becomes useless once Universal Analytics stops working; in fall 2023 Universal Analytics data will no longer be available.

  • Status changed to Fixed about 1 year ago
  • 🇸🇰 Slovakia kaszarobert

    I decided to go with a solution similar to the one I suggested back then in ✨ Make google_analytics_counter_cron() faster (Fixed). So from now on, during a cron run:
    - if the queue is not empty, skip and let the queue processors finish processing every queue item
    - if the queue is empty, get the number of URLs from GA4
    - save (number of URLs / chunk setting) "fetch" queue items; for example, 4500 URLs with a chunk setting of 1000 means creating 5 "fetch" queue items: 1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-4500
    - Now comes the change I made: instead of collecting all the published node IDs into separate queue items, we create exactly 1 "count" queue item, for the first node.
    - Then queue processing starts. When that "count" queue item finishes processing, instead of exiting, it creates the next 1 "count" queue item. The queue processor immediately sees that there is still 1 queue item and, if there is time left (the default is 120 seconds), processes it; that item in turn creates the next "count" queue item, and so on while there are nodes left to process. This way we don't make an undesirable number of database writes and we won't hit the PHP process timeout. The downside of this solution is that it can't run in parallel, because there is only ever 1 "count" queue item, but I think the advantages outweigh the disadvantages when it comes to scaling the site. A minimal sketch of such a worker follows.
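
    This sketch assumes a hypothetical plugin ID, item shape, and helper function; real code would also inject dependencies instead of calling \Drupal statically:

    <?php

    namespace Drupal\mymodule\Plugin\QueueWorker;

    use Drupal\Core\Queue\QueueWorkerBase;

    /**
     * Processes one node, then enqueues exactly one item for the next node.
     *
     * @QueueWorker(
     *   id = "mymodule_count",
     *   title = @Translation("Count pageviews per node"),
     *   cron = {"time" = 120}
     * )
     */
    class CountWorker extends QueueWorkerBase {

      public function processItem($data) {
        // Hypothetical helper that stores the pageview count for this node.
        mymodule_update_node_count($data['nid']);

        // The chaining step: the queue never holds more than one "count"
        // item, so millions of rows are never written up front.
        $next = \Drupal::entityQuery('node')
          ->condition('status', 1)
          ->condition('nid', $data['nid'], '>')
          ->sort('nid')
          ->range(0, 1)
          ->accessCheck(FALSE)
          ->execute();
        if ($next) {
          \Drupal::queue('mymodule_count')->createItem(['nid' => reset($next)]);
        }
      }

    }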

  • Automatically closed - issue fixed for 2 weeks with no activity.
