SchedulerManager::publish query isn't optimised, resulting in slow queries on sites with lots of revisions

Created on 16 December 2022, about 2 years ago
Updated 14 October 2024, 2 months ago

Problem/Motivation

Our site uses drush to call scheduler's lightweight cron every minute, to publish scheduled content.
We have recently noticed that we are getting entries in our mysql slow query log every time it runs, due to the query in SchedulerManager::publish().

Example of the slow query:

# Query_time: 12.723781  Lock_time: 0.000127 Rows_sent: 17  Rows_examined: 35198933
SET timestamp=1671124649;
SELECT "base_table"."vid" AS "vid", "base_table"."nid" AS "nid", "node_field_revision_2"."publish_on" AS "publish_on"
FROM
"node_revision" "base_table"
LEFT OUTER JOIN "node_revision" "base_table_2" ON "base_table"."nid" = "base_table_2"."nid" AND "base_table"."vid" < "base_table_2"."vid"
INNER JOIN "node_field_revision" "node_field_revision" ON "node_field_revision"."vid" = "base_table"."vid"
INNER JOIN "node_field_data" "node_field_data" ON "node_field_data"."nid" = "base_table"."nid"
LEFT JOIN "node_field_revision" "node_field_revision_2" ON "node_field_revision_2"."vid" = "base_table"."vid"
WHERE ("base_table_2"."nid" IS NULL) AND ("node_field_revision"."publish_on" IS NOT NULL) AND ("node_field_revision"."publish_on" <= '1671124636') AND ("node_field_data"."type" IN ('announcement', 'campaign', 'campaign_case_study', 'consultation', 'detailed_guide', 'guide', 'html_document', 'html_document_contents', 'page', 'publication', 'stats_announcement', 'stats_background', 'stats_html_data', 'stats_release', 'stats_series'))
ORDER BY "node_field_revision_2"."publish_on" ASC, "base_table"."nid" ASC;

Running the query through EXPLAIN gives us:

+----+-------------+-----------------------+------------+--------+--------------------------------------------------------------------------+-----------+---------+---------------------------------+---------+----------+----------------------------------------------+
| id | select_type | table                 | partitions | type   | possible_keys                                                            | key       | key_len | ref                             | rows    | filtered | Extra                                        |
+----+-------------+-----------------------+------------+--------+--------------------------------------------------------------------------+-----------+---------+---------------------------------+---------+----------+----------------------------------------------+
|  1 | SIMPLE      | node_field_revision   | NULL       | ALL    | PRIMARY                                                                  | NULL      | NULL    | NULL                            | 1346298 |    30.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | base_table            | NULL       | eq_ref | PRIMARY,node__nid                                                        | PRIMARY   | 4       | drupal8.node_field_revision.vid |       1 |   100.00 | NULL                                         |
|  1 | SIMPLE      | node_field_revision_2 | NULL       | ref    | PRIMARY                                                                  | PRIMARY   | 4       | drupal8.node_field_revision.vid |       1 |   100.00 | NULL                                         |
|  1 | SIMPLE      | node_field_data       | NULL       | ref    | PRIMARY,node__id__default_langcode__langcode,node_field__type__target_id | PRIMARY   | 4       | drupal8.base_table.nid          |       1 |   100.00 | Using where                                  |
|  1 | SIMPLE      | base_table_2          | NULL       | ref    | PRIMARY,node__nid                                                        | node__nid | 4       | drupal8.base_table.nid          |      17 |   100.00 | Using where; Not exists; Using index         |
+----+-------------+-----------------------+------------+--------+--------------------------------------------------------------------------+-----------+---------+---------------------------------+---------+----------+----------------------------------------------+

Notice there is no key used on node_field_revision. I tried adding a key on publish_on, which then gets used by the query, but it doesn't seem to make a noticeable improvement in query speed.

Can we optimise the query or tables somehow to reduce the query time?

Steps to reproduce

Run drush scheduler-cron on a site with more than 100,000 revisions.

🐛 Bug report
Status

Active

Version

1.0

Component

Miscellaneous

Created by

🇬🇧United Kingdom nicrodgers Monmouthshire, UK

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

Production build 0.71.5 2024