Optimization of revisions query in cm_revision_delete_get_revisions

Created on 10 January 2024, 6 months ago

Problem/Motivation

On a large project we are experiencing a slow query for the following one:

  $database = \Drupal::database();
  $query = $database->select('node_revision', 'nr');
  $query->addJoin(
    'INNER',
    'content_moderation_state_field_revision',
    'cmsfr',
    'nr.nid = cmsfr.content_entity_id and nr.vid = cmsfr.content_entity_revision_id'
  );
  $query
    ->fields('nr', ['vid'])
    ->condition('nr.nid', $nid)
    ->condition('nr.vid', $latest_default_vid, '<')
    ->condition('cmsfr.langcode', $langcode)
    ->condition('cmsfr.revision_translation_affected', 'NULL','<>')
    ->condition('cmsfr.moderation_state', $published_states, 'NOT IN')
    ->orderBy('nr.vid', 'DESC');

  $revisions = $query->execute()->fetchCol();

node_revision table has 64075 rows.
content_moderation_state_field_revision table has 279088 rows.

The result is a slow query log like the following:

SELECT "nr"."vid" AS "vid"
FROM
"node_revision" "nr"
INNER JOIN "content_moderation_state_field_revision" "cmsfr" ON nr.nid = cmsfr.content_entity_id and nr.vid = cmsfr.content_entity_revision_id
WHERE ("nr"."nid" = '6803') AND ("nr"."vid" < '3572528') AND ("cmsfr"."langcode" = 'en') AND ("cmsfr"."revision_translation_affected" <> 'NULL') AND ("cmsfr"."moderation_state" NOT IN (''))
ORDER BY "nr"."vid" DESC;
# User@Host: ] @  [192.168.0.7]
# Thread_id: 9928804  Schema: lvj4ezwkrnyws  QC_hit: No
# Query_time: 3.000977  Lock_time: 0.000158  Rows_sent: 0  Rows_examined: 283363
# Rows_affected: 0  Bytes_sent: 90
# Tmp_tables: 1  Tmp_disk_tables: 0  Tmp_table_sizes: 0
# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: No
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: No
#
# explain: id	select_type	table	type	possible_keys	key	key_len	ref	rows	r_rows	filtered	r_filtered	Extra
# explain: 1	SIMPLE	cmsfr	ALL	NULL	NULL	NULL	NULL	280345	279114.00	100.00	67.09	Using where; Using temporary; Using filesort
# explain: 1	SIMPLE	nr	eq_ref	PRIMARY,node__nid	PRIMARY	4	cmsfr.content_entity_revision_id	1	0.02	0.01	0.00	Using where

Query_time: 3.000977

Do you see a possible way to further optimise this for larger tables?

πŸ’¬ Support request
Status

Active

Version

1.1

Component

Code

Created by

πŸ‡·πŸ‡΄Romania bogdan.racz

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Production build 0.69.0 2024