Add bundle/type column to entity_revision tables

Created on 14 August 2020, over 4 years ago
Updated 6 March 2025, about 1 month ago

Problem/Motivation

Queries against latest revisions, node_field_revision (or any revisions-base search) that attempt to filter by the type of entity are very slow. This is due to a requirement on a join with the node_field_data table that cannot use any database indexes.

See the below query from a views-based content list with latest revisions.

SELECT
  node_field_data_node_field_revision.langcode AS node_field_data_node_field_revision_langcode,
  node_field_revision.langcode AS node_field_revision_langcode,
  users_field_data_node_field_revision.langcode AS users_field_data_node_field_revision_langcode,
  node_field_revision.vid AS vid,
  node_field_data_node_field_revision.nid AS node_field_data_node_field_revision_nid,
  users_field_data_node_field_revision.uid AS users_field_data_node_field_revision_uid
FROM
  node_field_revision node_field_revision
  LEFT JOIN node_field_data node_field_data_node_field_revision ON node_field_revision.nid = node_field_data_node_field_revision.nid
  AND node_field_data_node_field_revision.langcode = node_field_revision.langcode
  LEFT JOIN users_field_data users_field_data_node_field_revision ON node_field_revision.uid = users_field_data_node_field_revision.uid
  LEFT JOIN node_field_revision node_field_revision2 ON node_field_revision.nid = node_field_revision2.nid
  AND (
    node_field_revision2.vid > node_field_revision.vid
    AND node_field_revision2.langcode = node_field_revision.langcode
    AND node_field_revision2.revision_translation_affected = 1
  )
WHERE
  (node_field_revision2.nid IS NULL)
  AND (
    node_field_revision.revision_translation_affected = 1
  )
  AND (
    node_field_data_node_field_revision.type IN ('event')
  )
ORDER BY
  node_field_revision.changed DESC
LIMIT
  21 OFFSET 40

Note that we are using content moderation on our site.

Output of MySQL EXPLAIN for the above query

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: node_field_data_node_field_revision
   partitions: NULL
         type: ref
possible_keys: PRIMARY,node__id__default_langcode__langcode,node_field__type__target_id,node__type__nid__langcode
          key: node_field__type__target_id
      key_len: 34
          ref: const
         rows: 610
     filtered: 100.00
        Extra: Using where; Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: node_field_revision
   partitions: NULL
         type: ref
possible_keys: node__id__default_langcode__langcode
          key: node__id__default_langcode__langcode
      key_len: 4
          ref: web.node_field_data_node_field_revision.nid
         rows: 2
     filtered: 2.05
        Extra: Using index condition; Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: users_field_data_node_field_revision
   partitions: NULL
         type: ref
possible_keys: PRIMARY,user__id__default_langcode__langcode
          key: PRIMARY
      key_len: 4
          ref: web.node_field_revision.uid
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: node_field_revision2
   partitions: NULL
         type: ref
possible_keys: PRIMARY,node__id__default_langcode__langcode
          key: node__id__default_langcode__langcode
      key_len: 4
          ref: web.node_field_data_node_field_revision.nid
         rows: 2
     filtered: 100.00
        Extra: Using where; Not exists

Steps to reproduce

  • Build a view on moderated content on a database with a decent amount of content.
  • Add a filter to only show the "latest" revision.
  • From the view page (admin/content/moderated or wherever your view lives), perform a filter on a content type that has many revisions.

Note: Our site has 8 content types with less than 100k nodes and around 450k revisions. We noted that filtering against the page content type was significantly faster than queries against any of our custom content types. This might be superfluous, but I note it here for completeness.

Proposed resolution

  • Add a type column on node_field_revision, and possibly onto all ENTITY_TYPE_field_revision tables.
  • Add an index on (type, nid, langcode, vid) to speed up queries for this common request type.
  • Update queries like the one above to use a condition of node_field_revision.type IN ('event') instead of node_field_data_node_field_revision.type IN ('event')

When we did the above on our local environment, the query went from ~55s to ~0.38s

Remaining tasks

Agree on an implementation plan.

User interface changes

None.

API changes

None.

Data model changes

  • New column on ENTITY_TYPE_field_revision tables.
  • Entity system updated to populate this field
  • Core updated to make use of this field when performing type joins on revisions queries.
✨ Feature request
Status

Needs work

Version

11.0 πŸ”₯

Component

entity system

Created by

πŸ‡ΊπŸ‡ΈUnited States merauluka

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

Production build 0.71.5 2024