Performance edge case: node type index limited to 4 characters

Created on 25 February 2016, over 8 years ago
Updated 1 February 2023, over 1 year ago

Problem/Motivation

I work on a Drupal 7 site with over 13 million nodes, most of which have node type "document". About 300 nodes have node type "document_root". We would expect queries against the "document" type to be slow because of the sheer amount of content, but queries against the "document_root" type to be fast because there are only a small number, and there is an index on the type column. However, this is not the case:

mysql> select sql_no_cache count(*) from node where type='document_root';
+----------+
| count(*) |
+----------+
|      297 |
+----------+
1 row in set (54.10 sec)

This is because the node_type index is limited to 4 characters, so MySQL treats "document" and "document_root" as the same type. This is unexpected and undocumented behaviour, that I traced back to a core commit from 2003: http://cgit.drupalcode.org/drupal/commit/?id=8900590 - this code has survived several refactorings since then.

Drupal 7 defines the key length limit in node_schema():

    'indexes' => array(
      'node_changed'        => array('changed'),
      'node_created'        => array('created'),
      'node_frontpage'      => array('promote', 'status', 'sticky', 'created'),
      'node_status_type'    => array('status', 'type', 'nid'),
      'node_title_type'     => array('title', array('type', 4)),
      'node_type'           => array(array('type', 4)),
      'uid'                 => array('uid'),
      'tnid'                => array('tnid'),
      'translate'           => array('translate'),
      'language'            => array('language'),
    ),

Drupal 8 still explicitly defines the key length for the node__title_type index in NodeStorageSchema::getEntitySchema(), though it has been lost from the node_type index (which is now created automatically).

    $schema['node_field_data']['indexes'] += array(
      'node__frontpage' => array('promote', 'status', 'sticky', 'created'),
      'node__status_type' => array('status', 'type', 'nid'),
      'node__title_type' => array('title', array('type', 4)),
    );

The array($field, $length) format for key lengths is not documented anywhere that I can find. In Drupal 7 this structure is handled by DatabaseSchema_mysql::createKeysSqlHelper() and in Drupal 8 by \Drupal\Core\Database\Driver\mysql\Schema::createKeySql() but both methods have missing docblocks.

This was probably a performance improvement back in 2003, but 13 years later it seems unlikely that this is achieving anything useful except to cause unexpected performance issues in large databases where the first 4 characters of the node type ID happen to coincide.

Proposed resolution

  • Remove the character limit on node type indexes.
  • Document the array($field, $length) format for indexes.

Remaining tasks

  • Decide if this is worth changing.
  • Decide if this is worth backporting to Drupal 7.

User interface changes

None

API changes

None

Data model changes

Database indexes only - faster queries at the expense of slightly larger indexes.

πŸ› Bug report
Status

Needs work

Version

10.1 ✨

Component
DatabaseΒ  β†’

Last updated less than a minute ago

  • Maintained by
  • πŸ‡³πŸ‡±Netherlands @daffie
Created by

πŸ‡¬πŸ‡§United Kingdom longwave UK

Live updates comments and jobs are added and updated live.
  • Performance

    It affects performance. It is often combined with the Needs profiling tag.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

Production build 0.71.5 2024