Performance: Consider using a custom table rather than key_value for pathauto state storage

Created on 17 February 2024, 4 months ago
Updated 20 February 2024, 4 months ago

Problem/Motivation

Currently, pathauto stores its `pathauto_state.*` collection data in the key_value table. On our site, that means that we have:

| collection | key count |
| --- | --- |
| pathauto_state.media | 1931155 |
| pathauto_state.node | 1137720 |

That is over 3 million rows in the key_value table used by pathauto alone. Since key_value is heavily used by Drupal, this leads to a performance hit for any meaningfully sized website.
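
For anyone who wants to check the equivalent figures on their own site, a grouped count over key_value is enough. The following is only a minimal sketch: it assumes the core key_value columns (`collection`, `name`, `value`), and it uses Python with SQLite purely so the example is self-contained. The function name `count_rows_by_collection` is hypothetical; on a real site the same SELECT would be run against the site's own database.

```python
import sqlite3


def count_rows_by_collection(conn: sqlite3.Connection,
                             prefix: str = "pathauto_state."):
    """Return (collection, row_count) pairs for collections starting with prefix."""
    # substr() is used instead of LIKE because "_" is a wildcard in LIKE.
    return conn.execute(
        "SELECT collection, COUNT(*) AS n FROM key_value"
        " WHERE substr(collection, 1, ?) = ?"
        " GROUP BY collection ORDER BY n DESC",
        (len(prefix), prefix),
    ).fetchall()
```

The result is a list of pairs, largest collection first, in the same shape as the table above.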

Steps to reproduce

1. Create a Drupal site. Install pathauto.
2. Use devel:generate to generate 10,000 items of content. Measure page load speed both cached and uncached.
3. Use devel:generate to generate 3,000,000 items of content (this will take a very long time). Measure page load speeds again and compare them with the results from step 2.

Proposed resolution

Give pathauto its own data table so that other processes querying key_value have less work to do. This would also benefit pathauto directly: with control over the schema of the new table, we would no longer need serialization to read and write the data, and could massively reduce overhead within pathauto itself by storing the data in discrete, normalized rows instead.

Remaining tasks

1. Design a pathauto table.
2. Add an update hook to pathauto.install to install the new table and convert the data in key_value to rows in that table.
3. Update the code that writes to or reads from the pathauto_state.* collections to use the new table instead.
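
To make the first two tasks more concrete, here is a rough sketch of what the table and the data conversion could look like. It is not a proposed implementation: Python and SQLite are used only to keep it self-contained, whereas the real change would be a Drupal update hook using the database API. The table name `pathauto_state`, its columns, and the function `migrate_pathauto_state` are all hypothetical. It also assumes that each stored value is a PHP-serialized integer such as `i:1;`, and leaves anything else in key_value untouched.

```python
import re
import sqlite3

PREFIX = "pathauto_state."
# Assumed value format: a PHP-serialized integer, e.g. b"i:1;".
PHP_INT = re.compile(rb"i:(-?\d+);")


def migrate_pathauto_state(conn: sqlite3.Connection) -> int:
    """Move pathauto_state.* rows out of key_value; return rows migrated."""
    with conn:  # one transaction: commit on success, roll back on error
        # Hypothetical schema: one plain row per entity, no serialization.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pathauto_state ("
            " entity_type TEXT NOT NULL,"
            " entity_id TEXT NOT NULL,"
            " state INTEGER NOT NULL,"
            " PRIMARY KEY (entity_type, entity_id))"
        )
        rows = conn.execute(
            "SELECT collection, name, value FROM key_value"
            " WHERE substr(collection, 1, ?) = ?",
            (len(PREFIX), PREFIX),
        ).fetchall()
        migrated = 0
        for collection, name, value in rows:
            raw = value if isinstance(value, bytes) else str(value).encode()
            match = PHP_INT.fullmatch(raw)
            if match is None:
                continue  # unrecognized value: leave it in key_value
            conn.execute(
                "INSERT OR REPLACE INTO pathauto_state VALUES (?, ?, ?)",
                (collection[len(PREFIX):], name, int(match.group(1))),
            )
            conn.execute(
                "DELETE FROM key_value WHERE collection = ? AND name = ?",
                (collection, name),
            )
            migrated += 1
    return migrated
```

This sketch loads every matching row at once, which would not be practical for millions of rows. A real update hook would most likely need to process the data in batches.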

User interface changes

None.

Data model changes

New database table to store pathauto_state.* data.

✨ Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇺🇸 United States apotek


Comments & Activities

  • Issue created by @apotek
  • 🇺🇸 United States apotek
  • 🇨🇭 Switzerland Berdir

    I've been pondering whether using key value was the right decision, but the performance impact of this shouldn't be that big.

    key_value queries should be well optimized and pathauto state should be accessed only when editing/generating aliases.

    Also, have a look at "Use cache collector for state" (Needs review). I've been working on that for years and I'm hopeful that it will finally land soon.

    This is a fairly big change with a possibly slow migration path, so it would need to be really worth doing.
