Configuration management performance regression - slow config:import

Created on 8 June 2021, over 3 years ago
Updated 28 October 2024, about 1 month ago

Problem/Motivation

Drupal 8.8 introduced an API (see #2991683) to manage configuration transformation by allowing event subscribers to edit configuration before import or export. It works by loading the new configuration in a storage object and emitting an event to allow modules to modify the config before import. See ImportStorageTransformer.php.

Importing configuration can be a costly effort in terms of number of database queries, but usually there are no configuration differences, resulting in fast configuration management commands like config:status and config:import. However, since the new API was introduced, the complete site configuration is written to the database for each invocation of the config:status or config:import command, even if there are no changes.

In optimistic scenarios, where the amount of configuration is small or the database is optimized for write speed (instead of data consistency), this has a minor but still noticeable impact on config sync operations. In other scenarios, even for sites with a small amount of configuration, config:import takes more than 10 times longer to complete than it did before the new API was introduced.

The cause of this performance regression is the use of a database storage backend to temporarily store the complete site configuration. The problem is aggravated by not using transactions: with autocommit, each write query is its own transaction, and a database optimized for data consistency (like InnoDB with innodb_flush_log_at_trx_commit set to 1, which is the default for MySQL and MariaDB) flushes its log to disk for every one of them.
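
To illustrate why the missing transaction matters, here is a minimal sketch. It is not Drupal code: it assumes plain PDO with an in-memory SQLite database, and the config_demo table and the writeRows() function are hypothetical names made up for this example. It writes the same rows once with autocommit and once inside a single transaction.

```php
<?php
declare(strict_types=1);

/**
 * Writes $rows into the hypothetical "config_demo" table.
 *
 * Returns the elapsed time in seconds.
 */
function writeRows(PDO $pdo, array $rows, bool $useTransaction): float {
  $pdo->exec('DELETE FROM config_demo');
  $insert = $pdo->prepare('INSERT INTO config_demo (name, data) VALUES (:name, :data)');
  $start = microtime(true);
  if ($useTransaction) {
    // One transaction: the writes only have to be made durable once.
    $pdo->beginTransaction();
  }
  try {
    foreach ($rows as $name => $data) {
      // Without an open transaction, each execute() is committed on its own.
      $insert->execute([':name' => $name, ':data' => $data]);
    }
    if ($useTransaction) {
      $pdo->commit();
    }
  }
  catch (Throwable $e) {
    // Undo the partial copy before re-throwing.
    if ($pdo->inTransaction()) {
      $pdo->rollBack();
    }
    throw $e;
  }
  return microtime(true) - $start;
}

// Assumption: an in-memory SQLite database stands in for the real backend.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE config_demo (name TEXT PRIMARY KEY, data TEXT)');

// Synthetic stand-in for a site's configuration objects.
$rows = [];
for ($i = 0; $i < 2000; $i++) {
  $rows["demo.item.$i"] = serialize(['id' => $i, 'label' => "Item $i"]);
}

$autocommit = writeRows($pdo, $rows, false);
$singleTransaction = writeRows($pdo, $rows, true);
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM config_demo')->fetchColumn();

printf(
  "autocommit: %.4fs, single transaction: %.4fs, rows: %d\n",
  $autocommit,
  $singleTransaction,
  $rowCount
);
```

An in-memory database has nothing to flush to disk, so the two timings will likely be close here. Pointing the DSN at a file-backed SQLite database or a MySQL server should make the per-commit cost visible, but the numbers will depend on the setup.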

Steps to reproduce

The problem can easily be observed by comparing the time it takes to complete config:status or config:import before and after Drupal 8.8. Another way to observe the difference is by manually changing the backend used in ImportStorageTransformer. In the transform() method (see attached patch), change:

$mutable = new DatabaseStorage($this->connection, 'config_import');

into:

$mutable = new MemoryStorage();

The smallest improvement I've seen with this change is 20% (for example, the operation taking 7 instead of 9 seconds), the largest improvement was 95%:

DatabaseStorage:

$ time drush config:status
[notice] No differences between DB and sync directory.

real    1m0.115s
user    0m2.469s
sys     0m0.605s

MemoryStorage:

$ time drush config:status
[notice] No differences between DB and sync directory.

real    0m2.608s
user    0m1.954s
sys     0m0.360s
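
For reference, the percentages quoted above can be recomputed from the wall-clock ("real") times. This is only a worked calculation; the improvementPercent() helper is a name chosen for this example.

```php
<?php
declare(strict_types=1);

/**
 * Returns the relative improvement, in percent, of $after over $before.
 */
function improvementPercent(float $before, float $after): float {
  return ($before - $after) / $before * 100.0;
}

// Wall-clock times from the two runs above, in seconds (1m0.115s = 60.115s).
$databaseStorage = 60.115;
$memoryStorage = 2.608;

$largest = improvementPercent($databaseStorage, $memoryStorage);
// The smaller example from the text: 7 seconds instead of 9.
$smallest = improvementPercent(9.0, 7.0);

printf("largest: %.1f%%, smallest: %.1f%%\n", $largest, $smallest);
// prints "largest: 95.7%, smallest: 22.2%"
```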

Proposed resolution

An obvious solution would be to use the MemoryStorage instead of DatabaseStorage. The comment in ImportStorageTransformer however reads:

We use a database storage to reduce the memory requirement.

This would mean the problem requires a different solution. We could wrap the write queries in a single transaction:

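$transaction = $this->connection->startTransaction();
try {
  // Copy the sync configuration to the created mutable storage.
  self::replaceStorageContents($storage, $mutable);
}
catch (\Exception $e) {
  // Undo the partial copy before re-throwing.
  $transaction->rollBack();
  throw $e;
}
// The transaction is committed once the object goes out of scope.
unset($transaction);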

In my test cases, wrapping the queries in a transaction results in almost the same performance gain as using the MemoryStorage. This might be different in other setups.

I managed to find one comment where the use of DatabaseStorage instead of MemoryStorage was discussed. The comment mentions "concerns with memory usage". I can't speak for the many people using Drupal in different ways, but in my situation (a Drupal site that is not very big, but not small either, with 7 MB of YAML config) I do not notice any difference in peak memory consumption.
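
For anyone who wants to check the memory concern on their own setup, here is a rough sketch of how peak memory could be measured. It does not load real Drupal configuration: buildSyntheticConfig() is a hypothetical helper, and the 3500 objects of about 2 KB each are an assumption chosen to land near 7 MB of payload.

```php
<?php
declare(strict_types=1);

/**
 * Builds $count synthetic config arrays with $bytesEach bytes of payload each.
 */
function buildSyntheticConfig(int $count, int $bytesEach): array {
  $config = [];
  for ($i = 0; $i < $count; $i++) {
    $config["demo.item.$i"] = [
      'id' => $i,
      // Random data, so the entries cannot share one string in memory.
      'payload' => bin2hex(random_bytes(intdiv($bytesEach, 2))),
    ];
  }
  return $config;
}

$before = memory_get_peak_usage(true);

// Assumption: about 7 MB of payload, split over 3500 objects of ~2 KB each.
$config = buildSyntheticConfig(3500, 2048);

$after = memory_get_peak_usage(true);
$payloadBytes = array_sum(array_map(
  fn(array $item): int => strlen($item['payload']),
  $config
));

printf(
  "payload: %.1f MB, peak before: %.1f MB, peak after: %.1f MB\n",
  $payloadBytes / 1048576,
  $before / 1048576,
  $after / 1048576
);
```

Parsed configuration arrays usually take more memory than the YAML files they come from, so this only gives an order of magnitude. Measuring an actual config:import run on the site in question would be more reliable.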

Remaining tasks

  • Discuss: should the transformer use a memory or database backend?
  • Should the backend perhaps be configurable?
  • If database storage is to stay: wrap in transaction

I'd also like to point out the StorageCopyTrait::replaceStorageContents() method has a similar issue. It is used in two places:

  • ExportStorageManager: export would be faster if a MemoryStorage or FileStorage is used
  • ConfigSnapshotSubscriber: creating a full config snapshot in the database after each import (when at least one item is changed) - could someone explain the purpose? Why would someone want this? Snapshotting would also benefit from being wrapped in a transaction.

Changing the ExportStorageManager to use the memory storage or to implement transactions seems like a good idea. Doing the same for the snapshot subscriber might defeat its purpose, though making the snapshot optional might be a valid option. I consider those separate issues.

User interface changes

None.

API changes

None, the API itself is fine.

Data model changes

None.

🐛 Bug report

Status: Needs work
Version: 11.0
Component: configuration system
Created by: jsst (Netherlands)
