Add an index on webform_submission_data name and property

Created on 13 October 2022, over 1 year ago
Updated 5 April 2024, 3 months ago

Problem/Motivation

In reviewing the mysql slow_query_log and the New Relic performance from a site I manage that makes pretty heavy use of Webform, we noticed these 2 queries that are problematic:

# Query 2: 0.04 QPS, 0.05x concurrency, ID 0xF1A2A4225360468875AEB5C47982AD2F at byte 80414
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.48
# Time range: 2022-10-13T11:02:01 to 2022-10-13T14:07:22
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 77 441
# Exec time 33 602s 1s 14s 1s 2s 809ms 1s
# Lock time 0 24ms 45us 123us 54us 93us 13us 47us
# Rows sent 0 631 0 4 1.43 2.90 0.89 1.96
# Rows examine 0 1.86M 392 4.33k 4.31k 4.27k 198.16 4.27k
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 0 47.97k 97 132 111.39 118.34 8.81 112.70
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 93.52k 217 224 217.16 212.52 1.50 212.52
# InnoDB:
# IO r bytes 3 390.78M 0 253.20M 907.39k 0 12.92M 0
# IO r ops 3 24.42k 0 15.83k 56.71 0 822.36 0
# IO r wait 3 19s 0 12s 43ms 0 621ms 0
# pages distin 83 3.22M 1.51k 7.53k 7.47k 7.31k 340.05 7.31k
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# String:
# Databases somesiteomfn5mvk2z
# Hosts database.example.com
# InnoDB trxID 0
# Last errno 0
# Users s282844
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+ #
# EXPLAIN /*!50100 PARTITIONS*/
SELECT "webform_submission_data"."sid" AS "sid"
FROM
"webform_submission_data" "webform_submission_data"
WHERE ("name" = 'my_name_value') AND ("value" = '123456') AND ("webform_id" = 'specific_webform_id')
ORDER BY "sid" ASC\G

# Query 3: 0.00 QPS, 0.02x concurrency, ID 0x497FA2CAF7977985028345411597D32C at byte 70081
# This item is included in the report because it matches --limit.
# Scores: V/M = 3.16
# Time range: 2022-10-13T11:01:54 to 2022-10-13T13:57:41
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 5 31
# Exec time 12 224s 4s 31s 7s 9s 5s 6s
# Lock time 0 2ms 49us 209us 74us 119us 35us 60us
# Rows sent 0 233 0 12 7.52 11.95 3.54 8.91
# Rows examine 63 124.57M 4.02M 4.02M 4.02M 3.86M 0 3.86M
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 0 4.54k 97 176 150 166.51 21.02 158.58
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 5.48k 181 181 181 181 0 181
# InnoDB:
# IO r bytes 4 528.14M 0 528.14M 17.04M 0 89.78M 0
# IO r ops 4 33.01k 0 33.01k 1.06k 0 5.58k 0
# IO r wait 5 25s 0 25s 814ms 0 4s 0
# pages distin 6 247.55k 7.99k 7.99k 7.99k 7.99k 0 7.99k
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# Boolean:
# Full scan 100% yes, 0% no
# String:
# Databases somesiteomfn5mvk2z
# Hosts database.example.com
# InnoDB trxID 0
# Last errno 0
# Users s282844
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+ ####
# EXPLAIN /*!50100 PARTITIONS*/
SELECT "webform_submission_data"."sid" AS "sid"
FROM
"webform_submission_data" "webform_submission_data"
WHERE ("name" = 'my_name_value') AND ("value" = '123456')
ORDER BY "sid" ASC\G

There are many more queries for my situation with different values and the name and webform_id are just for example.

And here is an explain on those:


mysql> explain SELECT webform_submission_data.sid
    -> FROM
    -> webform_submission_data
    -> WHERE (name = 'my_name_value') AND (value = '123456') AND (webform_id = 'my_webform_id')
    -> ORDER BY sid;
+----+-------------+-------------------------+------------+------+---------------+------------+---------+-------+---------+----------+------------------------------------+
| id | select_type | table                   | partitions | type | possible_keys | key        | key_len | ref   | rows    | filtered | Extra                              |
+----+-------------+-------------------------+------------+------+---------------+------------+---------+-------+---------+----------+------------------------------------+
|  1 | SIMPLE      | webform_submission_data | NULL       | ref  | webform_id    | webform_id | 130     | const | 2199925 |     1.00 | Using index condition; Using where |
+----+-------------+-------------------------+------------+------+---------------+------------+---------+-------+---------+----------+------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain SELECT webform_submission_data.sid AS sid FROM webform_submission_data webform_submission_data WHERE (name = 'my_name_value') AND (value = '123456') ORDER BY sid ASC;
+----+-------------+-------------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table                   | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | webform_submission_data | NULL       | index | NULL          | PRIMARY | 1036    | NULL | 4399607 |     1.00 | Using where |
+----+-------------+-------------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+


And then I created this index which took a little less than a minute to create during the period of decently high load.

mysql> create index ix_webform_submission_data_name_property on webform_submission_data (name, property);

And here are the explain plans afterward:

mysql> explain SELECT webform_submission_data.sid FROM webform_submission_data WHERE (name = 'unique_id') AND (value = '094-000284') AND (webform_id = 'annual_owner') ORDER BY sid;
+----+-------------+-------------------------+------------+------+-----------------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
| id | select_type | table                   | partitions | type | possible_keys                                       | key                                      | key_len | ref   | rows  | filtered | Extra                                              |
+----+-------------+-------------------------+------------+------+-----------------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
|  1 | SIMPLE      | webform_submission_data | NULL       | ref  | webform_id,ix_webform_submission_data_name_property | ix_webform_submission_data_name_property | 514     | const | 26124 |     5.00 | Using index condition; Using where; Using filesort |
+----+-------------+-------------------------+------------+------+-----------------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
1 row in set, 1 warning (0.00 sec)


mysql> explain SELECT webform_submission_data.sid AS sid FROM webform_submission_data webform_submission_data WHERE (name = 'my_name_value') AND (value = '123456') ORDER BY sid ASC;
+----+-------------+-------------------------+------------+------+------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
| id | select_type | table                   | partitions | type | possible_keys                            | key                                      | key_len | ref   | rows  | filtered | Extra                                              |
+----+-------------+-------------------------+------------+------+------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
|  1 | SIMPLE      | webform_submission_data | NULL       | ref  | ix_webform_submission_data_name_property | ix_webform_submission_data_name_property | 514     | const | 26124 |    10.00 | Using index condition; Using where; Using filesort |
+----+-------------+-------------------------+------------+------+------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+

So, the result is dramatically better on the select side. I guess that each new insert will take a small extra bit of time, so I can imagine that this solution is not necessarily worthwhile for the module as a whole.

Steps to reproduce

1. Have about 4,197,679 rows in webform_submission_data on a moderate database server.
2. Try querying it on name.

Proposed resolution

Add an index, maybe?

Remaining tasks

1. Decide if this is worthwhile for the module.
2. If so, adjust the schema and add an update hook.

User interface changes

n/a

API changes

n/a

Data model changes

Add an index on webform_submission_data name and property.

Feature request
Status

Needs review

Version

6.2

Component

Code

Created by

🇺🇸United States greggles Denver, Colorado, USA

Live updates comments and jobs are added and updated live.
  • Performance

    It affects performance. It is often combined with the Needs profiling tag.

Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇺🇸United States jrockowitz Brooklyn, NY

    I am open to any performance improvements.

    The patch would add the index to new installations and provide an update hook for existing installations.

  • 🇺🇸United States greggles Denver, Colorado, USA

    Sorry I didn't followup on this. It did dramatically improve performance in our environment, particularly on a crucial operation, and it did not cause any noticeable problems with webform inserts.

  • I'm having performance issues with certain webform queries as well. The webform_submission_data table comprises nearly 24% of one database, taking up over 28M, though I'm sure greggles' is larger.

    I'm using webform views (see this performance issue 🐛 Reverse Entity Webform Submission Relationship is slow on MariaDB <= 10.5 Active ), and adding submission fields into the view means that the webform_submission_data table is joined once for every field.

  • I tried adding an index, and using ANALYZE on the query I'm testing shows that the number of rows improves dramatically.

    Before:

    • select_type: SIMPLE
    • table: webform_submission_data
    • type: ALL
    • possible_keys: webform_id
    • key: \N
    • key_len: \N
    • ref: \N
    • rows: 159725
    • r_rows: 154624
    • filtered: 15.86
    • r_filtered: 0
    • Extra: Using where; Using join buffer (flat, BNL join)

    Afterward:

    • select_type: SIMPLE
    • table: webform_submission_data
    • type: ref
    • possible_keys: webform_id,ix_webform_submission_data_name_property
    • key: ix_webform_submission_data_name_property
    • key_len: 514
    • ref: const
    • rows: 5214
    • r_rows: 5214.00
    • filtered: 15.86
    • r_filtered: 0.03
    • Extra: Using where

    Adding the index improves the query execution time quite a bit as well.

  • I would also like to mention that if we add a patch/MR, it maybe should check if the index already exists, because someone might have created it like greggles did.

  • Open on Drupal.org →
    Core: 10.1.4 + Environment: PHP 8.2 & MySQL 8
    last update 3 months ago
    Waiting for branch to pass
  • Status changed to Needs review 3 months ago
  • I'm new to adding indexed and how Drupal does this. I copied the examples I could find in existing code. Please review.

    Note that I didn't use the same index name that was suggested in the workaround, which used

    create index ix_webform_submission_data_name_property on webform_submission_data (`name`, `property`);
    

    The new index is name_property and is essentially added with the Drupal DB abstraction layer version of:

    ALTER TABLE webform_submission_data ADD INDEX name_property (`name`, `property`);
    

    You can see existing indexes on the table with SHOW INDEXES FROM webform_submission_data;

    You can drop the workaround index with DROP INDEX ix_webform_submission_data_name_property ON webform_submission_data;

  • Pipeline finished with Failed
    3 months ago
    Total: 2427s
    #124618
  • Open in Jenkins → Open on Drupal.org →
    Core: 10.2.x + Environment: PHP 8.1 & MySQL 8
    last update 3 months ago
    506 pass, 48 fail
  • Open on Drupal.org →
    Core: 10.1.4 + Environment: PHP 8.2 & MySQL 8
    last update 3 months ago
    Waiting for branch to pass
  • I changed the hook to webform_update_9600, because it wasn't running. The module currently supports a minimum of Drupal 9.4, and the module version is 6.2.x, so N is 9600.

  • Pipeline finished with Failed
    3 months ago
    Total: 2525s
    #125441
  • I tried including delta in the index as well, but I didn't see any difference/benefit. Does anyone think that should be included, or is the MR good as-is?

  • As a note, I think documentation should be added to say that the following could greatly improve SQL query performance.

    Put in my.cnf on the DB server, or in .ddev/mysql/my.cnf

    [server]
    # Greatly improves performance of some SQL queries.
    optimizer_search_depth = 0
    
  • Status changed to Needs work 3 months ago
  • After applying the update, the site's status report has an error:

    Mismatched entity and/or field definitions

    The following changes were detected in the entity type and field definitions.

    Webform submission

    • The Webform submission entity type needs to be updated.
  • Assigned to solideogloria
  • To fix it after having applied the (not quite correct) patch, run drush webform:repair

    I will apply a fix to the MR soon. It's an easy fix.

  • Open on Drupal.org →
    Core: 10.1.4 + Environment: PHP 8.2 & MySQL 8
    last update 3 months ago
    Waiting for branch to pass
  • Issue was unassigned.
  • Status changed to Needs review 3 months ago
  • Pipeline finished with Failed
    3 months ago
    Total: 2531s
    #136480
  • I would like to note that in MySQL and MariaDB, setting optimizer_search_depth=0 also has a huge performance boost.

Production build 0.69.0 2024