Problem/Motivation
In reviewing the mysql slow_query_log and the New Relic performance from a site I manage that makes pretty heavy use of Webform, we noticed these 2 queries that are problematic:
# Query 2: 0.04 QPS, 0.05x concurrency, ID 0xF1A2A4225360468875AEB5C47982AD2F at byte 80414
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.48
# Time range: 2022-10-13T11:02:01 to 2022-10-13T14:07:22
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 77 441
# Exec time 33 602s 1s 14s 1s 2s 809ms 1s
# Lock time 0 24ms 45us 123us 54us 93us 13us 47us
# Rows sent 0 631 0 4 1.43 2.90 0.89 1.96
# Rows examine 0 1.86M 392 4.33k 4.31k 4.27k 198.16 4.27k
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 0 47.97k 97 132 111.39 118.34 8.81 112.70
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 93.52k 217 224 217.16 212.52 1.50 212.52
# InnoDB:
# IO r bytes 3 390.78M 0 253.20M 907.39k 0 12.92M 0
# IO r ops 3 24.42k 0 15.83k 56.71 0 822.36 0
# IO r wait 3 19s 0 12s 43ms 0 621ms 0
# pages distin 83 3.22M 1.51k 7.53k 7.47k 7.31k 340.05 7.31k
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# String:
# Databases somesiteomfn5mvk2z
# Hosts database.example.com
# InnoDB trxID 0
# Last errno 0
# Users s282844
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+ #
# EXPLAIN /*!50100 PARTITIONS*/
SELECT "webform_submission_data"."sid" AS "sid"
FROM
"webform_submission_data" "webform_submission_data"
WHERE ("name" = 'my_name_value') AND ("value" = '123456') AND ("webform_id" = 'specific_webform_id')
ORDER BY "sid" ASC\G
# Query 3: 0.00 QPS, 0.02x concurrency, ID 0x497FA2CAF7977985028345411597D32C at byte 70081
# This item is included in the report because it matches --limit.
# Scores: V/M = 3.16
# Time range: 2022-10-13T11:01:54 to 2022-10-13T13:57:41
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 5 31
# Exec time 12 224s 4s 31s 7s 9s 5s 6s
# Lock time 0 2ms 49us 209us 74us 119us 35us 60us
# Rows sent 0 233 0 12 7.52 11.95 3.54 8.91
# Rows examine 63 124.57M 4.02M 4.02M 4.02M 3.86M 0 3.86M
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 0 4.54k 97 176 150 166.51 21.02 158.58
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 5.48k 181 181 181 181 0 181
# InnoDB:
# IO r bytes 4 528.14M 0 528.14M 17.04M 0 89.78M 0
# IO r ops 4 33.01k 0 33.01k 1.06k 0 5.58k 0
# IO r wait 5 25s 0 25s 814ms 0 4s 0
# pages distin 6 247.55k 7.99k 7.99k 7.99k 7.99k 0 7.99k
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# Boolean:
# Full scan 100% yes, 0% no
# String:
# Databases somesiteomfn5mvk2z
# Hosts database.example.com
# InnoDB trxID 0
# Last errno 0
# Users s282844
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+ ####
# EXPLAIN /*!50100 PARTITIONS*/
SELECT "webform_submission_data"."sid" AS "sid"
FROM
"webform_submission_data" "webform_submission_data"
WHERE ("name" = 'my_name_value') AND ("value" = '123456')
ORDER BY "sid" ASC\G
There are many more queries for my situation with different values and the name and webform_id are just for example.
And here is an explain on those:
mysql> explain SELECT webform_submission_data.sid
-> FROM
-> webform_submission_data
-> WHERE (name = 'my_name_value') AND (value = '123456') AND (webform_id = 'my_webform_id')
-> ORDER BY sid;
+----+-------------+-------------------------+------------+------+---------------+------------+---------+-------+---------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+------------+------+---------------+------------+---------+-------+---------+----------+------------------------------------+
| 1 | SIMPLE | webform_submission_data | NULL | ref | webform_id | webform_id | 130 | const | 2199925 | 1.00 | Using index condition; Using where |
+----+-------------+-------------------------+------------+------+---------------+------------+---------+-------+---------+----------+------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain SELECT webform_submission_data.sid AS sid FROM webform_submission_data webform_submission_data WHERE (name = 'my_name_value') AND (value = '123456') ORDER BY sid ASC;
+----+-------------+-------------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | webform_submission_data | NULL | index | NULL | PRIMARY | 1036 | NULL | 4399607 | 1.00 | Using where |
+----+-------------+-------------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
And then I created this index which took a little less than a minute to create during the period of decently high load.
mysql> create index ix_webform_submission_data_name_property on webform_submission_data (name, property);
And here are the explain plans afterward:
mysql> explain SELECT webform_submission_data.sid FROM webform_submission_data WHERE (name = 'unique_id') AND (value = '094-000284') AND (webform_id = 'annual_owner') ORDER BY sid;
+----+-------------+-------------------------+------------+------+-----------------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+------------+------+-----------------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
| 1 | SIMPLE | webform_submission_data | NULL | ref | webform_id,ix_webform_submission_data_name_property | ix_webform_submission_data_name_property | 514 | const | 26124 | 5.00 | Using index condition; Using where; Using filesort |
+----+-------------+-------------------------+------------+------+-----------------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain SELECT webform_submission_data.sid AS sid FROM webform_submission_data webform_submission_data WHERE (name = 'my_name_value') AND (value = '123456') ORDER BY sid ASC;
+----+-------------+-------------------------+------------+------+------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+------------+------+------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
| 1 | SIMPLE | webform_submission_data | NULL | ref | ix_webform_submission_data_name_property | ix_webform_submission_data_name_property | 514 | const | 26124 | 10.00 | Using index condition; Using where; Using filesort |
+----+-------------+-------------------------+------------+------+------------------------------------------+------------------------------------------+---------+-------+-------+----------+----------------------------------------------------+
So, the result is dramatically better on the select side. I guess that each new insert will take a small extra bit of time, so I can imagine that this solution is not necessarily worthwhile for the module as a whole.
Steps to reproduce
1. Have about 4,197,679 rows in webform_submission_data on a moderate database server.
2. Try querying it on name.
Proposed resolution
Add an index, maybe?
Remaining tasks
1. Decide if this is worthwhile for the module.
2. If so, adjust the schema and add an update hook.
User interface changes
n/a
API changes
n/a
Data model changes
Add an index on webform_submission_data name and property.