- Issue created by @hfernandes
- Merge request !538Issue #3484204: Adds new index to the 'webform_id' and 'name' in the... β (Open) created by hfernandes
- πΊπΈUnited States jrockowitz Brooklyn, NY
I would lean towards committing β¨ [Performance] Add an index on webform_submission_data name and property Needs review and then make this performance improvement.
- π§π·Brazil hfernandes
@jrockowitz, would it make sense to merge both developments into a single MR? The β¨ [Performance] Add an index on webform_submission_data name and property Needs review issue introduces an index on the 'name' and 'property' values, while this MR adds an index on 'name' and 'webform_id.'.
- πΊπΈUnited States jrockowitz Brooklyn, NY
It is very possible the creating the index proposed in β¨ [Performance] Add an index on webform_submission_data name and property Needs review will address this issue. My understanding is the
name
part of thename,property
will be used by the delete queryI am not sure removing the SQL IN condition will be a significant improvement.
@see https://stackoverflow.com/questions/31500/do-indexes-work-with-in-clause - Issue was unassigned.
- Status changed to Needs review
7 days ago 3:05pm 15 January 2025 - π§π·Brazil hfernandes
Hi @jrockowitz,
Thank you for your suggestion regarding the index on
name,property
. However, I believe that a composite index including bothwebform_id
andname
would provide better performance for the query in question. Hereβs my reasoning:Query Condition Analysis:
Based on the query you referenced, theWHERE
clause includes bothwebform_id
andname
. This means that both columns are involved in filtering the results.
While an index onname,property
may help in some scenarios, it does not fully cover thewebform_id
filter condition.
A composite index onwebform_id
andname
allows the database engine to use both columns together for filtering. This reduces the need for additional scans and increases the speed of the query, especially in large datasets wherewebform_id
is often a primary filter.
Without this composite index, the database would either:- Perform a less efficient scan using a partial index (e.g., on
name
orproperty
alone). - Fall back to a full table scan if no suitable index exists.
Performance Testing Evidence:
To demonstrate the impact, I tested the query on a sample dataset. Below are the results:Query without a composite index:
Query with a composite index on
name
andproperty
:
Query with a composite index on
webform_id
andname
:
These tests show a clear improvement in query performance when using the composite index on
webform_id
andname
and no real impact - the performance got worse - when using a composite index onname
andproperty
.Database Optimization Best Practices:
Itβs a best practice to align indexes with the columns used in theWHERE
clause to minimize row scans.
Adding the composite index onwebform_id
andname
would allow the query planner to leverage it effectively and deliver faster results.I also agree that the use of the
IN
clause in this case is unlikely to cause a significant impact on query performance. However, Iβd like to point out that the post you referenced pertains to SQL Server, which uses a different query engine than MySQL. While some general indexing principles may overlap, itβs important to account for differences in how these databases optimize queries and use indexes. In MySQL, aligning indexes with the columns in theWHERE
clause is a crucial factor in achieving optimal performance on large datasets. - Perform a less efficient scan using a partial index (e.g., on