- Issue created by @hfernandes
- Merge request !538Issue #3484204: Adds new index to the 'webform_id' and 'name' in the... โ (Merged) created by hfernandes
- ๐บ๐ธUnited States jrockowitz Brooklyn, NY
I would lean towards committing โจ [Performance] Add an index on webform_submission_data name and property Needs review and then make this performance improvement.
- ๐ง๐ทBrazil hfernandes
@jrockowitz, would it make sense to merge both developments into a single MR? The โจ [Performance] Add an index on webform_submission_data name and property Needs review issue introduces an index on the 'name' and 'property' values, while this MR adds an index on 'name' and 'webform_id.'.
- ๐บ๐ธUnited States jrockowitz Brooklyn, NY
It is very possible the creating the index proposed in โจ [Performance] Add an index on webform_submission_data name and property Needs review will address this issue. My understanding is the
namepart of thename,propertywill be used by the delete queryI am not sure removing the SQL IN condition will be a significant improvement.
@see https://stackoverflow.com/questions/31500/do-indexes-work-with-in-clause - Issue was unassigned.
- Status changed to Needs review
about 1 year ago 3:05pm 15 January 2025 - ๐ง๐ทBrazil hfernandes
Hi @jrockowitz,
Thank you for your suggestion regarding the index on
name,property. However, I believe that a composite index including bothwebform_idandnamewould provide better performance for the query in question. Hereโs my reasoning:Query Condition Analysis:
Based on the query you referenced, theWHEREclause includes bothwebform_idandname. This means that both columns are involved in filtering the results.
While an index onname,propertymay help in some scenarios, it does not fully cover thewebform_idfilter condition.
A composite index onwebform_idandnameallows the database engine to use both columns together for filtering. This reduces the need for additional scans and increases the speed of the query, especially in large datasets wherewebform_idis often a primary filter.
Without this composite index, the database would either:- Perform a less efficient scan using a partial index (e.g., on
nameorpropertyalone). - Fall back to a full table scan if no suitable index exists.
Performance Testing Evidence:
To demonstrate the impact, I tested the query on a sample dataset. Below are the results:Query without a composite index:
Query with a composite index on
nameandproperty:
Query with a composite index on
webform_idandname:
These tests show a clear improvement in query performance when using the composite index on
webform_idandnameand no real impact - the performance got worse - when using a composite index onnameandproperty.Database Optimization Best Practices:
Itโs a best practice to align indexes with the columns used in theWHEREclause to minimize row scans.
Adding the composite index onwebform_idandnamewould allow the query planner to leverage it effectively and deliver faster results.I also agree that the use of the
INclause in this case is unlikely to cause a significant impact on query performance. However, Iโd like to point out that the post you referenced pertains to SQL Server, which uses a different query engine than MySQL. While some general indexing principles may overlap, itโs important to account for differences in how these databases optimize queries and use indexes. In MySQL, aligning indexes with the columns in theWHEREclause is a crucial factor in achieving optimal performance on large datasets. - Perform a less efficient scan using a partial index (e.g., on
- ๐บ๐ธUnited States jrockowitz Brooklyn, NY
Yep I think we need to add the index and this MR seems to be the simplest solution