- 🇨🇳China lawxen
#42 🐛 Fix IndexTidDepth views argument plugin and TaxonomyIndexTidDepth views filter plugin performance Fixed works for Drupal8.9,
And It brings a tenfold increase in speed:
This issue is to propose two significant optimizations to \Drupal\taxonomy\Plugin\views\argument\IndexTidDepth
and \Drupal\taxonomy\Plugin\views\filter\TaxonomyIndexTidDepth
.
WHERE ... IN ( subselect )
clause, which is inefficient on MySQL if the subselect returns anything but a trivially small result setHere is the complete query HEAD generates for a single TID query with a depth of 3.
SELECT DISTINCT node_field_data.created AS node_field_data_created, node_field_data.nid AS nid FROM node_field_data node_field_data LEFT JOIN taxonomy_index taxonomy_index ON node_field_data.nid = taxonomy_index.nid LEFT JOIN node__field_promote_states node__field_promote_states ON node_field_data.nid = node__field_promote_states.entity_id AND node__field_promote_states.field_promote_states_value = 'landing' WHERE (((node__field_promote_states.field_promote_states_value IS NULL)) AND (node_field_data.nid IN (SELECT tn.nid AS nid FROM taxonomy_index tn LEFT OUTER JOIN taxonomy_term__parent th ON th.entity_id = tn.tid LEFT OUTER JOIN taxonomy_term__parent th1 ON th.parent_target_id = th1.entity_id WHERE (tn.tid = '353') OR (th1.entity_id = '353')))) AND ((taxonomy_index.status = '1') AND (node_field_data.status = '1') AND (node_field_data.type IN ('article', 'recipe'))) ORDER BY node_field_data_created DESC LIMIT 6 OFFSET 5;
+-------------------------+--------+
| node_field_data_created | nid |
+-------------------------+--------+
| 1594378800 | 110686 |
| 1594360800 | 110446 |
| 1593946800 | 110321 |
| 1593342000 | 109946 |
| 1593241200 | 109896 |
| 1592823600 | 109591 |
+-------------------------+--------+
6 rows in set (0.49 sec)
The patch changes the code in two aspects:
The existing code generates subselect following this pattern (tid 353, +3 levels):
explain SELECT tn.nid AS nid FROM taxonomy_index tn LEFT OUTER JOIN taxonomy_term__parent th ON th.entity_id = tn.tid LEFT OUTER JOIN taxonomy_term__parent th1 ON th.parent_target_id = th1.entity_id LEFT OUTER JOIN taxonomy_term__parent th2 ON th1.parent_target_id = th2.entity_id WHERE (tn.tid = '353') OR (th1.entity_id = '353') OR (th2.entity_id = '353');
+----+-------------+-------+------------+-------+---------------+-----------+---------+------------------------------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------------------------------+--------+----------+--------------------------+
| 1 | SIMPLE | tn | NULL | index | term_node | term_node | 14 | NULL | 124367 | 100.00 | Using index |
| 1 | SIMPLE | th | NULL | ref | PRIMARY | PRIMARY | 4 | instyle.tn.tid | 1 | 100.00 | NULL |
| 1 | SIMPLE | th1 | NULL | ref | PRIMARY | PRIMARY | 4 | instyle.th.parent_target_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | th2 | NULL | ref | PRIMARY | PRIMARY | 4 | instyle.th1.parent_target_id | 1 | 100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------------------------------+--------+----------+--------------------------+
The code in head has to scan all of the taxonomy_index rows.
The patch generates much more efficient code using a UNION of specific inner joins, the replacement for the example above is:
explain SELECT tn.nid AS nid FROM taxonomy_index tn WHERE tn.tid = '353' UNION SELECT tn.nid AS nid FROM taxonomy_index tn INNER JOIN taxonomy_term__parent th ON tn.tid = th.entity_id INNER JOIN taxonomy_term__parent th1 ON th.parent_target_id = th1.entity_id WHERE th1.entity_id = '353' UNION SELECT tn.nid AS nid FROM taxonomy_index tn INNER JOIN taxonomy_term__parent th ON tn.tid = th.entity_id INNER JOIN taxonomy_term__parent th1 ON th.parent_target_id = th1.entity_id INNER JOIN taxonomy_term__parent th2 ON th1.parent_target_id = th2.entity_id WHERE th2.entity_id = '353';
+----+--------------+--------------+------------+------+--------------------------+------------------+---------+-----------------------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+--------------+------------+------+--------------------------+------------------+---------+-----------------------+------+----------+--------------------------+
| 1 | PRIMARY | tn | NULL | ref | PRIMARY,term_node,bundle | term_node | 4 | const | 102 | 100.00 | Using index |
| 2 | UNION | th1 | NULL | ref | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | Using index |
| 2 | UNION | th | NULL | ref | PRIMARY,parent_target_id | parent_target_id | 4 | const | 5 | 100.00 | Using where; Using index |
| 2 | UNION | tn | NULL | ref | term_node | term_node | 4 | instyle.th.entity_id | 17 | 100.00 | Using index |
| 3 | UNION | th2 | NULL | ref | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | Using index |
| 3 | UNION | th1 | NULL | ref | PRIMARY,parent_target_id | parent_target_id | 4 | const | 5 | 100.00 | Using where; Using index |
| 3 | UNION | th | NULL | ref | PRIMARY,parent_target_id | parent_target_id | 4 | instyle.th1.entity_id | 223 | 100.00 | Using index |
| 3 | UNION | tn | NULL | ref | term_node | term_node | 4 | instyle.th.entity_id | 17 | 100.00 | Using index |
| NULL | UNION RESULT | <union1,2,3> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+--------------+------------+------+--------------------------+------------------+---------+-----------------------+------+----------+--------------------------+
Far fewer rows are scanned and indexes are used.
if the outer query constructed by views is inefficient (as can easily be the case with any non-trivial LEFT-join), the query as a whole will still remain inefficient because the efficient subselect is used in a WHERE IN (...) which MySQL only uses as a filter on the outer query result.
This can be made much more efficient by using an INNER join. Here's an example for how the patch will change the SQL (reformatted for readability again)
SELECT DISTINCT node_field_data.created AS node_field_data_created, node_field_data.nid AS nid FROM node_field_data node_field_data LEFT JOIN taxonomy_index taxonomy_index ON node_field_data.nid = taxonomy_index.nid LEFT JOIN node__field_promote_states node__field_promote_states ON node_field_data.nid = node__field_promote_states.entity_id AND node__field_promote_states.field_promote_states_value = 'landing' WHERE (((node__field_promote_states.field_promote_states_value IS NULL)) AND (node_field_data.nid IN (SELECT tn.nid AS nid FROM taxonomy_index tn WHERE tn.tid = '353' UNION SELECT tn.nid AS nid FROM taxonomy_index tn INNER JOIN taxonomy_term__parent th ON tn.tid = th.entity_id INNER JOIN taxonomy_term__parent th1 ON th.parent_target_id = th1.entity_id WHERE th1.entity_id = '353'))) AND ((taxonomy_index.status = '1') AND (node_field_data.status = '1') AND (node_field_data.type IN ('article', 'recipe'))) ORDER BY node_field_data_created DESC LIMIT 6 OFFSET 5;
+-------------------------+--------+
| node_field_data_created | nid |
+-------------------------+--------+
| 1594378800 | 110686 |
| 1594360800 | 110446 |
| 1593946800 | 110321 |
| 1593342000 | 109946 |
| 1593241200 | 109896 |
| 1592823600 | 109591 |
+-------------------------+--------+
6 rows in set (0.08 sec)
SELECT DISTINCT node_field_data.created AS node_field_data_created, node_field_data.nid AS nid FROM node_field_data node_field_data LEFT JOIN taxonomy_index taxonomy_index ON node_field_data.nid = taxonomy_index.nid LEFT JOIN node__field_promote_states node__field_promote_states ON node_field_data.nid = node__field_promote_states.entity_id AND node__field_promote_states.field_promote_states_value = 'landing' INNER JOIN (SELECT tn.nid AS nid FROM taxonomy_index tn WHERE tn.tid = '353' UNION SELECT tn.nid AS nid FROM taxonomy_index tn INNER JOIN taxonomy_term__parent th ON tn.tid = th.entity_id INNER JOIN taxonomy_term__parent th1 ON th.parent_target_id = th1.entity_id WHERE th1.entity_id = '353') node_taxonomy_depth ON node_taxonomy_depth.nid = node_field_data.nid WHERE ((node__field_promote_states.field_promote_states_value IS NULL)) AND ((taxonomy_index.status = '1') AND (node_field_data.status = '1') AND (node_field_data.type IN ('article', 'recipe'))) ORDER BY node_field_data_created DESC LIMIT 6 OFFSET 5;
+-------------------------+--------+
| node_field_data_created | nid |
+-------------------------+--------+
| 1594378800 | 110686 |
| 1594360800 | 110446 |
| 1593946800 | 110321 |
| 1593342000 | 109946 |
| 1593241200 | 109896 |
| 1592823600 | 109591 |
+-------------------------+--------+
6 rows in set (0.01 sec)
EXPLAIN is not so useful for this one because the join via query makes for complicated reading - but you can clearly see the performance improvement when running the query.
Use MySQL profiling we can compare the original query versus the final query...
+----------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | 0.49241525 | SELECT DISTINCT node_field_data.created AS node_field_data_created, node_field_data.nid AS nid FROM node_field_data node_field_data LEFT JOIN taxonomy_index taxonomy_index ON node_field_data.nid = taxonomy_index.nid LEFT JOIN node__field_promote_states node__field_promote_states ON node_field_data.n |
| 2 | 0.00789575 | SELECT DISTINCT node_field_data.created AS node_field_data_created, node_field_data.nid AS nid
FROM
node_field_data node_field_data
LEFT JOIN taxonomy_index taxonomy_index ON node_field_data.nid = taxonomy_index.nid
LEFT JOIN node__field_promote_states node__field_promote_states ON node_field_data.n |
+----------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
This shows that original query took 0.49241525 seconds to execute and the new query takes 0.00789575 seconds. This is 98% decrease in the time take to run the query!
node_field_data contains 34299 rows
taxonomy_index contains 129814 rows
taxonomy_term__parent contains 6712 rows
node__field_promote_states contains 96159 rows
N/a
N/a
N/a
Views that use either the taxonomy depth filter or taxonomy depth argument are massively more performant.
Fixed
9.3
taxonomy.module
It affects performance. It is often combined with the Needs profiling tag.
Not all content is available!
It's likely this issue predates Contrib.social: some issue and comment data are missing.
#42
🐛
Fix IndexTidDepth views argument plugin and TaxonomyIndexTidDepth views filter plugin performance
Fixed
works for Drupal8.9,
And It brings a tenfold increase in speed: