Problem/Motivation
Quite a long time ago, before Drupal 8.0, we made the decision to make Views more intelligent about which entity table to select from by default. Before, we always started of the base table and then almost always joined the data table if it exists, because that's where all the data is (except the UUID).
Unfortunately, we never updated entity query accordingly and it is actually far worse there:
* It always picks the base table first, that was the same as views
* It always adds a join as soon as you start to add conditions. *even* if that happens to be the UUID, then it just joins itself or an entity type without a data table.
* It always joins other entity types through the base table and then adds yet another join even if they have no data table. Because reasons.
Proposed resolution
This is an attempt at making things more sane and possibly considerably more performant.
Just like views, it starts of at the data table if possible, I'm also avoiding the duplicated join by initializing the entity tables (really, joins) mapping table. And I'm trying to improve actual joining as well by skipping the initial join as well.
I've only tested it with a simple example queries, didn't even run the tests yet, I just wanted to start this before I forget about this again (Had the idea of doing this probably a few times already).
My test script:
echo "\nQuery 1:\n";
$ids = \Drupal::entityQuery('node')
->execute();
print_r($ids);
echo "\nQuery 2:\n";
$ids = \Drupal::entityQuery('node')
->condition('title', 'First node')
->execute();
print_r($ids);
echo "\nQuery 3:\n";
$ids = \Drupal::entityQuery('node')
->condition('uuid', '20478baa-64e4-4b01-bf68-5ea34e3db78b')
->execute();
print_r($ids);
echo "\nQuery 4:\n";
$ids = \Drupal::entityQuery('node')
->condition('uid.entity.uid', 1)
->execute();
print_r($ids);
(conditions only work on my specific nodes of course). I also added a 'echo $this->sqlQuery . "\n";' so I can see the generated SQL query.
HEAD:
Query 1:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node} base_table
Array
(
[1] => 1
[2] => 2
[3] => 3
[4] => 4
)
Query 2:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node} base_table
INNER JOIN {node_field_data} node_field_data ON node_field_data.nid = base_table.nid
WHERE node_field_data.title LIKE :db_condition_placeholder_0 ESCAPE '\\'
Array
(
[3] => 3
)
Query 3:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node} base_table
INNER JOIN {node} node ON node.nid = base_table.nid
WHERE node.uuid LIKE :db_condition_placeholder_0 ESCAPE '\\'
Array
(
[1] => 1
)
Query 4:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node} base_table
INNER JOIN {node_field_data} node_field_data ON node_field_data.nid = base_table.nid
LEFT OUTER JOIN {users} users ON users.uid = node_field_data.uid
INNER JOIN {users_field_data} users_field_data ON users_field_data.uid = users.uid
WHERE users_field_data.uid = :db_condition_placeholder_0
Array
(
[4] => 4
)
With my patch.
Query 1:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node_field_data} base_table
Array
(
[1] => 1
[2] => 2
[3] => 3
[4] => 4
)
Query 2:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node_field_data} base_table
WHERE base_table.title LIKE :db_condition_placeholder_0 ESCAPE '\\'
Array
(
[3] => 3
)
Query 3:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node_field_data} base_table
INNER JOIN {node} node ON node.nid = base_table.nid
WHERE node.uuid LIKE :db_condition_placeholder_0 ESCAPE '\\'
Array
(
[1] => 1
)
Query 4:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node_field_data} base_table
INNER JOIN {users_field_data} users_field_data ON users_field_data.uid = base_table.uid
WHERE users_field_data.uid = :db_condition_placeholder_0
Array
(
[4] => 4
)
Comparison:
1. Very similar except that my select is on the node_field_data table.
2. No join necessary with my version.
3. This actually requires a join in the new implementation, and this query is currently unfortunately very common (loadByUuid()). But the current implementation joins itself which is likely not much better?
4. A single join instead of *3*
Remaining tasks
* Figure out things that dont' work yet, pretty sure that translations will be one such case as I'm possibly skipping the language condition now in at least some cases on the base table.
* Is this an API change or not?
* Answer #39
* Answer #41
User interface changes
API changes
Generated queries change, which might break query alters?
Data model changes