1. Create and publish a content entity.
2. Add a translation for the node and publish it also
3. Using adminer or mysql CLI examing the node_field_data for the node.
select * from node_field_data where nid=17621
For example,
nid vid type langcode title uid status created changed promote sticky revision_translation_affected default_langcode moderation_state content_translation_source content_translation_outdated rh_action rh_redirect rh_redirect_response published_at publish_on unpublish_on
17621 1177975 article de Cloud, AI and integration for today’s digital economy 1 1 1514324112 1514324319 0 0 1 0 published en-us 0 bundle_default NULL 301 1492179338 NULL NULL
17621 1177975 article en-us Cloud, AI and integration for today’s digital economy 0 1 1492179338 1514324128 0 0 NULL 1 published und 0 bundle_default NULL 301 1492179338 NULL NULL
This looks good and is what is expected: A published version of the source language and a published version of the translation.
4. Add a new optional text field to the content type and mark it as "users can translate this field"
5. Edit the published translation, fill in a value for the field and publish the translation
The expectation is that now there is a published source language document without a value for the new field and a published translated language document with a value for the new field. Examining the tables manually, we can confirm this is the case.
SELECT * FROM `node__field_article_new` LIMIT 50
Modify bundle deleted entity_id revision_id langcode delta field_article_new_value
edit article 0 17621 1177976 de 0 GERMAN NEW
So far so good.
Now, we want to write an entity query that will return to us all entities (including translations) that do *NOT* have a value for field_article_new
. So we try something like this.
use Drupal\Core\Database\Database;
$query = \Drupal::entityQuery('node')
->condition('type', 'article')
->notExists('field_article_new')
->execute();
echo "REVID ENTITYID\n";
foreach ($query as $revid => $nid) {
echo "$revid $nid\n";
}
echo count($query) . " results found\n";
We would expect that the published source language doc will match since it does not have a value yet for this new field and that the published translated language doc will not match since it now has a value for this field.
The published source language doc is not returned in the query results.
Examining the query that entityQuery produces behind the scenes for us shows the following for the above entityQuery:
SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node} base_table
INNER JOIN {node_field_data} node_field_data ON node_field_data.nid = base_table.nid
LEFT JOIN {node__field_article_new} node__field_article_new ON node__field_article_new.entity_id = base_table.nid
WHERE (node_field_data.type = :db_condition_placeholder_0) AND (node__field_article_new.field_article_new_value IS NULL)
With respect to the entity under test, simplifying the query for informational/demonstration purposes (select * instead of just nid/vid and hardcode our article bundle), and pop it in adminer, we notice that our node 17621 is not in the results at all.
SELECT *
FROM
node base_table
INNER JOIN node_field_data node_field_data ON node_field_data.nid = base_table.nid
LEFT JOIN node__field_article_new node__field_article_new ON node__field_article_new.entity_id = base_table.nid
WHERE (node_field_data.type = 'article') AND (node__field_article_new.field_article_new_value IS NULL)
Now if we take off the IS NULL condition and instead add in node_field_data.nid = 17621
so that we are just looking at our node of interest, we see the following.
nid vid type uuid langcode nid vid type langcode title uid status created changed promote sticky revision_translation_affected default_langcode moderation_state content_translation_source content_translation_outdated rh_action rh_redirect rh_redirect_response published_at publish_on unpublish_on bundle deleted entity_id revision_id langcode delta field_article_new_value
17621 1177976 article 5972bfd5-4210-4d5a-b527-13c78017fcea en-us 17621 1177976 article de Cloud, AI and integration for today’s digital economy 1 1 1514324112 1514324319 0 0 1 0 published en-us 0 bundle_default NULL 301 1492179338 NULL NULL article 0 17621 1177976 de 0 GERMAN NEW
17621 1177976 article 5972bfd5-4210-4d5a-b527-13c78017fcea en-us 17621 1177976 article en-us Cloud, AI and integration for today’s digital economy 0 1 1492179338 1514324128 0 0 NULL 1 published und 0 bundle_default NULL 301 1492179338 NULL NULL article 0 17621 1177976 de 0 GERMAN NEW
Now the problem is evident. Because of the LEFT JOIN condition in the query, it *looks* like the published EN article has a value for field_article_new_value and so it is not returned.
IMO, the problem here is the join condition does not also join on langcode
. If we modify the entityQuery sql by adding the langcode in the LEFT JOIN, it fixes the problem:
SELECT *
FROM
node base_table
INNER JOIN node_field_data node_field_data ON node_field_data.nid = base_table.nid
LEFT JOIN node__field_article_new node__field_article_new ON node__field_article_new.entity_id = base_table.nid AND node__field_article_new.langcode = node_field_data.langcode
WHERE (node_field_data.type = 'article') AND (node__field_article_new.field_article_new_value IS NULL)
Now, the published EN row correctly shows NULL as its value for field_article_new_value. If we then go back to the original query entityQuery() uses and just add the join condition, our node 17621 shows up in the results.
Of course even if entityQuery did this and the entity/revision was returned, we still would not know which translation(s) for that revision matched and which did not since langcode is not reflected in the result has. Not sure how that would be indicated. Maybe by making the results keyed by vid and then langcode and then nid - although this would be a breaking change.
Perhaps entityQuery is not the right tool to use for this job? The goal being "Give me all published 'foo' entities (including translations) that do not yet have a value for the new field 'foo_new'. Is there a different way to go about this? We try not to write sql queries directly against the tables, favoring the API instead, but perhaps in this case it is our only route?
Not all content is available!
It's likely this issue predates Contrib.social: some issue and comment data are missing.