Problem/Motivation
I have a view that lists nodes that have an optional reference field to another content type. After upgrading to D10 I found nodes with the reference field empty were missing from the result set (it worked as expected in D9). The intent of access control is to restrict access to sensitive content; it makes no sense to decline access to nothing!
The view base table is "Content" and it joins to another content type via a reference field (with "Require this relationship" unchecked).
I'm using Workflow Access module to limit access to the "request" type content via a Workflow field.
For users without "Bypass content access control" permission, NodeGrantDatabaseStorage() adds conditions to filter the secondary node_field_data table. That works fine so long as the referenced node is not null; if it is null the entire row is filtered out, despite it being a left join. (In D9 the secondary table was filtered in the join; in D10 it's in a where clause. - see the SQL below)
Steps to reproduce
Create two content types ("request" and "source") and add an optional entity referenced field "field_foo" to request referencing source content.
Add a node access control (I'm using workflow_access) but ensure rules allow access to all content.
Add a node of type "source".
Add a node of type "request" with field_foo referencing the "source" node.
Add another node of type "request" leaving the reference field empty.
Run the following query as a user with "Bypass content access control" permission to see both nodes in the resultset.
Run again with a user without that permission to see 1 result (only the one with the reference field populated).
$query = $database->select('node_field_data', 'n', []);
$query->leftJoin('node__field_foo', 'foo', 'n.nid = foo.entity_id AND deleted=0');
$query->leftJoin('node_field_data', 'n2', 'n2.nid = foo.field_foo_target_id');
$query->condition('n.type', 'request')
->condition('n.nid', [224214, 224635], 'IN')
->fields('n', ['nid', 'title'])
->fields('n2', ['nid', 'title'])
->addTag('node_access');;
$result = $query->execute();
In both cases the query should return two results because there is no access restriction to a null result on the join.
Compare D9 to D10
Below is the SQL generated by D9 and D10 for the above query under node access control.
-- D9 version returns 2 records
SELECT "n"."nid" AS "nid", "n"."title" AS "title", "n2"."nid" AS "n2_nid", "n2"."title" AS "n2_title"
FROM
"node_field_data" "n"
LEFT OUTER JOIN "node__field_foo" "foo" ON n.nid = foo.entity_id AND deleted=0
LEFT OUTER JOIN "node_field_data" "n2" ON (EXISTS (SELECT "na"."nid" AS "nid"
FROM
"node_access" "na"
WHERE ((("gid" IN ('0')) AND ("realm" = 'all')) OR (("gid" IN ('91', '93', '96', '100', '104')) AND ("realm" = 'workflow_access')) OR (("gid" IN ('739')) AND ("realm" = 'workflow_access_owner'))) AND ("na"."grant_view" >= '1') AND ("na"."fallback" = '1') AND (("n2"."nid" = "na"."nid")))) AND ((n2.nid = foo.field_foo_target_id))
WHERE ("n"."type" = 'request') AND ("n"."nid" IN ('224214', '224635')) AND (EXISTS (SELECT "na"."nid" AS "nid"
FROM
"node_access" "na"
WHERE ((("gid" IN ('0')) AND ("realm" = 'all')) OR (("gid" IN ('91', '93', '96', '100', '104')) AND ("realm" = 'workflow_access')) OR (("gid" IN ('739')) AND ("realm" = 'workflow_access_owner'))) AND ("na"."grant_view" >= '1') AND ("na"."fallback" = '1') AND (("n"."nid" = "na"."nid"))))
-- D10 version returns 1 record
SELECT "n"."nid" AS "nid", "n"."title" AS "title", "n2"."nid" AS "n2_nid", "n2"."title" AS "n2_title"
FROM
"node_field_data" "n"
LEFT OUTER JOIN "node__field_foo" "foo" ON n.nid = foo.entity_id AND deleted=0
LEFT OUTER JOIN "node_field_data" "n2" ON n2.nid = foo.field_foo_target_id
WHERE ("n"."type" = 'request') AND ("n"."nid" IN ('224214', '224635')) AND (EXISTS (SELECT "na"."nid" AS "nid"
FROM
"node_access" "na"
WHERE ((("gid" IN ('0')) AND ("realm" = 'all')) OR (("gid" IN ('91', '93', '96', '100')) AND ("realm" = 'workflow_access')) OR (("gid" IN ('739')) AND ("realm" = 'workflow_access_owner'))) AND ("na"."grant_view" >= '1') AND ("na"."fallback" = '1') AND (("n"."nid" = "na"."nid")))) AND (EXISTS (SELECT "na"."nid" AS "nid"
FROM
"node_access" "na"
WHERE ((("gid" IN ('0')) AND ("realm" = 'all')) OR (("gid" IN ('91', '93', '96', '100')) AND ("realm" = 'workflow_access')) OR (("gid" IN ('739')) AND ("realm" = 'workflow_access_owner'))) AND ("na"."grant_view" >= '1') AND ("na"."fallback" = '1') AND (("n2"."nid" = "na"."nid"))))
Interestingly the D9 version returns both records when run against a D9 OR a D10 database. The D10 version incorrectly returns just one record.
Proposed resolution
This seems to be a regression in NodeGrantDatabaseStorage class so I assume the fix will be there.
Sites can use the diff from
π
node_access filters out accessible nodes when node is left joined
Needs work
which applies cleanly to 10.2.x