🇧🇪Belgium @kristiaanvandeneynde

Antwerp, Belgium
Account created on 31 May 2011, over 14 years ago
#

Merge Requests

More

Recent comments

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Looks good to me.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Code looks good to me and so does the outcome. I do see some test failures, but not sure what those are about. Will RTBC if someone can explain why the overall tests claim everything is fine yet the right-most pipeline is full of red.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay, good to know. Given the timings on my local it seems like the join is not suffering from it. Is there anything else I can post to confirm this?

Here's an explain query:

 EXPLAIN  SELECT nfd.vid, nfd.nid FROM node_field_data nfd INNER JOIN ( SELECT nid, MAX(vid) as maximum FROM node_revision GROUP BY nid) nr ON nfd.nid = nr.nid AND nfd.vid=nr.maximum;
+----+-------------+---------------+------------+-------+--------------------------------------------------------+-----------+---------+-------------------+-------+----------+--------------------------+
| id | select_type | table         | partitions | type  | possible_keys                                          | key       | key_len | ref               | rows  | filtered | Extra                    |
+----+-------------+---------------+------------+-------+--------------------------------------------------------+-----------+---------+-------------------+-------+----------+--------------------------+
|  1 | PRIMARY     | <derived2>    | NULL       | ALL   | NULL                                                   | NULL      | NULL    | NULL              | 23016 |   100.00 | Using where              |
|  1 | PRIMARY     | nfd           | NULL       | ref   | PRIMARY,node__id__default_langcode__langcode,node__vid | node__vid | 8       | nr.maximum,nr.nid |     1 |   100.00 | Using index              |
|  2 | DERIVED     | node_revision | NULL       | range | node__nid                                              | node__nid | 4       | NULL              | 23016 |   100.00 | Using index for group-by |
+----+-------------+---------------+------------+-------+--------------------------------------------------------+-----------+---------+-------------------+-------+----------+--------------------------+
🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Re #10 about as fast as what I suggested. So HEAD is definitely slower than the MR or my suggestion here right now.

But the MR here drops the comparison between a column from the subquery and a column from the outer query so it should not run into the correlated subquery thing at all. You may just as well run the following query:

SELECT nr.vid, nr.nid FROM node_revision nr
INNER JOIN node_field_data nfd ON nfd.nid = nr.nid
WHERE nr.vid IN (SELECT MAX(nr_inner.vid) AS vid FROM node_revision nr_inner GROUP BY nr_inner.nid);

Keep in mind you can't use the nr alias inside the subquery or it might trigger the correlated subqueries after all.

So I now have the following on my DB:
This MR, with adjusted table aliases.
SELECT nr.vid, nr.nid FROM node_revision nr INNER JOIN node_field_data nfd ON nfd.nid = nr.nid WHERE (nr.vid IN (SELECT sq.* FROM (SELECT MAX(nr2.vid) AS "expression" FROM node_revision nr2 GROUP BY nr2.nid) sq));
Average of 0.09s, sometimes going lower to 0.06s

This MR, dropping the extra subquery:

SELECT nr.vid, nr.nid FROM node_revision nr INNER JOIN node_field_data nfd ON nfd.nid = nr.nid WHERE nr.vid IN (SELECT MAX(nr_inner.vid) AS vid FROM node_revision nr_inner GROUP BY nr_inner.nid);

Average of 0.09s, sometimes going higher to 0.13s. So maybe some DB engine trickery is still going on, even though it shouldn't.

My suggestion, using a regular JOIN:
SELECT nfd.vid, nfd.nid FROM node_field_data nfd INNER JOIN ( SELECT nid, MAX(vid) as maximum FROM node_revision GROUP BY nid) nr ON nfd.nid = nr.nid AND nfd.vid=nr.maximum;
Average of 0.04s, never going higher than 0.07s.

So we'd be using a simple join rather than subqueries, which should already be faster, and we would not have to worry about any DB specifics regarding how these subqueries are evaluated.

Side observation:
Also, my suggestion only joins those rows on the highest vid, where the 2 other queries seem to still select all nodes because this part is always true: WHERE (nr.vid IN (SELECT sq.*. You'd then have to select the row with the highest vid yourself (nr.vid > nfd.vid).

If you change that to nfd.vid, the result set shrinks to what my INNER JOIN returns. Not sure if that's the desired outcome, changing my code to a LEFT join gives the same results as HEAD and the current MR.

We may want to look into what we exactly want from the query here. As that might also be a point of optimization (shrinking the result set).

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Tried it a few more times, HEAD is always ~1 second and my version is always between 0.02 and 0.07 seconds. Tried my version with both INNER and LEFT join for good measure.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

1.01 seconds (HEAD) vs 0.02 seconds (my suggestion)

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

On a DB with 266k entries in node_revision and 24k rows in node_field_data, the above suggestion ran in 0.02 seconds and gave me the correct max vid for every row in node_field_data.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

So far we had:

  1. A join that blew things up when entities had many revisions
  2. A subquery with GROUP BY that blew things up when a DB had many entities
  3. A subquery without GROUP BY that fixes the above, but is at risk of running correlated subqueries

Reading up on correlated subqueries in a very recent article, they advocate using joins to avoid performance issues. That would bring us full circle.

I'm not sure we want to use a workaround to avoid a deliberate feature of SQL. I'd rather ask myself if we're even supposed to use said deliberate feature here.

Circling back to the query from the MR, after adjusting some names for readability, we get this:

SELECT nr.vid, nr.nid FROM node_revision nr
INNER JOIN node_field_data nfd
ON nfd.nid = nr.nid
WHERE
  (nr.vid IN (
    SELECT "sq".* FROM (SELECT MAX(nr.vid) AS "expression" FROM node_revision nr GROUP BY nr.nid) "sq")
  )
  AND nfd.nid = "1"

Some technical observations:

  1. I don't think this would trigger a correlated subquery because there is no link to the outer table. For that to happen, we would need to add WHERE nr.nid = nfd.nid to the subquery.
  2. We are effectively using a query that would get the max vid for every node, by using a subquery, even though all of this information is in one of the tables we're already using (node_revision). So if we could eliminate "node_field_data" in this case (not sure because entity query), then the query would become really simple: SELECT nr.nid, MAX(nr.vid) AS vid FROM node_revision nr GROUP BY nr.nid WHERE nr.nid = 1
  3. Now I think the WHERE clause on the nid might be a red herring because not all entity queries might have that condition. So in that case the above simplified query would not work because we might not have all the condition fields on the node_revision table
  4. If we stick with the subquery, and add the where clause to it, we need to realize that it is optimized by the DB engine because it's usually used to generate reports (as shown in the article: Employees earning more than the average salary in their department). Wrapping it in another query to prevent the DB engine from optimizing it seems like we're kicking the can further down the road and we will get bitten by DB engine updates again at a future time.

All of this just screams to use a join to me, like we originally did. But we should optimize it so the thing we're joining is far smaller than it was before. So something like:

SELECT nfd.nid, nfd.whatever_you_want FROM node_field_data nfd
INNER JOIN
( SELECT nid, MAX(vid) as maximum FROM node_revision GROUP BY nid) nr
ON nfd.nid = nr.nid AND nfd.vid=nr.maximum
🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay so reading up on Slack and the issues over the weekend, I want to get the most important thing out of the way first: I am glad people are looking after my software in my absence. Let that be the key takeaway here, regardless of what I write next. So thanks, everyone!

With that said, the only thing I was a bit surprised about was the speed and urgency this was dealt with. It's my understanding that dev releases are allowed to break and can never be critical unless someone spotted a supply chain attack or something that needs to be addressed ASAP. So seeing this issue marked as Critical and resolved before I could have a chance at reviewing it did make me raise an eyebrow. The "Seems better to ask for forgiveness than permission" mantra has a time and place for it, but I don't think the building was that much on fire here.

Still, it's my mistake to begin with for adding the OOP hook backport to previous versions without realizing that those were no longer testing on D10, even though that version is still supported. I'm sure there's an excuse to make about not having as much contrib time as I used to, but at the end of the day it's still my responsibility to make sure all checks have passed.

Nice catch finding the "group_modules_installed" mistake, by the way. I also appreciate that you went and "put out the fire" but left the phpstan stuff for another issue as there is indeed less pressure to fix those warnings.

So all in all, I'm more happy than annoyed about what happened. For the first time in over a decade I've got the feeling that I'm not alone in looking after Group. So that heavily outweighs the fact that I feel the urgency was a bit exaggerated here :)

Either way, thanks!

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

You're right, I missed that.

This is largely due to the fact that I consider 3.3.x and 2.3.x the final branches for those majors, but I should have given it more thought. The main reason I allowed this (sponsored) feature request to go in is that I've been delayed in releasing v4. Otherwise I would have directed people to use that version if they wanted OOP hook support.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Look, I'm fine with keeping the name as is. I've raised my objections and so be it.

But you absolutely can add new named parameters to existing attributes, I've added my own https://3v4l.org/egZQ2 to prove this. In a world where people were already using the attribute with "name", you absolutely can add an optional "type" without any BC issues later on.

In what world would someone download Drupal 1x.x, see that there's now a "type" named argument, use said argument and not bump the minimum required version for their module to whatever Drupal version they just saw the new named argument in? That does not make sense.

And any old code using only the "name" named argument would still work on said newer version of Drupal.

I really don't get where this insistence comes from that we cannot add named arguments to existing attributes. You absolutely can, provided you make them optional.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

MR keeps old behavior as fallback, but fetches correct prefix from plugin definitions when possible.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde created an issue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I was talking about type it is a parameter we don't have, and we're speculatively adding it here.

I'm not sure we should add something just in case we need it since it won't do anything.

That is not what I am saying? Nowhere did I write we should already add "type" now. I even said there's a good chance we will never add it at all.

We can always add HookDependsOnTheme. Or HookDependsOnCallback, or HookDependsOnFullMoon or whatever ... it doesn't need to be just one.

I also considered that, but that just feels wrong? You'd have multiple attributes with the same meaning aside from a small implementation detail.

We don't go and add EntityType, EntityTypeWithOwner, EntityTypeWithOwnerAndPublished, EntityTypeWithOwnerAndPublishedAndRevision, etc. to core either. In that example we do have two attributes (ConfigEntityType and ContentEntityType), but they actually are wildly different behavior and somewhat different attribute parameters.

Either way, I've said my peace. The way core is evolving nowadays, I don't think it's unimaginable that one day something other than a module can call a hook. I'd hate to be stuck with an attribute that is named too narrowly when that day comes. I've learned the hard way that naming things too specifically because you never thought you'd support more than what is currently the case can really bite you in the ass.

MR looks fine otherwise, my only concern is the name of the attribute.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

But I'm not saying we should add any new params? And what I'm suggesting won't cause any fatals.

All I'm saying is let's not marry ourselves to the scope of modules in the name of the new attribute. If we don't widen the scope, so be it, if we do then we don't have to rename the attribute.

Essentially: #[HookWithDependency(name: 'some_module')]

If we never widen the scope then that's all it will ever be. If we do widen the scope, then we can have:

  • #[HookWithDependency(name: 'some_module')]
  • #[HookWithDependency(name: 'some_theme', type: 'theme')]
  • #[HookWithDependency(name: 'some_module', type: module')]

The last one not being necessary because the default value of "type" would be "module". Again, if we ever even get to that.

But at least we won't have: #[HookDependsOnModule(name: 'some_theme', type: 'theme')] then.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Re #50, I saw #23 and https://3v4l.org/31a3W.

What I had in mind was this: https://3v4l.org/IZSKF

We start with "name" and add "type" later if we ever get to that. No fatal error, even if people are already using the attribute.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I love this.

The fact that we could already define and alter the container from our tests was great, but we were still restricted by having to write test modules for hook implementations and then "talk to" those modules using state. This issue means we can encapsulate even more test-specific logic in the test file itself rather than having to figure it out when trying to read someone else's (or even your own) test.

The fewer files I need open to write, analyze or update a test, the better. Will make MRs also far easier to review. Huge +1

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I feel like we might be shooting ourselves in the foot here with the naming. If we are deliberately going forward with the HookDependsOnModule name, we are ruling out that anything else may specify a hook in the future, e.g. a theme.

I'd rather see a more agnostic name such as HookWithDependency, where we keep the param called "name". If at any point in the future we decide to support hooks anywhere other than modules, we can keep the attributes as they are, perhaps introducing a second "type" parameter which defaults to "module" for BC reasons, and all we'd need to do is change the collector pass to check for more than just "$this->modules".

Either way, just my 2c. I've learned the hard way that naming things too specifically tends to bite you in the ass at some point.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Looks good to me, a bit annoyed that we're now stuck with cache.backend.memory.memory for all eternity because at one point we used to have "cache.backend.memory" and got rid of it.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Hmm, in this case the fix is simple and you only needed to move an array initialization into an else-block to avoid tripping up the initial isset() check. But not every system will be so easily adjusted to avoid fibers-induced race conditions like these.

However, by moving the array initialization, doesn't that solve the issue we had in the first place? I.e.: Do we still need to suspend the fiber then? I don't think the issue persists if you no longer initialize $this->taskData[$route_name] outside of the route attribute check. Right?

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Added to menu, thanks.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Thanks for the explanation, I'll have to take a closer look then.

Will you be in Vienna next week? If helpful, we could also discuss this there.

No, sadly not. I fully intended on going but then I had a career change and I wasn't sure how well everything would be going by the time Vienna came around, so I took the safer option and decided to sit this one out.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

The issue I have with node grants, and why Group does not support it, is that it only works for nodes. Instead, Group leverages the "entity_query" query tag to add access for all entity types to queries on your site.

@drunken monkey, is it then safe to assume that search_api does not leverage query access for entity types other than nodes?

I had a brief look at the "Content access" plugin and it seems to add the grants to the index? In that case, facet counts won't work with Group or any other entity type leveraging query access as the node entity type is the only one using a grants system.

One way around this that I can think of is to grab the resulting IDs from the index, run an entity query with access checks turned on and then use the intersection to show results and counts. Not sure how easy that would be to pull off, though.

@korinna.mucsi Nothing has been agreed upon yet. We're still in the process of discussing things.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Different approach, actually check for NULL. Will open an MR.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Adjusted the MR to pass phpcs and phpstan. Can confirm this seems to work on my end. The tests are failing, but that was to be expected as the asserts have not been adjusted to the new code.

Also, the fact that it's a unit test complicates things now that we're using the query condition group. So either we mock that whole thing or witch to a kernel test.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde made their first commit to this issue’s fork.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde created an issue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Thanks, committed. Also backported to 2.x, should have taken care of all the group_content machine names while doing so.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I'm going to close this as outdated. The test in question is almost entirely empty nowadays and, as you mentioned, runs fine right now.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

That just moves the problem to Opensearch itself :D

https://docs.opensearch.org/latest/vector-search/ai-search/hybrid-search...

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Also present in latest version

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde changed the visibility of the branch 3548663-queryparambuilder-setting-a to hidden.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde created an issue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Added a cherry-pick branch, didn't give any conflicts on the files that matter so that would make me assume we did not change any code between 3.3.x and 4.0.x in hooks. Still needs to be verified, though.

Also might need to bump minimum D10 version so the LegacyHook attribute is supported.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I think there might be an issue with this approach, since queries often happen during the render array build and before the actual render.

Sorry, you're right. I was living in a perfect world where (almost) everything is a placeholder or pre_render callback. Something I'd like to see one day :)

But that could still work, though. If we know we're building something renderable, we could call the service in advance with the cache keys that we are setting on the very thing we're building. And then everything I said in #21 could still happen. I.e.: We'd be able to know exactly which metadata came from which render array and attach it back when going through the render process.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde created an issue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Also just realized v3 still supports D10, which had no OOP hooks. So that might complicate things a lot.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I like #19, but we indeed can't wait until the very top level to add that cacheable metadata back in like @catch said in #20.

What if we add a service like in #19, but call and flush it whenever we're about to attempt to write to a cache in Renderer? This means that we would be able to add the metadata to the very render element that triggered the query in the first place. It could then bubble up or not just like it would for any normal render scenario.

E.g.: We have a page with a region with a block with a view with some nodes. The view triggers a query, so when we start rendering that the service knows the current cache key is the view's and when Renderer gets to trying to store the view with said cache key, our service returns metadata which we add to the render array.

It then gets written to the cache (or not) and bubbles up (or not) based on normal Renderer flow. Only thing to look out for is that we do not poison this system by adding cacheable metadata to db cache lookup queries as that could get really ugly.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Rather than trying to convert them all over again, could we not take 📌 Convert hooks to OOP hooks with attributes Active and backport that? The only thing to look out for then is features that were new in v4; we definitely should not introduce those in older versions.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

The maximum number of seconds allowed to run indexing per index. Batches that were started before the limit was reached will run to completion and may therefore slightly exceed the limit. Defaults to -1 (no limit).

Is this better? It seems more clear about the potential impact.

Either way, feel free to commit your or my version without further feedback. I consider this issue RTBC. I'm not a fan of delaying things over nitpicking how we should phrase a comment :)

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay, so I can definitely see the problem space. Search API indexes things and if you build a view that shows indexed fields, you're not going to run an entity query on e.g. nodes, thus not allowing Group to kick in.

I just don't think the current approach is something I'd want to maintain. I'm trying to get rid of node grants in core, so why would I add a similar system to my own module? :)

IIRC, search API runs a query once it has its results. I'd love to explore whether we can add onto that query somehow so that the tag that allows group query access to work can be added. But then you could argue whether search_api might be the better place for this to live in, as all access modules could benefit from an 'entity_query' tag or something being added to said search_api query.

Could you reach out @drunkenmonkey and ask for advice? I feel like we need to explore our options here before we can commit to an approach. It's good that you reached out to me so that no more effort went into the current approach before we exhausted all other options.

Thanks!

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Used AI to assist with the tedious bits. It got a few things wrong, which I manually corrected.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde created an issue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

6 minutes 37 seconds vs 17 minutes 59 seconds. I guess that's faster :)

Fixing a line ending and committing.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde created an issue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Looks good to me.

Only nitpick I have is that the time limit is allotted for every index if you specify no index. You might want to document that as such. If you have 6 indexes and specify a time limit of 10 seconds, it will end up running for 60 seconds unless you manually specify a single index.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

We're not using it with a database backend, if that helps. We're using it with www.drupal.org/project/search_api_opensearch

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

The MemoryCacheInterface is certainly weird and wrong. I think we should require that caches are explicitly specified with an attribute when using autowire. Changing it seems harder than just deprecating it. Lets do that and see if that causes any issues with autowired injections?

+1, we should probably not alias cache interfaces for autowiring as too many things can implement the same interface.

You said not a "real" cache, I think that's not a very meaningful description.

I agree, yet it's the distinction we had to make when we made MemoryCache more widely available and why they are tagged as cache.bin.memory rather than cache.bin. For all I care and for how we use them, they are caches and they replace static property caching perfectly. It's just that they don't fully behave as the other caches in core and, for that reason alone, we have to keep the distinction clear to avoid confusion and mistakes when people try to use them.

Looking at core, the only use seems to be PermissionsHashGenerator and we can just switch that over and get a tiny improvement out of that. Therefore, I'm considering to deprecate cache.static here.

Tentative +1, could be a pain in the ass for tests.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

MR looks good to me, although we currently alias

Drupal\Core\Cache\MemoryCache\MemoryCacheInterface: '@entity.memory_cache'

for autowiring, but people might expect that to point to the new cache.memory from the MR.

Also @catch had a point with:

We just need a convention/pattern for cache IDs.

So far, in core, we have:

  1. cache.jsonapi_memory
  2. entity.memory_cache

I know Group has cache.group_memberships_memory

So there's no clear naming pattern, but we should ideally have one to indicate the different between regular caches and memory caches. Keep in mind that memory caches aren't real caches because they return the same object over and over again rather than new ones on every get.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

If someone has an access policy with the user.permissions cache context, then the whole thing would explode.


/**
 * Defines the access policy interface.
 *
 * ...
 *
 * Do NOT use any cache context that relies on calculated permissions in any of
 * the calculations as you will end up in an infinite loop. E.g.: The cache
 * context "user.permissions" relies on your calculated Drupal permissions.
 */
interface AccessPolicyInterface {

:)

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Whoops, forgot to refresh before posting, but keeping at NW for the test fails and my suggestion.

I love the language thing in the MR. That's definitely something we could keep. Perhaps we could add a method on the base class so that all access policies can benefit from this? Any access policy that adds a dependency might get bitten by this for no reason.

As far as the user cache tag goes, I suppose that's a clever way to optimize things. But there are many a thing related to user entities that could affect access policy outcomes.

E.g.: An external access provider is given your user's e-mail address to determine access. Throughout a request this value changes, but your user was never updated, so the cache tag was not invalidated. Now what?

So on the one hand I also love that change and the performance improvement that comes with it, on the other hand I feel like we need to consider that one a bit more carefully than what you did with the language cache context.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Similarly, this isn't about old and new VariationCache implementation. It's about now actually using it for the static cache where previously we had nothing like that. VariationCache performance was never great, but it's less of an issue when it's used for a persistent cache that's much slower and less frequently used. 150 calls and 15% of total execution time is massive and imho a major performance issue (with the caveat that it's maybe only 10% or even 5% without blackfire, very hard to tell, but it's definitely something)

This is true, we never really had the chance to statically cache variations and now we do. Technically speaking it's the only safe way to statically cache something that came from a variation cache, though, as we discussed on Slack.

We discussed this quite a bit in https://drupal.slack.com/archives/C1BMUQ9U6/p1756840947127769, and the conclusion is essentially that using a simple static cache here is a possible source for access issues if the policies can vary for the same using during the same request in a way that is not covered by cache tags (e.g. user or role save). Given that this is a possibility, we shouldn't have non-variation static caching on top of a variation static cache. Of course just removing that would make this issue even worse (and add 50% more calls to processAccessPolicies(), so we can realistically only do that if we also come up with a massive performance improvement for that.

Great summary and fully agree.

Improve the performance of VariationCache in general, specifically the cache id calculation. Not sure how.

I thought about this too, one of the key bottlenecks is that we have to go to cache contexts all the time while following a redirect chain. Cache contexts are supposed to be fast, but that is not always the case. Now we can't outright cache cache context values for the entire request, but perhaps we could while following a redirect chain?

I.e.: If we are following a chain that early on uses cache context "foo" and said context is quite expensive, perhaps we could somehow cache the value of "foo" throughout the creation of a redirect chain and make the CacheContextManager aware of that value?

This seems entirely safe to do, as one could argue a cache context changing value while you're following a redirect chain would be really bad as the value to be cached is no longer represented by its eventual cache ID. E.g.: The UserRolesAccessPolicy calculating one thing and before said calculation gets cached, the roles have changed. That would mean someone deliberately saved roles from within another access policy, which is really bad and should not happen.

So, again, it would be safe to say that, as you're building a redirect chain, you should keep using the same cache context values until the end of the chain is reached.

I feel like this could be really promising to explore.

Figure out a rule on when we can skip the variation cache in favor of a simpler per-user static cache. Essentially something like \Drupal\Core\Session\AccessPolicyProcessor::needsPersistentCache() but for variation caching. We'd ask access policies whether they are free from outside/dynamic influences. Given the same user, it will return the same user. SuperUserPolicy is very obvious, UserRolesAccessPolicy as long as the user or roles don't change, which is mostly covered by cache tags.

Worth investigating, but might run into issues where it's hard to predict outside influences.

But by default (with just core), we don't even do persistent caching anymore. For the static cache, I think that's the wrong goal. Multiple users in the same request is an edge case, we shouldn't optimize for the edge case.

So if we add the uid to the cache keys, at least for static cache, then we can drop all user.xy cache contexts. Only problem I can think of right now is that you could alter roles on a user entity without saving that user (saving we could handle with user:ID cache tag), but I don't think that's something we need to support?

Not entirely sure I can follow here. I agree with the things you say we shouldn't support, but I don't fully see how this would be a solution.

That still leaves us with cache redirects. Because the other reason that this is so slow is having to do all of this multiple times to resolve redirects. On multilingual sites, you always end up with languages:language_interface. which is irrelevant for permissions/access, but the user label has possibly been translated. That's the risk of addCacheableDependency(), you often get stuff that you don't need and it's tricky to know what you really need. we should either switch to ->addCacheTags($user_role->getCacheTags()) in UserRolesAccessPolicy or explicitly filter that out.

If we optimize cache redirect following with my suggestion above, we might not need to filter out anything. Each VC lookup will go to every involved cache context only once then, regardless of how long the chain is.

That might be safer, in theory you could have something like per-domain user permission overrides, although thinking about that hurts my brain.

This is actually not that hard to pull off and a valid use case I've seen implemented using the access policy API.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Just ran into this bug myself using the entity_reference_revisions module.

If you have a base field called "category" and use a field type of "entity_reference", then the shared table will only contain one column called "category" and any entity query with a condition of "category.entity.name" will work because of the current code in Tables using the chunks as a guide to the column name.

If you use an entity_reference_revisions field, your shared table will have two columns: category__target_id and category__target_revision_id. Now the code described above will try to find a field named "category" because it's parts of the chunks and fail.

However, any entity query that simply queries for "category" will work, because Tables has separate code for that:

        // If there are more specifiers, get the right sql column name if the
        // next one is a column of this field.
        if ($key < $count) {
          // THIS BROKEN CODE DOES NOT RUN IF ONLY ONE CHUNK
        }
        // If there are no additional specifiers but the field has a main
        // property, use that to look up the column name.
        elseif ($field_storage && $column) {
          // THIS CODE WILL CORRECTLY FETCH THE RIGHT COLUMN NAME
          $columns = $field_storage->getColumns();
          if (isset($columns[$column])) {
            $sql_column = $table_mapping->getFieldColumnName($field_storage, $column);
          }
        }

The ERR module's queue has a few issues complaining about this bug, but none of them seem to correctly point to this core issue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Not adding new test coverage, but I fixed the existing tests. Especially the missing schema would make this patch throw exceptions when used on an actual project.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

kristiaanvandeneynde made their first commit to this issue’s fork.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Found a minor issue when passing a time limit of 0. The code seems to specifically allow for this, but then any time limit of 0 is converted into -1.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Need more time to respond properly, but just to clarify ::get*Persistent*CacheContexts() has nothing to do with the type of cache. It means that these cache contexts are always there for that access policy, with more cache contexts optionally being added during its calculatePermissions() or alterPermissions() methods.

It's also often referred to as initial cache contexts, but that implies you need to add more, so I/we chose getPersistentCacheContexts as a name.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

As to the IS and the proposed solution: I agree that if we use a static cache for the hash, we could just as well use one for the cacheable metadata as both rely on the same source. So if we were to go with that approach, consider me on board.

What @berdir wrote in #5 is also true.

A few things I slightly disagree with, though:

As I've mentioned in the redirect issue, the flexibility we gained (which core by default doesn't use) comes at a high price.

There were several things in core that were truly broken with the old redirect system because it combined all possible variations on the second level. This meant you could not use highly variable cache contexts on common page elements such as local tasks because it would make every page uncacheable. With VariationCache, this is no longer an issue.

VariationCache is very slow

Performance tests at the time showed that VariationCache wasn't necessarily slower than the old system either. Depending on the scenario it was even faster.

I wonder if we can add one more check like \Drupal\Core\Session\AccessPolicyProcessor::needsPersistentCache() and somehow avoid the variationCache here when we don't expect anything to actually *vary*.

Worth exploring, but it speaks for itself that we need to tread very carefully there. If we make a mistake in VC while trying to save resources, we might introduce all sorts of security issues.

Would it maybe make more sense to cache the processAccessPolicies() call if we decide it's worth it? createFromObject() should be fast and then we actually reduce 2 calls down further to just one?

The access policy processor has to check at least two static caches to return your access policies: A regular one for cache contexts and a variation cache for the actual result. So if we can turn that into one that would indeed be a win. The question becomes how fast the hashing is vs the static variation cache lookup fewer.

But at that point we'd be putting a regular static cache in front of the caches in AccessPolicyProcessor just to bypass the static variation cache lookup in APP. If we're going to go ahead and do that, we may as well do the same in the PermissionChecker and further reduce the amount of calls to APP. As Berdir's findings shows that PermissionChecker is the other major driver of traffic towards APP.

This does mean that we'd be introducing an extra regular static cache lookup we know will miss for every first permission check on a request.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

There we go, easy fix and confirmed to work on my end. Let's see what tests have to say. Also might need a test to confirm it works and keeps working.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Just got bitten by this too.

Had a few manually crafted links (global text rewritten as link). It works while showing the links in the view directly, but when you hide them and put them in the dropbutton, they lose the first part of their path because of the code linked in #5. The hard-coded forward slash gets prefixed to the one coming from the link, leading to an internal url like internal://node/1, which makes pares_url() think the host is "node" and the path is "/1"

Not sure how you could consider this working as designed, though :) At least we could check if $path starts with a forward slash and strip that before feeding it into UrlObject::fromUri('internal:/' . $path)?

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Then I type "gr us" or "gro us" or whatever :) PhpStorm is really good at autocompleting for you.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Moving back to normal as I found out Search API will always convert whatever data that came from the index into arrays. Which raises the question whether we need to even bother storing the data as scalars.

The field we were struggling with was being added by a views data alter gone rogue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I just did some more digging and this cannot be solved in only this module. I implemented IndexParamsEvent to make sure my index had actual scalars in it and verified that using the opensearch dashboard.

But Search API still uses arrays as ResultRow properties, because of SearchAPiQuery::addResults() relying on a bunch of Drupal\search_api\Item\ItemInterface, which in turn contain a bunch of \Drupal\search_api\Item\FieldInterface. The latter can only contain array values and this module's QueryResultParser::parseResult() correctly feeds it that.

So even if we store scalar data in the index, Search API will still pass it around as arrays once retrieved from the index. Which means the problem from my previous comment persists.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I just ran into this as my view was breaking.

What happens is that, if you configure a view for an index named Foo, to use the field "Link to InsertEntityTypeName" under the category "Index Foo", the view crashes.

This is because said field will eventually end up calling for the "langcode" property of the row in Drupal\views\Entity\Render\TranslationLanguageRenderer::getLangcode() with a $this->langcodeAlias as "langcode"

  /**
   * {@inheritdoc}
   */
  public function getLangcode(ResultRow $row) {
    return $row->{$this->langcodeAlias} ?? $this->languageManager->getDefaultLanguage()->getId();
  }

Because $row->langcode is an array containing a single langcode rather than a string, the parent call from \Drupal\views\Entity\Render\EntityTranslationRendererBase::getLangcodeByRelationship() crashes because it has a return value type of string.

  public function getLangcodeByRelationship(ResultRow $row, string $relationship): string {

Fields from the category "Foo datasource" do not seem to suffer from this as they use search_api's EntityTranslationRenderer instead. But that's from my limited testing.

Long story short:

  1. Views expects strings in some places
  2. This module returns arrays of strings
  3. Weird crashes and painful debugging ensue

So I'd bump this to major given how it can completely break Views :)

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Add a couple functions to the AccessPolicyInterface that requires the policy to return a list of the affected permissions

That might give false info as another policy could easily override that.

A description could be nice. So maybe start with an overview of currently active policies that show a name and description and then instruct the more technical people where to look if they want to see the code?

Either way, I'm a fan of an overview. I just don't want to overload access policies with methods for the sake of the overview; getLabel() and getDescription() could be nice, getAffectedPermissions() not so much.

You could use attributes, but those get discovered regardless of whether the policy is actually in use. E.g.: If module A has an access policy that module B swaps out, then there's no point in showing module A's version in the UI. With attributes, you would do just that, whereas with methods you would only load the active policies. (Swapping out or decorating access policies is an edge case and generally not recommended, but it's nice to keep the option open)

Moshe suggested you try putting this in https://www.drupal.org/project/security_review . You could start by listing the class names in your UI, see how that looks and then circle back here to request two new methods for more descriptive info

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Attached MR doesn't fix anything, it merely reverts to calling the parent so we have a patch to use while otherwise using the latest version of WSE. Waiting for maintainer response to decide a proper fix before doing any real work.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I was asking myself: "Why isn't this using setter injection?" when I noticed this issue and saw that @godotislate did exactly that. So huge +1 for MR 69 as it's both forward and backward compatible. We should not be juggling with version_compare for this (MR 68).

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I'm unsure where the notion of of custom $ops comes from exactly, do we have any data that supports that this is a thing people did?

No, and people shouldn't have. I'm trying to predict possible fallout from the changes. Like how we sometimes go the extra mile even though our BC policy says we are free to just change constructor arguments on some services: It's being nice. But I doubt we need to do that here.

The idea is that we isolate the code in a way that allows us to provide an alternative implementation using the new mail API, first in an experimental module, then later moved back to user (at least that's my recommendation).

Yeah, I got that and I like the idea. My suggestion was more that, when copying the old code into the (perhaps temporary) service, we might want to see if the comments are still up to standards. But if we want to keep the scope narrow, I'd be fine with that too. As you indicated: This code might not live very long.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

My main concern was that someone who used to use custom OPs would now have that option taken away from them. But giving it some more thought, nothing stops them from calling the mail service with said custom OP themself. So I left a few nits left and right, but nothing too major.

The new service copies the old code in ways that seem to make sense, but also do not go any further. One could argue now would be a good time to revisit some of the comments, but if you want to limit scope to "just move the code" then I could get onboard with that too.

Moshe seemed okay with the approach taken here too (via Slack) and so am I. So as far as subsystem maintainers are concerned, you are 2 for 2. Thanks for working on this.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

I just realized we're not taking list cache tags into account here. This could become a security issue if released like this. So we need to create a follow-up that makes sure we tackle list cache tag invalidation too. The only downside is that core makes EntityBase::getListCacheTagsToInvalidate() protected. Would have been really nice to be able to just call that.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

One thing I just realized by reading this IS, is that 🐛 Node update hook is triggered upon group content create RTBC did not add list cache tags to its solution.

So if you had a list of 5 nodes you had access to due to how they were grouped and you lose that access, we are fine because the list was tagged with both the individual nodes' cache tags and node_list. However, if you gain access because a node got added to one of your groups, the list won't show it because we no longer invalidate list cache tags.

Now, to the issue at hand, hook_entity_access() does not apply to lists, so the facepalm I just had should not get worse if we do as the IS suggests. And as you mentioned, adding or removing an entity from a group will either save the entity (v2/v3) or invalidate its tags (v4). So a main menu block should always get updated when one of the nodes that are actually in the menu have their group status changed.

So I think you're right and we can completely remove that piece of code.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay, time to backport this.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

https://github.com/sebastianbergmann/phpunit/pull/5225

Essentially, the following is no longer supported (see second note):

   * @depends testRevisionsTab
   * @dataProvider revisionsOperationsProvider

Unless we stop using named arguments in our cases.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay turns out there's more to it than simply updating the version. Let's do that in this issue.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Thinking of lifting the new editor flow where we use contexts into a minor release. Already getting rid of the wizard in the issue about not allowing automatic memberships outside of the form approach so that's one step in the right direction. Then 4.1 could contain more context provider support.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Good catch, can definitely accommodate this.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Also, will fix composer in a standalone issue on Monday. Don't need tests to run just yet anyway.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay so I did some debugging and found that during an access plugin's lifespan, the view is far from initialized as it saves a lot of resources to not calculate or render a view you don't have access to. But we do need some of the view to be loaded to know which contextual filters were configured.

The latest commits do just that: If, and only if, the GroupPermission plugin is configured to also check the view argument will we load the handlers for the provided display ID and see if any of them use our GroupId ViewsArgument plugin.

This all relies on $this->view->element being populated, which is usually the case when access() is called. But I have a feeling if anything about this approach is fragile, it's that little part. So curious to see what VBO does with $this->view->element, for instance.

Either way, needs an update hook to fix our default views to actually use the group_id plugin ID instead of numeric.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay, so I gave this some more thought and I want to document some findings for posterity:

Q: If we're going to allow the GroupPermission access plugin to use the value from the argument (contextual filter), then why don't we put the new stuff where you choose a context provider inside the views argument? Then we can convert the access plugin to always use the argument instead.

A: Because the group won't always be in the contextual filters. Imagine a scenario where you can publicly view articles, but if you're a member of whichever group the article belongs to, you can also see whatever annotations the editors made in a Views block in the sidebar. Here, the contextual filter would be the node ID, but the GroupPermission plugin would have to be configured with the "view annotations" group permission on the group provided by a custom context provider ArticleParentProvider or whatever.

So for that reason I've chosen to keep the work I did and combine it with the work from #37, as previously explained in #61.

So, even though nothing changes, I'm writing this down in case other people have the same idea and wonder why I didn't offload the GroupPermission plugin's group negotiation to the views argument entirely.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

@graber and I were talking on Slack and I think we can incorporate the patch from #37 into the current MR.

Let's say you have a view where you want one (or a few) context provide(s) to be checked, but on certain pages you want said view to be used inside a block with a hard-coded group ID as the view argument. It would make sense that the view argument were evaluated before the context providers then.

So with that in mind, we should add a checkbox on the config form allowing people to opt into this behavior and explaining that, when checked, a manually provided view argument will always take precedence over the context providers. I'll try and work on that tomorrow.

It also occurred to me that, once this lands, we should also open up a follow-up to allow the same context provider selection in the contextual filter modal. It makes very little sense to be able to choose which context provider to use for the access check, but not the actual views argument.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay, works for both entities and groups now. Also tested with all relationship fields hidden.

So this needs:

  • Tests
  • Option from group type to "use wizard" removed + update hook
  • Option from relation type to "use wizard" removed + update hook

But so far it's looking very good and the code is a lot cleaner this way. No more temp store and wizard stuff.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Yeah that's what I also reckoned from the information I could find. I just felt I'd raise it in case my inexperience gave me false confidence and there was, in fact, an issue with the MR.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Left a comment. Mind you I am very new to fibers, but from what I understand we are manipulating fibers that we didn't start and that seems like it might cause a deadlock if the code that spawned said fiber isn't sturdy enough to deal with our suspension.

Going to leave at RTBC, but would be nice if my mind were put at ease :)

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Ah yes, I was unaware of that feature. Thanks!

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Okay the PoC now works for group forms.

Still need to check a bunch of stuff, but the gist of it is that whatever is configured in the relationship's form display will be shown inline in the group form. This gives some level of control over the whole thing. Probably should also hard-code language to whatever the parent form specified.

This does mean that any hook_form_FORM_ID_alter for the group relationship form will not apply here. That's an unfortunate drawback.

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Thanks! For future reference, where can I find this output myself?

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

No wait, that seems to be a comment indicating what commands are being run. In that case I'm clueless as to why the pipeline is failing.

Production build 0.71.5 2024