Implement the new access policy API

Created on 25 July 2023, over 1 year ago

Part of 🌱 [Meta, Plan] Pitch-Burgh: Policy based access in core Active

After the API has been added to core, we need to implement it. We can do this by taking our current 2 access policies:

  • User 1 has full access, all the time
  • You gain access based on what permissions are part of the roles assigned to you

and convert them into SuperUserAccessPolicy and UserRolesAccessPolicy. Furthermore, we need to change the current centralized PermissionChecker service to call for the new API for access checks. The user.permissions cache context will also need to get its value (and inherently hash value) from the new API. We can copy and adjust code from the Group module's implementation to quickly achieve this.

Remaining tasks:

  • Change the PermissionChecker to use the new access policy API
  • Change the PermissionHashGenerator to mimic GroupPermissionHashGenerator
  • Change AccountPermissionsCacheContext to mimic GroupPermissionsCacheContext, namely using cacheable metadata from the hash generator
  • Write a UserRolesAccessPolicy (name pending) class that calculates permissions based on which user roles you have
  • Write a SuperUserAccessPolicy (name pending) class that grants all permissions to user ID 1
  • Check if all tests still go green and adjust where needed
📌 Task
Status

Active

Version

11.0 🔥

Component
Base 

Last updated about 15 hours ago

Created by

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @kristiaanvandeneynde
  • Status changed to Needs review over 1 year ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    MR incoming with the implementation. Haven't adjusted any tests yet nor have I added tests for the 2 new policies.

    Note: Need to properly handle the new dependency and deprecation of 2 old dependencies in Drupal\Core\Session\PermissionsHashGenerator

  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    Build Successful
  • last update over 1 year ago
    Build Successful
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Hmm, I'll look into this on Tuesday, not sure why it says "Build complete" but I'll double-check the obvious stuff. Seems like a lot of FE tests are failing

  • last update over 1 year ago
    Build Successful
  • last update over 1 year ago
    29,830 pass, 30 fail
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Well this is a success. Swapped out how we hand out permissions and only 30 tests fail. Will dive into the test fails.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Drupal\KernelTests\Core\Render\RenderCacheTest::testUser1PermissionContext

    Fails because it wrongfully tests side effects of user.permissions (more specifically the hash generator) and user.roles even though we have user.is_super_user for that. It tries to get a different render array to show up for UID 1 even if both UID 1 and Authenticated user have the same roles (none in this case).

    The reality is that this test is now pointless because the permission hash now looks the same if you have the same permissions as user 1, i.e.: if you have a user role that is flagged as admin.

    The other method on that test, ::testUser1RolesContext(), is still green but a really bad example of a test. We should never expect the name "user 1" to show up when varying by user.roles. We have "user.is_super_user" or "user" for that.

    Conclusion: This test should simply be removed as a whole.

    UserTest / UserSessionTest

    Fail on line 124 because it mocks the PermissionChecker service with the wrong dependency. I changed the dependency of said service because it hasn't been released yet (it was also added in 10.2.x).

    Conclusion: This should be an easy fix.

    PermissionsHashGeneratorTest

    This runs into the swapped out dependencies we have for PermissionsHashGenerator. I can fix this by changing the test and properly deprecating the old dependencies as seen in this example.

    Conclusion: This should be a relatively easy fix.

    UserAccountFormPasswordResetTest / MediaEmbedFilterDisabledIntegrationsTest / MediaEmbedFilterTest / MediaEmbedFilterTranslationTest /

    Fails because it sets the current_user service to an instance of User, even though it should be an instance of AccountProxyInterface. Now that permission checks inevitably cause the account switcher service to be fired up (it's a dependency of AccessPolicyProcessor), we get the error seen here.

    Conclusion: This should be an easy fix. Could be done in a separate issue to reduce noise in this MR.

    AccessDeniedTest

    I need to look into this one further

    Other fails

    Seems to be Rest/jsonapi related, can probably fix all of them once I find out why they're failing.

  • last update over 1 year ago
    29,840 pass, 21 fail
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    I fixed UserSessionTest, yet the extending UserTest is still failing even though it doesn't change anything about the thing that's failing. Must be something to do with the mock classes it creates vs the actual classes the parent creates. I'm lost, will focus on other failures :/

  • last update over 1 year ago
    29,840 pass, 20 fail
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so all of the tests extending NodeResourceTestBase are failing in testPatchPath because that one hands out extra permissions to the "Authenticated user" role during the test.

    Normally, this would trigger the config:user.role.authenticated cache tag to be invalidated (and it does) and therefore clear the calculated permissions cache, because UserRolesAccessPolicy properly adds that tag to the CalculatedPermissions object. However, when inspecting CacheTagsChecksumTrait, it shows that the role's cache tag is already part of invalidatedTags and does therefore not invalidate again.

    I have no idea why this is happening, but manually clearing 'access_policies' makes the test go green, so the problem really is that Drupal is trying to clear the role's cache tag but does not get the desired result because it got cleared before and somehow wasn't removed from CacheTagsChecksumTrait::invalidatedTags when the role was saved again.

  • last update over 1 year ago
    29,904 pass, 14 fail
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Talked about it on Slack with Berdir and he pointed out that this is exactly why we have refreshVariables() in browser tests. To reset the checksum thingy so that it doesn't get out of sync when multiple browser requests are fired during the same test. Should see some more tests go green now.

  • last update over 1 year ago
    29,922 pass, 11 fail
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    MediaJsonCookieTest::testPost and other sibling classes are failing a bit weirdly. They are instantly getting to the 400 part in the code below, leading me to assume they are already authorized. Now the question is, why did this not fail before. Using the same trick as the other rest/jsonapi tests does not work here and that seems logical as we are actually trying _not_ to be authenticated here.

    All in all, I am seeing a lot of rest/jsonapi tests go red because of how they were written, not because of the changes I am making to core. Meh :/

    From EntityResourceTestBase.php

        // DX: 403 when unauthorized.
        $response = $this->request('POST', $url, $request_options);
        $this->assertResourceErrorResponse(403, $this->getExpectedUnauthorizedAccessMessage('POST'), $response);
    
        $this->setUpAuthorization('POST');
    
        // DX: 400 when no request body.
        $response = $this->request('POST', $url, $request_options);
        $this->assertResourceErrorResponse(400, 'No entity content received.', $response);
    
  • last update over 1 year ago
    29,947 pass, 7 fail
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    These browser tests are such a pain :/ Now AccessDeniedTest is somehow getting a cached response where it shouldn't. It does not contain drupalSettings and therefore fails. If I invalidated the 'rendered' cache tag, it all works. Trying to figure out what is going wrong here.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so I figured out why that dreadful test goes red:

      public function testAccessDenied() {
        $this->drupalGet('admin');
        $this->assertSession()->pageTextContains('Access denied');
        $this->assertSession()->statusCodeEquals(403);
    

    With both the old hash generator and the new one does this page not contain drupalSettings. Which is weird because when I reproduce this test in the browser by clicking around, I do see the drupalSettings being added to the footer of the page. This seems to only happen for anonymous sessions.

    // Ensure that users without permission are denied access and have the
        // correct path information in drupalSettings.
        $this->drupalLogin($this->createUser([]));
        $this->drupalGet('admin', ['query' => ['foo' => 'bar']]);
    

    With the old hash generator this page contains the drupalSettings object, with the old hash generator it does not. Because the old hash generator used to include the role names and I don't, the page gets a new hash (anonymous became authenticated) but the new hash generator still returns the same hash as both anon and auth have the same permissions.

    This leads me to conclude that whatever is causing drupalSettings to not show up for anon forgot to add the user.is_anonymous cache context. The fact that my hashes now (rightfully) collide leads to the auth user seeing the anon page, which does not contain drupalSettings.

    Now if anyone could point me in the right direction on why drupalSettings does not show up on the anon page? Can be reproduced by changing the first few lines of the test to:

        $this->drupalGet('admin');
        $this->assertSession()->pageTextContains('Access denied');
        $this->assertSession()->statusCodeEquals(403);
        $settings = $this->getDrupalSettings();
        // This breaks the same way, but for ALL hash generators (old and new).
        $this->assertEquals('admin', $settings['path']['currentPath']);
    

    This cost me almost 8 hours to debug already. What a nightmare.

  • last update over 1 year ago
    29,949 pass, 5 fail
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    For anonymous users in tests, HtmlResponseAttachmentsProcessor::processAssetLibraries() hands an empty array to the JsCollectionRenderer and sets that to $variables['scripts_bottom']. When doing the same in a browser, it correctly contains drupal settings.

    This has now been proven to be a core bug that I do not have time to investigate. Will file a separate issue for this tomorrow, adding back the failing test code that I am removing from the failing test here.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Re 12/13, I reported it so we can definitely move this issue along: 🐛 system_page_attachments varies by authenticated user role but does not add said cache context Needs review

  • 19:14
    18:22
    Running
  • last update over 1 year ago
    29,968 pass, 1 fail
  • last update over 1 year ago
    29,972 pass
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium
    1. Cleaned up user test to be an actual unit test rather than feeding data that the deeper implementation used to require. Now it just mocks the permission checker.
    2. Then redid the deprecation in the hash generator and rewrote that test.

    All that is left now is add tests for the individual policies.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    All green 🎉

  • last update over 1 year ago
    Custom Commands Failed
  • Status changed to Needs work over 1 year ago
  • The Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

    This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • last update over 1 year ago
    29,985 pass
  • Status changed to Needs review over 1 year ago
  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    29,989 pass
  • last update over 1 year ago
    29,994 pass
  • last update over 1 year ago
    30,007 pass
  • last update about 1 year ago
    30,094 pass
  • Status changed to RTBC about 1 year ago
  • 🇺🇸United States smustgrave

    Since 📌 Add the access policy API RTBC is in RTBC think this should be reviewed with it, could be wrong.

  • last update about 1 year ago
    30,182 pass
  • last update about 1 year ago
    30,182 pass
  • 59:48
    7:08
    Running
  • last update about 1 year ago
    30,190 pass
  • last update about 1 year ago
    30,197 pass
  • last update about 1 year ago
    30,204 pass
  • 29:47
    28:32
    Running
  • last update about 1 year ago
    30,241 pass
  • last update about 1 year ago
    30,241 pass
  • last update about 1 year ago
    30,396 pass
  • last update about 1 year ago
    30,396 pass
  • last update about 1 year ago
    30,397 pass
  • last update about 1 year ago
    30,396 pass
  • last update about 1 year ago
    30,407 pass
  • last update about 1 year ago
    Custom Commands Failed
  • last update about 1 year ago
    30,414 pass
  • last update about 1 year ago
    30,412 pass
  • last update about 1 year ago
    30,417 pass
  • last update about 1 year ago
    30,425 pass
  • last update about 1 year ago
    30,427 pass
  • last update about 1 year ago
    30,429 pass
  • last update about 1 year ago
    30,432 pass
  • last update about 1 year ago
    30,448 pass
  • last update about 1 year ago
    30,452 pass
  • last update about 1 year ago
    30,452 pass
  • last update about 1 year ago
    30,461 pass
  • last update about 1 year ago
    30,462 pass, 1 fail
  • last update about 1 year ago
    30,463 pass, 1 fail
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    The only test failure here is after I added the "do not switch if user is current user" logic. Trying to figure out why only that single core test (UrlIntegrationTest) trips over this recent change. It goes green if I revert that single change.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Found it! In:

          if ($cache_context_root === 'user' && $this->currentUser->id() != $account->id()) {
    

    Current user ID was 0 and $account ID was NULL. I specifically had a non-strict check in there because I thought users returns strings as IDs, but apparently that's not the case. So I went ahead and made the check strict and that fixed it.

    However, this means we allow unsaved user's permissions to be calculated and the question is whether that is something we should allow. Arguably, UrlIntegrationTest should use the UserCreationTrait and properly create users that way.

    Testing access with an unsaved account is asking for trouble as caching per user permissions would also not work properly in that scenario, even without this whole new access policy API.

    Will push a fix, but please advise whether we:

    • Update UrlIntegrationTest to use UserCreationTrait
    • Update AccessPolicyProcessor to throw an exception or something if $account->id() is NULL
    • Do something else than the two above options
  • last update about 1 year ago
    30,464 pass
  • last update about 1 year ago
    30,464 pass
  • last update about 1 year ago
    30,464 pass
  • last update about 1 year ago
    30,471 pass
  • last update about 1 year ago
    30,471 pass
  • last update about 1 year ago
    30,472 pass
  • last update about 1 year ago
    30,472 pass
  • last update about 1 year ago
    30,476 pass
  • last update about 1 year ago
    30,494 pass
  • last update about 1 year ago
    30,510 pass
  • last update about 1 year ago
    30,521 pass
  • last update about 1 year ago
    30,524 pass
  • last update about 1 year ago
    30,524 pass
  • last update about 1 year ago
    30,526 pass
  • last update about 1 year ago
    30,550 pass
  • last update about 1 year ago
    30,557 pass
  • last update about 1 year ago
    30,568 pass
  • last update about 1 year ago
    30,590 pass
  • Status changed to Needs work about 1 year ago
  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    Left a review on the MR, Will be easier to review once the API issue is in, but given I'd just reviewed the other one, could tell what the changes were above that issue.

  • Status changed to Needs review about 1 year ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Addressed all questions. Setting to NR rather than RTBC so that my answers can be evaluated first.

  • Status changed to RTBC about 1 year ago
  • 🇺🇸United States smustgrave

    Haven't been involved but floating around thanks to the follow feature.

    Believe all feedback has been addressed so reRTBC.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Merged a recent last-minute change that drops the new cache factory. Should still go green but let's see what the test results have to say.

  • Pipeline finished with Failed
    about 1 year ago
    #51179
  • Status changed to Needs work about 1 year ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Setting to NW because the new cache approach fails in UserPermissionsTest but only because of how we test. So the implementation still works, it's just that our tests fail because of how we refresh variables. Will need to figure that one out here.

    Cross-posting from #3376843-57: Add the access policy API :

    UiHelperTrait::submitForm() and UiHelperTrait::drupalGet() both call $this->refreshVariables();, but if we look into that:

    protected function refreshVariables() {
        // Clear the tag cache.
        \Drupal::service('cache_tags.invalidator')->resetChecksums();
        foreach (Cache::getBins() as $backend) {
          if (is_callable([$backend, 'reset'])) {
            $backend->reset();
          }
        }
    
        \Drupal::service('config.factory')->reset();
        \Drupal::service('state')->resetCache();
      }
    

    It's clearly only resetting cache.bin tagged services, which is why we are seeing this testfail. If I manually add in this line:

    \Drupal::service('cache.access_policy_memory')->reset();
    

    Then the tests go green again.

    So it's purely a test fail because refreshVariables() doesn't pick up on cache tag invalidators that also support a reset() method.

  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    Needs reroll now the access policy API is in

  • Pipeline finished with Failed
    about 1 year ago
    #52720
  • Pipeline finished with Canceled
    about 1 year ago
    #52853
  • Pipeline finished with Failed
    about 1 year ago
    #52861
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Opened 📌 Fix MemoryCache discovery and DX Needs review to fix what i wrote in #26 "the right way"

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    🐛 system_page_attachments varies by authenticated user role but does not add said cache context Needs review goes green now. If we commit that along with 📌 Fix MemoryCache discovery and DX Needs review then this patch becomes a bit smaller and no longer has to adjust that many tests to work around the actual bugs. Will revert the bugfix for system_page_attachments() in this MR.

  • Pipeline finished with Failed
    about 1 year ago
    #53290
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Was out sick for a week, but just replied to both blocker issues, minor as they are, so hopefully we can get those in soon and then this issue will be unblocked again.

  • Pipeline finished with Failed
    12 months ago
    #61110
  • Pipeline finished with Failed
    12 months ago
    #61112
  • Pipeline finished with Failed
    12 months ago
    #61111
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    One blocker is now multi-RTBC and I addressed notes in the other, making that one NR again, and probably RTBC soon.

    Re my latest commit changing a typehint in the code that was added as part of the API, does that need a new issue or can we simply incorporate that in this patch as we're still on 10.3.x, which is the branch that added the new API to begin with?

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    📌 Fix MemoryCache discovery and DX Needs review got committed, will try to adjust the MR to incorporate those changes tomorrow.

  • Pipeline finished with Failed
    11 months ago
    #65765
  • Pipeline finished with Failed
    11 months ago
    #65771
  • Pipeline finished with Failed
    11 months ago
    #65770
  • Pipeline finished with Failed
    11 months ago
    #65786
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so now only AccessDeniedTest fails, which will be solved by 🐛 system_page_attachments varies by authenticated user role but does not add said cache context Needs review

    It's also failing in the newly introduced StandardPerformanceTest, which is to be expected because it hard-codes how many cache gets are expected and that obviously changed with the work being done here. Results of those changes:

    • testAnonymous: Failed asserting that 128 is equal to 129 or is greater than 129. So we actually have one or more cache gets fewer.
    • testLogin: Failed asserting that 30 is identical to 28. So we actually have two cache gets more.
    • testLoginBlock: Failed asserting that 32 is identical to 30. So we actually have two cache gets more.

    Overall slight performance boost for anonymous, slight hit for login process. But that is merely based on reporting of this test.

    The fact that your permissions are now cached based on the cache contexts provided by the access policies could actually translate into an overall performance gain because we no longer need to load your user role entities on every request (to gather their permissions).

    Would it be fine to tweak the numbers in StandardPerformanceTest to match the new reality?

  • 🇺🇸United States partdigital

    kristiaanvandeneynde, I was attempting to try out the Access Policy API in Drupal 11 as part of an investigation into integrating it into the Access Policy module . Even though the service collector is defined in core.services.yml I noticed that addAccessPolicy is never actually called. It looks like this is because it's never retrieved by the container except in the automated tests. I assume that's what this issue is for?

    Is this API currently dormant in Drupal 11? If so, would it be worth mentioning that in the documentation This way developers don't go down a rabbit hole only to discover it's not ready?

    Unless of course I'm missing something entirely which in that case, open to any suggestions.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    @partdigital you're right that it's never called yet. That's what this issue is for :)

    Previous issue added the API, this issue is for making core use it.

  • Status changed to Needs review 11 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Merged latest core now that all blockers are resolved and updated StandardPerformanceTest to match #33

  • Pipeline finished with Failed
    11 months ago
    #75041
  • Pipeline finished with Failed
    11 months ago
    #75042
  • Pipeline finished with Failed
    11 months ago
    #75052
  • Pipeline finished with Success
    11 months ago
    #75063
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    All tests go green once more 🎉

    Here's a reviewer guide for all the changes:

    1. AccountPermissionsCacheContext: We now get the cacheable metadata from the hash generator as we do not know what the data the permissions were built with and therefore vary by.
    2. AccessPolicyBase: Changed typehint to match what's actually going on, this is an oversight from 📌 Add the access policy API RTBC Can do this because API got added in 10.3 and is not released yet
    3. AccessPolicyInterface: See 2.
    4. PermissionChecker: Now checks your permissions coming from access policies
    5. PermissionHashGenerator: Now hashes the calculated permissions
    6. PermissionHashGeneratorInterface: Includes new method for getting cacheability, follows 1:1 rule
    7. SuperUserAccessPolicy + test: New code, grants permissions for user 1
    8. UserRolesAccessPolicy + test: New code, grants permissions based on user roles
    9. CommentResourceTestBase, MediaResourceTestBase, MediaOverviewPageTest, NodeResourceTestBase: Need to call refreshVariables or certain cache tags do not get flushed due to how these tests span multiple requests.
    10. MediaEmbedFilterDisabledIntegrationsTest, MediaEmbedFilterTest, MediaEmbedFilterTestBase, UserAccountFormPasswordResetTest: Were incorrectly setting current_user service
    11. StandardPerformanceTest; Small changes reflected in test
    12. RenderCacheTest: Deleted because it no longer makes any sense, we can now have potentially millions of access policies aside from UID 1 being special
    13. AccessPolicyProcessorTest: See 2.
    14. PermissionCheckerTest: New code
    15. PermissionsHashGeneratorTest: Adjusted to match new code
    16. UserSessionTest: Adjusted to match new code
    17. core.services.yml: Adjusted for new services and dependencies

    Hope that helps

  • Pipeline finished with Canceled
    10 months ago
    #77995
  • Pipeline finished with Failed
    10 months ago
    #77997
  • Status changed to Needs work 10 months ago
  • 🇺🇸United States smustgrave

    So have not reviewed yet but see there are some failures in the MR. Seems to have got bit by 🐛 Spell-checking job fails with "Argument list too long" when too many files are changed Active

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Merged in 11.x, which now has 📌 Separate cache operations from database queries in OpenTelemetry and assertions Fixed so we should now be able to get exact performance data. Curious to see how much difference there is. Also, will see if I can reduce the size a bit more.

  • Pipeline finished with Failed
    10 months ago
    #81733
  • Pipeline finished with Failed
    10 months ago
    #81734
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    kristiaanvandeneynde changed the visibility of the branch 11.x to hidden.

  • Status changed to Needs review 10 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Tried to apply the workaround from 🐛 Spell-checking job fails with "Argument list too long" when too many files are changed Active so we can run tests here.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so new performance values are in:

    • OpenTelemetryAuthenticatedPerformanceTest::testFrontPageAuthenticatedWarmCache() query count goes up by 1
    • StandardPerformanceTest::testAnonymous() query count goes up by 1
    • StandardPerformanceTest::testLogin() query count goes up by 2
    • StandardPerformanceTest::testLoginBlock() query count goes up by 2

    Cache get count stays the same across the board

  • Pipeline finished with Failed
    10 months ago
    #82371
  • 🇬🇧United Kingdom catch

    @kristiaanvandeneynde do you know what the additional database queries are?

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    The calculated permissions are stored in the DB using VariationCache so that we can actually have access policies (impossible without cache contexts). Before, we'd load your user roles and be done with it, which was always one query. Now we load your calculated permissions and, depending on how variable those are, follow a cache redirect or two to get to the final ones.

    So I'm almost 100% sure the extra queries are cache redirects being followed.

  • 🇬🇧United Kingdom catch

    Cache redirects being followed should only be cache gets/sets, except for cache tags though, so if there are different cache tags being requested it could be that.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Oh, right forgot the new performance testing now disregards queries coming from cache sets/gets. Could definitely be cache tags, yeah. If we want a definitive answer we'd have to compare the current output vs output when running this on ddev with your tools.

    Either way, the main difference between old code and new code is what I wrote in #44. Your permissions now come from potentially many sources and we need to gather them from said sources at least once until we cache it. At that point we have cache redirects to follow, depending on the complexity.

    So I'd start looking there to see what changed, but then again your tool might simply tell us which queries are gone and which are new.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Splitting off some low-hanging fruit to make this MR smaller, will update comment as I go:

  • Pipeline finished with Canceled
    10 months ago
    Total: 66s
    #82982
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Hmm cache get count went down by one all of a sudden. Adjusted MR but this might spell trouble. Ignore the commit message mentioning query count, that was me not being properly awake yet.

  • Pipeline finished with Failed
    10 months ago
    Total: 525s
    #82983
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    All 3 spin-off issues from #47 have a green MR and are easy to review and commit.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so most recent performance values are in:

    • OpenTelemetryAuthenticatedPerformanceTest::testFrontPageAuthenticatedWarmCache()
      • Query count goes up by 1
      • Cache get count goes down by 1
    • StandardPerformanceTest::testAnonymous()
      • Query count goes up by 1
      • Cache get count goes down by 1
    • StandardPerformanceTest::testLogin()
      • Query count goes up by 2
      • Cache get count goes down by 1
    • StandardPerformanceTest::testLoginBlock()
      • Query count goes up by 2
      • Cache get count goes down by 2

    Not sure why this is different from #42 but I get the same values locally so I guess they're correct.

  • Pipeline finished with Failed
    10 months ago
    #83326
  • Pipeline finished with Failed
    10 months ago
    Total: 598s
    #84238
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Only fail is because of 📌 Adjust StandardPerformanceTest to avoid random failures Needs review , currently working on comparing the telemtry data from pre-patch and post-patch to get an accurate description of the performance changes.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so tests results are in.

    On my shitty laptop, performance was as follows:

    • StandardPerformanceTest::testAnonymous(), part 1: 2.32s -> 1.59s (31% faster)
    • StandardPerformanceTest::testAnonymous(), part 2: 446ms -> 419ms (6% faster)
    • StandardPerformanceTest::testAnonymous(), part 3: 319ms -> 245ms (23% faster)
    • StandardPerformanceTest::testLogin(): 227ms -> 226ms (same)
    • StandardPerformanceTest::testLoginBlock(): 255ms -> 332ms (30% slower)

    So performance for anonymous users went up across the board and was worse for one login test case out of two.

    Here is the difference on StandardPerformanceTest::testLoginBlock, because that one was slower.

    So if we reduce that to just the entries that are unique, we get (cache gets):

    1. get access_policy, CID "access_policies:drupal:[user.is_super_user]=0:[user.roles]=anonymous"
    2. get access_policy, CID "access_policies:drupal:[user.is_super_user]=0:[user.roles]=authenticated"

    Compared to:

    1. getMultiple config, CID "user.role.anonymous"
    2. get bootstrap, CID "user_permissions_hash:anonymous"
    3. getMultiple config, CID "user.role.authenticated"

    And query-wise:

    1. SELECT "tag", "invalidations" FROM "test51561594cachetags" WHERE "tag" IN ( "access_policies", "config:user.role.anonymous" )
    2. SELECT "name", "value" FROM "test51561594key_value" WHERE "name" IN ( "system.private_key" ) AND "collection" = "state"
    3. SELECT "tag", "invalidations" FROM "test51561594cachetags" WHERE "tag" IN ( "access_policies", :"config:user.role.authenticated" )

    Compared to:

    1. SELECT "tag", "invalidations" FROM "test99820655cachetags" WHERE "tag" IN ( "entity_types" )
    2. SELECT "tag", "invalidations" FROM "test99820655cachetags" WHERE "tag" IN ( "config:user.role.anonymous" )
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Summary of the above: Seems like it's significantly faster on anonymous sessions and sometimes slower on authenticated ones.

    Also, not going to comb over the other result sets as it takes a long time to do so. But I can post the exports here for someone else to take a look at.

  • 🇬🇧United Kingdom catch

    OK that makes a lot of sense. The extra queries are for cache tags and state for the private key. Makes me wonder if we should split cache tags out from database queries too in the assertions so it's easier to see when that's the thing that changes.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Updated MR to what I think is right after merge.

    Also, the small issues from #47 are all RTBC now, so would be great if we could commit those to make the MR here smaller.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Makes me wonder if we should split cache tags out from database queries too in the assertions so it's easier to see when that's the thing that changes.

    I think that would be a great quality of life improvement.

  • Pipeline finished with Failed
    10 months ago
    Total: 553s
    #84746
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so kind of chasing performance tests now as they are sadly not as exact as we would like them to be. We'll fix this for sure, but in the meantime please focus on the actual code and ignore the performance tests counts.

  • Pipeline finished with Success
    10 months ago
    Total: 580s
    #84759
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    2 out of 3 side quests from #47 got committed, so this MR just got smaller :)

    Merged in 11.x and I think I adjusted performance values correctly, will keep an eye on results. If we could now get 🐛 Call refreshVariables() where needed in various tests RTBC committed, then I think this MR would become roughly as small as it can be.

  • Pipeline finished with Failed
    10 months ago
    Total: 550s
    #89450
  • Pipeline finished with Failed
    10 months ago
    Total: 618s
    #89482
  • Pipeline finished with Failed
    10 months ago
    Total: 462s
    #89532
  • Pipeline finished with Success
    10 months ago
    Total: 506s
    #89541
  • Status changed to Needs work 9 months ago
  • 🇺🇸United States smustgrave

    Seems to need a manual rebase. Will keep an eye out though

  • Status changed to Needs review 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Updated the MR, 🐛 Call refreshVariables() where needed in various tests RTBC is still pending commit to make the MR smaller. Also the performance impact as to what queries changed will be far more visible once we commit 📌 Log every individual query in performance tests Needs work .

    Still in favor of deleting RenderCacheTest as a whole because the user.permissions test case is no longer valid and the only other test case (user.roles) will soon become invalid too after a recent discussion we had where said cache context should not be returning a magic string for UID 1. Issue is pending for that one.

  • Pipeline finished with Failed
    9 months ago
    #101514
  • Pipeline finished with Failed
    9 months ago
    Total: 140s
    #101555
  • Pipeline finished with Failed
    9 months ago
    Total: 477s
    #101556
  • Pipeline finished with Failed
    9 months ago
    Total: 473s
    #101575
  • Status changed to Needs work 9 months ago
  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • Pipeline finished with Canceled
    9 months ago
    Total: 268s
    #101614
  • Status changed to Needs review 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Well yes mr. bot I was in the middle of things here...

  • Pipeline finished with Failed
    9 months ago
    Total: 535s
    #101620
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Starting to wonder how much value there is in chasing the performance tests here until {#3421164] lands.

  • Pipeline finished with Failed
    9 months ago
    Total: 615s
    #101651
  • Pipeline finished with Success
    9 months ago
    Total: 468s
    #101672
  • Status changed to Needs work 9 months ago
  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • Status changed to Needs review 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium
  • Pipeline finished with Failed
    9 months ago
    Total: 607s
    #102342
  • Pipeline finished with Failed
    9 months ago
    Total: 478s
    #102390
  • Status changed to Needs work 9 months ago
  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • Pipeline finished with Success
    9 months ago
    Total: 486s
    #102416
  • 🇫🇷France nod_ Lille

    bot is not happy since patch doesn't apply from the MR, can someone merge 11.x in the issue branch so it stops complaining? (for commit we still use the patch, not the MR so it's still important to have the MR working as a patch).

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    I've been doing that for the past few days :D

    It keeps complaining because we're touching the performance tests here and both @catch and I have been hard at work optimizing those tests. So it keeps throwing merge conflicts here whenever one of those issues is committed.

  • Status changed to Needs review 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium
  • Pipeline finished with Success
    9 months ago
    Total: 482s
    #102511
  • 🇫🇷France nod_ Lille

    ah yes sorry, you can always add the no-needs-review-bot tag to make him quiet if necessary :)

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    That's a great tip, thanks. I'll remove it once we're fully done with the performance stuff.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    After merging in 📌 Log every individual query in performance tests Needs work locally, I've noticed the queries being added are really simple. It's always:

    'SELECT "name", "value" FROM "key_value" WHERE "name" IN ( "system.private_key" ) AND "collection" = "state"',

    Right after:

    'SELECT "name", "value" FROM "key_value" WHERE "name" IN ( "system.maintenance_mode" ) AND "collection" = "state"',

    With one exception in StandardPerformanceTest::testLoginBlock() where the above happens, but the same system.private_key is also added early on and right after:

    'SELECT "name", "value" FROM "key_value" WHERE "name" IN ( "views.view_route_names" ) AND "collection" = "state"',

    So every test gets one extra call to the key value store and one of them gets two.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    The above makes perfect sense because the PermissionsHashGenerator used to find the calculated hash as follows:

        $cid = "user_permissions_hash:$role_list";
        if ($static_cache = $this->static->get($cid)) {
          return $static_cache->data;
        }
        // Also tries DB cache below.
    

    But now, we do not store that hash in the DB anymore because we'd be repeating how it's stored in the AccessPolicyProcessor's cache. Meaning we always have to calculated a hash live at least once per request before we store it in the static cache. This is the extra query we're seeing for the private key and frankly is a slight false positive in a sense that on a cold cache we also need to request the private key at least once.

    The fact that it's happening twice with the login block might be because your account ID changes and the following static cache logic doesn't find anything for you then:

        $cid = 'permissions_hash_' . $account->id();
    
        // Retrieve the hash from the static cache if available.
        if ($static_cache = $this->static->get($cid)) {
          return $static_cache->data;
        }
    
  • 🇬🇧United Kingdom catch

    The login block test does both the POST request for the form submission and also the subsequent GET request after the redirect, so that explains the two queries there I think. The extra key value query will go away aftr 📌 Use cache collector for state Needs review so no concerns about that.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Individual query tracking issue got merged, so now you can see the changes for yourself here.

  • Pipeline finished with Success
    9 months ago
    Total: 575s
    #103997
  • 🇳🇿New Zealand quietone

    I was doing triage on 🐛 Call refreshVariables() where needed in various tests RTBC which I came to understand is a child of this issue from reading comment #47. I then added the other two issues mentioned in that comment as children. Then, I converted to the standard issue template so it is easier for more people to find information and know what still needs to be done here etc. And finally, I hide all the files.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Thanks @quietone, the only sidequest that still needs committing is the one you mentioned.

    That will reduce the MR here to:

    • 6 code file changes (4 changed, 2 new)
    • 8 test file changes (4 changed, 1 removed, 3 new)
    • 1 core.services.yml change

    Which is actually really small compared to how massively this changes the access layer in core.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    🐛 Call refreshVariables() where needed in various tests RTBC got committed, merged it in.

    The MR is a lot easier to review now than it was a few weeks ago.

  • Pipeline finished with Failed
    9 months ago
    Total: 100s
    #104191
  • Pipeline finished with Success
    9 months ago
    Total: 472s
    #104192
  • Pipeline finished with Failed
    9 months ago
    Total: 490s
    #116549
  • Status changed to Needs work 8 months ago
  • 🇺🇸United States smustgrave

    This looks good and hopefully can get into 10.3

    Left a few comments, all nitpicky stuff but will keep an eye out for this

  • Pipeline finished with Failed
    8 months ago
    Total: 130s
    #126070
  • Pipeline finished with Failed
    8 months ago
    Total: 174s
    #126071
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Thanks so much for the reviews and changes @smustgrave and @larowlan.

    I've gone over the threads and have the following left:

    • Ping @catch regarding the method change on the permissions hash generator: DONE
    • Check the 2 test cases that don't test much: Left a comment, you can decide what we do here.
    • Write a kernel test for user roles access policy: TO DO, will do so next week
  • Pipeline finished with Success
    8 months ago
    #126076
  • Pipeline finished with Failed
    8 months ago
    Total: 127s
    #128447
  • Pipeline finished with Failed
    8 months ago
    Total: 222s
    #128450
  • Pipeline finished with Success
    8 months ago
    Total: 655s
    #128467
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Added change record here: https://www.drupal.org/node/3435842
    All that remains now is the extra kernel test, on it now.

  • Pipeline finished with Success
    8 months ago
    Total: 527s
    #128486
  • Pipeline finished with Success
    8 months ago
    Total: 496s
    #128502
  • Status changed to Needs review 8 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    All threads addressed, tests still go green.

  • Status changed to RTBC 8 months ago
  • 🇺🇸United States smustgrave

    All feedback appears to be addressed. Did see the slack thread and didn't seem like the tests were a stopper for 10.3

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    That's awesome, thanks for the review 🎉

  • Status changed to Needs review 8 months ago
  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    I announced my intention to commit this to other committers and was taking one last pass when I realised that there were two tests in RenderCacheTest, one for user.permissions (which is redundant now) but another for user.roles.

    I've reinstated that tests with adjustments for the new code here. I also merged 11.x. Can you review and confirm that my changes to reinstate that test make sense. Fine to self RTBC if so.

  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    Issue credits

  • Status changed to RTBC 8 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    From #60:

    Still in favor of deleting RenderCacheTest as a whole because the user.permissions test case is no longer valid and the only other test case (user.roles) will soon become invalid too after a recent discussion we had where said cache context should not be returning a magic string for UID 1. Issue is pending for that one.

    The reason I mentioned it would be removed soon is because, and I can now say this, the user roles cache context has a security issue. It was, however, deemed minor enough that a public follow-up could be created. In essence, checking for user 1 in the roles cache context is a relic that has no right to be there. Because this issue fixes the same thing for the user.permissions cache context, I thought outright deleting RenderCacheTest would be fine.

    But you're right that this is perhaps not the place and we should delete the rest of the test in that soon-to-be-created public follow-up.

    Your test changes look good to me and as discussed on Slack that warrants setting it back to RTBC. Thanks for all the reviews!

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    For posterity, this is the public security issue I was talking about: 🐛 UserRolesCacheContext can lead to poisoned cache returns for user 1 Active

    • larowlan committed 9bb46296 on 10.3.x
      Issue #3376846 by kristiaanvandeneynde, larowlan, smustgrave, catch:...
    • larowlan committed 8a2a2fe2 on 11.x
      Issue #3376846 by kristiaanvandeneynde, larowlan, smustgrave, catch:...
  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    Committed to 11.x and backported to 10.3.x🎉

    Published the change records.

    Congrats @kristiaanvandeneynde on a mammoth effort, its been a pleasure working with you and I look forward to further collaboration in your new role as User subsystem maintainer.

  • Status changed to Fixed 8 months ago
  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    Added release notes snippet, please review

  • 🇭🇺Hungary Balu Ertl Budapest 🇪🇺

    Woo-hoo! 🎉🎊

    Thank you @kristiaanvandeneynde, @larowlan and all contributors for their hard work on this era-shifting feature.

  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    This is awesome 🚀

    Thanks for the many reviews to everyone involved!

    I've update the release note snippet to indicate user roles are also an access policy.

  • Automatically closed - issue fixed for 2 weeks with no activity.

  • Issue was unassigned.
  • 🇺🇸United States DamienMcKenna NH, USA
  • 🇮🇳India nitesh624 Ranchi, India

    This changes caused out Functional test to failed
    More details on https://www.drupal.org/project/drupal/issues/3463722 🐛 ConfigFactoryOverrideInterface::loadOverrides not invoked after content type creation in Drupal 10.3.0+ Active

  • 🇷🇴Romania claudiu.cristea Arad 🇷🇴

    Created 📌 What happens with user.is_super_user cache context? Active as a followup to decide the fate of user.is_super_user cache context

  • 🇺🇸United States danflanagan8 St. Louis, US

    This broke a site where we extend PermissionsHashGenerator.

    When calling parent::__construct($private_key, $cache, $static, $entity_type_manager); it goes WSOD because the PermissionsHashGenerator constructor got changed to accept three arguments instead of four. Beyond that, the `$cache` property is removed from PermissionsHashGenerator. Isn't that pretty big BC no-no kind of stuff?

    I see there was a conversation about BC breaks (https://git.drupalcode.org/project/drupal/-/merge_requests/4494#note_284734) in contrib in at least on context, but I don't see any conversation around several other potential BC problems.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    The changes followed the BC dance:

    1. We used to have __construct(A $a, B $b, C $c, ?D $d = NULL)
    2. Now we have __construct(A $a, B $b, C|E $e)
    3. , where extra code runs in case someone is still providing ABC or ABCD

    Obviously in D11, the BC layer has been removed, but in D10.3, you should be fine.

  • 🇺🇸United States danflanagan8 St. Louis, US

    Thanks, @kristiaanvandeneynde.

    Am I really allowed to pass four arguments to a constructor that accepts only 3 arguments? We were passing the EntityTypeManager as the fourth argument rather than relying on the already existing BC layer for that fourth argument (lots of work on this class lately!).

    I'm still curious about removing a property from the class:

    Beyond that, the `$cache` property is removed from PermissionsHashGenerator. Isn't that pretty big BC no-no kind of stuff?

    My extension of PermissionsHashGenerator was using that property and therefore completely broke.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    In php, you can pass as many arguments to a function even if it has only a few parameters. This is where some of this func_get_args() voodoo comes from in functions that accept any amount of arguments.

    As to the $cache property, that's a tough one. Technically it's a BC break and at the very least we should have added a second trigger_error with a "there is no replacement" type of message for $cache like we did in doGenerate().

    We could have kept assigning a cache backend to the $cache property, but then the question arises: "What's the point?" The actual class you were extending no longer makes use of it, so does it make sense for the extending class to still make calls to it?

    Either way, this seems like an oversight and I apologize for the inconvenience. Personally speaking, I think this shows why we should have more classes in core marked as internal and/or final. One could argue the permission hash generator should not be considered API, as it's only consumed by one cache context and sent to the frontend for external caches.

    I'd rather make it internal then and if people really want to change the outcome, encourage them to decorate it instead. This would retain our freedom to completely overhaul the internal hashing logic (like we did here), without risking people ending up in a situation like yours.

  • 🇺🇸United States danflanagan8 St. Louis, US

    I think you nailed it, @kristiaanvandeneynde:

    I think this shows why we should have more classes in core marked as internal and/or final.

    Agreed! I was kind of surprised wasn't internal.

    Thanks for your comments to me and your work on the issue. Cheers!

Production build 0.71.5 2024