Improve performance of the membership loader

Created on 19 July 2017, over 7 years ago
Updated 18 July 2024, 4 months ago

Problem/Motivation

The membership loader is used in all kinds of places. This means it could get called a lot. For instance, every time a permission is check for a group in Drupal\group\Entity\Group::hasPermission(), the membership load is used to check if the user is member of the group and which roles are added to the membership. Each permission check results in a query. This could add up pretty quick. Especially when using the subgroups patch in #2736233: Port Subgroup (ggroup) to the D8 version .

Proposed resolution

To improve performance I suggest 2 changes:

  • Statically cache the methods
  • Make load() method use the results of loadByUser(). When only checking 1 permission of 1 group, this is slower, but when you need to do a lot of permission check for a lot of groups, reusing improves the performance a lot.I think this trade-off is worth it.

Remaining tasks

Review and discuss..

User interface changes

-

API changes

-

Data model changes

-

📌 Task
Status

Fixed

Version

3.3

Component

Code

Created by

🇳🇱Netherlands seanB Netherlands

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Thinking of scrapping the membership loader altogether, see 📌 Deprecate the membership loader in favor of shared bundle classes Fixed

    We can try to incorporate this into the new static methods from that issue, though. But I'm thinking of using a memory cache so that we can use cache tags rather than manual clears. It's just safer that way as we have dedicated cache tags for when someone gets a new membership for instance.

  • Status changed to Needs work about 1 year ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    I'd like this in the 3.3.0 release. We can try and bake this into the new GroupMembership shared bundle class.

  • 🇳🇱Netherlands seanB Netherlands

    Rerolled the patch for 2.x.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Gonna see if I can fix this in 3.3.x

  • 🇳🇱Netherlands seanB Netherlands

    Found a small issue with my latest patch, this should fix it!

  • Merge request !132Improve performance of the membership loader → (Merged) created by seanB
  • Status changed to Needs review 9 months ago
  • 🇳🇱Netherlands seanB Netherlands
  • Pipeline finished with Success
    9 months ago
    Total: 962s
    #106793
  • Status changed to Needs work 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium
    1. +++ b/src/GroupMembershipLoader.php
      @@ -47,25 +61,63 @@ class GroupMembershipLoader implements GroupMembershipLoaderInterface {
      +    // Prime the cache with all group memberships via ::loadByUser().
      

      People could have a large amount of memberships, every membership incurring an entity load. I think it would be better to load the single membership here and rely on the fact that people use single loads sparingly.

    2. +++ b/src/GroupMembershipLoader.php
      @@ -47,25 +61,63 @@ class GroupMembershipLoader implements GroupMembershipLoaderInterface {
      +    return $this->userMemberships[$account->id()][$cache_id][$group->id()] ?? FALSE;
      

      Don't think this will work as we don't set the third index of this array anywhere. $group->id() is nowhere used in loadByUser()

    Adjusting the above in the MR I'm working on.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Cross-posted :/ Think you just addressed my 2nd point. Will take it from here.

  • Issue was unassigned.
  • Status changed to Needs review 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    This is what I had in mind. Reviews welcome.

    Simplified the cache ID logic, we don't need md5 here IMO. Also swapped the dash out for a dot because role IDs can contain dashes and we could potentially get a cache ID collision if we use dashes.

    I'm not a fan of all the static stuff, but that's what you get with these bundle classes. Also not a fan of calling a cache clear three times, I think we need to switch to cache tags on a memory cache here. That way, we don't need any manual clears at all and it works even if people bypass Group::addMember() ad Group::removeMember().

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    We used to have a filter argument on the storage loader methods. I removed those because it made caching the outcome really hard. Now I'm thinking if we should move all of this logic back to the storage, find a clever way to cache it there and then simply rely on the storage for doing the heavy lifting.

  • Status changed to Needs work 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Hmm, gonna revert the move to the storage handler for a bit and think about this more.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Hmm, issues I see with this are when your conditions become invalid.

    E.g.: We cache all members of a group with certain roles. During the request we change one of these members' roles. Now we're screwed because we will keep returning said member from the cache as if they still had the roles we're querying for.

    If we were to use a proper cache, this might not be a problem because we could invalidate the entry by virtue of cache tags. The only downside is that we only promoted MemoryCache to a fully supported one in D10.3 ( 📌 Fix MemoryCache discovery and DX Needs review ), so any test relying on RefreshVariablesTrait won't work with whatever code relies on the MemoryCache.

    Currently thinking of using MemoryCache nonetheless and adding a @todo to properly declare it once D10.3 is out.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    We could even use a DB cache here and merely cache the GroupRelationship IDs. EntityStorageBase already makes sure we do not load an entity more than we need to, so we could then query our cache and call $storage->loadMultiple() on it. If an entity is saved anywhere, we will get a fresh copy without having to worry about invalidating our own caches.

    This will also keep the memory footprint down because we're only caching IDs rather than full entities. The only thing we have to make sure is that we really get our cache tags right so that we do serve stale results from the DB/static cache.

  • Status changed to Needs review 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Pushed a version which uses a proper cache as described above. This cache stores the IDs and then we still call $storage->loadMultiple() on that, keeping in mind that the storage caches the actual entity objects.

    If this works, it could be a really elegant solution as we wouldn't have to manually clear anything.

  • 🇳🇱Netherlands seanB Netherlands

    Tomorrow is my day off, but I'll run the latest version through our tests to see if we can remove the custom cache resets. Thanks for working on this. The content overview was kind of slow taking over 10 sec for some users. With the patch it went down to like 4/5 sec, so pretty significant improvement.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Tests are still failing, will try to fix that tomorrow

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so this goes green now (will fix the unrelated phpstan issues in a separate issue).

    The beauty if this solution is that it caches the query results across requests, meaning that if your membership information doesn't change, we immediately know what your membership IDs are from a single cache get rather than having to go to the potentially humungous group_relationship_field_data table.

    One thing to keep an eye on is how many cache entries this will generate, we are currently using the default backend as the DB backend, but we might want to assign a custom one like we did for the memory cache.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    @catch hinted that for the single loads I could use a primed cache to reduce the amount of entity queries we're firing when warming the cache. Something that was already in the earlier patches, actually.

    Will see if I can adjust the MR with that change in mind.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay, all tests go green with the recent suggestion added to the MR. I'm gonna add one more comment to document why we choose to prime the cache with loadByUser rather than loadByGroup and call it a day. I'm feeling really optimistic about the impact this MR might have on performance.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Made a few changes:

    • Used a custom DB backend because on large websites this may take up a lot of data, so people can specifically turn off the persistent caching but not the memory caching.
    • Removed the individual cacheability, list cache tags should suffice here
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Right, for me this patch is "ready". So if any of the reporters in this issue could verify and post results, that would be great. :)

  • Pipeline finished with Success
    8 months ago
    Total: 1107s
    #119198
  • Pipeline finished with Success
    8 months ago
    Total: 1161s
    #119199
  • Pipeline finished with Success
    8 months ago
    Total: 1165s
    #119197
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Thinking of targeting the next release against D10.3 instead of D10.2. There will only be a few weeks at most after this release before 10.3 is out so may as well immediately benefit from the new code.

  • Pipeline finished with Success
    6 months ago
    Total: 982s
    #181208
  • Status changed to Fixed 6 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay this looks good to me. I really was hoping for at least one of the people credited here to provide some results, but at one point I just have to accept that it may not happen and move on, hoping for the best.

    We have test coverage, so it really shouldn't break anything. Whether it drastically increased performance is left to be seen. Group has no tests using PerformanceTestTrait yet, so I'm shooting in the dark here. Might want to add one of those tests now that the minimum version is D10.3.

  • Pipeline finished with Skipped
    6 months ago
    #181282
  • Automatically closed - issue fixed for 2 weeks with no activity.

  • 🇨🇭Switzerland berdir Switzerland

    Follow-up: 🐛 Invalid service definition for cache services Needs review

Production build 0.71.5 2024