- πΊπΈUnited States smustgrave
This issue is being reviewed by the kind folks in Slack, #needs-review-queue-initiative. We are working to keep the size of Needs Review queue [2700+ issues] to around 400 (1 month or less), following Review a patch or merge request β as a guide.
@jonathan1055 is correct. If an issue is being worked on in the MR it just adds noise to continue with patches. MRs are also easier to review.
At this time we will need a MR for D10 though addressing the open threads in the current. Recommend starting a new one vs updating the existing.
Also starting in #134 the patch size almost tripled.
- First commit to issue fork.
- @lucassc opened merge request.
- Status changed to Needs review
over 2 years ago 2:24pm 25 January 2023 - π§π·Brazil lucassc Rio de Janeiro
Hi! I opened a MR for 10.1.x, which was not available first from the "Create new branch" link at the top of the issue. So I followed this doc β to add 10.1.x as the basis for my issue branch 2950869-optimizing-revision-queries.
Not sure if this is the correct way, I'm novice.
Please, review.
- π§π·Brazil lucassc Rio de Janeiro
We had comments in the old MR that still need to be discussed.
core/lib/Drupal/Core/Entity/Query/Sql/Query.php:
// Add a self-join to the base revision table if we're querying only the // latest revisions.
@daffie said "This comment needs an update as it no longer reflects what we are doing."
core/tests/Drupal/KernelTests/Core/Entity/EntityQueryTest.php:
$results = $this->queryResults = $this->storage ->getQuery() ->latestRevision() ->notExists("$figures.color") ->accessCheck(TRUE) ->execute(); $expected = [16 => '4', 8 => '8', 20 => '12']; $this->assertSame($expected, $results); // Update an entity. $entity = EntityTestMulRev::load(4); $entity->setNewRevision(); $entity->$figures->color = 'red'; $entity->save(); $results = $this->queryResults = $this->storage ->getQuery() ->latestRevision() ->notExists("$figures.color") ->accessCheck(TRUE) ->execute(); $expected = [8 => '8', 20 => '12']; $this->assertSame($expected, $results);
@daffie said: "We need to do a change that keeps it in the result and therefore we see the changed revision id. The current change removes it from the result."
@neclimdul replied "I'm not sure what this is asking. What result would we be able to see the revision ID in?"
- π¬π§United Kingdom catch
Now that quickedit is no longer in core, this is less likely to cause timeouts during normal site operation - so moving back down to major, but of course we should still fix it.
- Status changed to Needs work
over 2 years ago 5:26pm 26 January 2023 - πΊπΈUnited States smustgrave
Removing reroll as @lucassc you seemed to get it correct.
But there does appear to be a test failure.
- π¬π·Greece magtak
I work for Acquia Support and I came across this issue while debugging a use case where all the revisionable entities (Site Studio layout_canvas entities) were set to be sequentially re-saved (Site Studio rebuild).
Looking at the query,
...LEFT OUTER JOIN "cohesion_layout_revision" "base_table_2" ON "base_table"."id" = "base_table_2"."id" AND "base_table"."revision" < "base_table_2"."revision"...
The LEFT OUTER JOIN and the '<' conditional operator will match rows proportionate to the number of revisions an ID has, this will create new rows in an exponential way.
For the case I was looking at in particular, a 10h+ operation that resulted in 1000+ second SQL queries at times which locked the revisions table and thus blocked any content editing, finished in about 35 minutes with the patch applied with barely any queries taking over 1 second. The items generating those 1000+ second queries were processed instantly.
I checked our fleet's logs and there are multiple sites impacted by this as I can see thousands of slow ( > 1s) queries generated by the code above.
Patch #133 worked for the client I was working on given their Drupal version (9.5.x).
While we will be advising clients to use patches from this thread if they encounter this issue, this fix should be high priority and should be backported to all affected Drupal branches (8, 9, 10).
- π§πͺBelgium Dries
Great to see we are getting closer to a solution.
In addition to improving the queries, it might not be a bad idea to show a warning on
/admin/reports/status
when there are entities with more than$number
revisions. At a minimum, we should inform a site administrator about this.As a next step, I would not be opposed to a new feature in core to prune revision after
$number
days/weeks/months/year/never. We should tackle that in a separate issue though. - π©πͺGermany Anybody Porta Westfalica
Hey @Dries, cool to have your feedback here! :)
+1 on that (while the limit should be configurable in config, at least in settings.php)I guess that should be combined with the work from: #2925532: Provide bulk action to delete all old revisions of an entity β
#53 β there please.So if you create or find a task for that revision prune in core, please link both. It would really make a lot of sense to at least have a full-featured Entity revisioning API in core. So contrib can implement more specific concepts, using a consistent API for all entity types. Like bulk deletion manually (checkboxes in revisions tab and as views bulk operation), by age, by limit, etc. if that's too specific for core.
Thanks :)
- heddn Nicaragua
As FYI, https://www.drupal.org/project/node_revision_delete β has this feature in contrib. The hardest part of removing revisions is when the site has multiple languages installed. The way revisions are stored w/ multiple languages is complicated to say the least. You can't just remove the last 5 revisions of a node. Because you'll loose a revision that is published as an alternative language.
- π¨π¦Canada joel_osc
Also as an FYI we are experiencing performance issues due to revisions and wanted to use the well-written node_revision_delete module, but found that not only was multilingual problematic, things got worse with content moderation and trying to ensure that you don't prune forward revisions. Also, we had a requirement to preserve all historic 'published' revisions in all languages. We have contributed this module: https://www.drupal.org/project/cm_revision_delete β to help meet these requirements. Cheers!
- π¬π·Greece magtak
I second @heddn 's concern about multilingual sites.
While working on this bug I used both- node_revision_delete and
- entity_reference_revisions (*)... to prune both node revisions ( (*) but also "orphaned entities" that I believe do *not* get deleted when the node revision does) and it did remove revisions in a way that wouldn't make sense for the governance of multilingual sites.
- π§πͺBelgium rp7
FWIW, I work on a high traffic Drupal installation & the DB CPU usage skyrocketed to 100% after applying the patch from https://git.drupalcode.org/project/drupal/-/merge_requests/715. It brought the DB (and the site) to a halt eventually. So be aware :-)
This was (a part of) the query causing the high CPU usage (for some reason the AWS RDS UI won't show me the whole query):
SELECT "base_table"."vid" AS "vid", "base_table"."nid" AS "nid" FROM "node_revision" "base_table" INNER JOIN "node" "node" ON "node"."nid" = "base_table"."nid" LEFT JOIN "content_moderation_state_field_data" "cmsfd" ON cmsfd.content_entity_type_id = 'node' AND cmsfd.content_entity_id = base_table.nid AND cmsfd.content_entity_revision_id = base_table.vid WHERE ("base_table"."vid" IN (SELECT MAX(base_table.vid) AS "expression" FROM "node_revision" "base_table" GROUP BY base_table.nid)) AND ("node"
The project has about ~500K node revisions. It's multilingual & uses content moderation.
- First commit to issue fork.
- Merge request !9523Issue #2950869: Entity queries querying the latest revision very slow with lots of revisions β (Closed) created by pavlosdan
- Status changed to Needs review
about 1 year ago 3:30am 18 September 2024 - π¨π¦Canada pavlosdan
Added merge request targeting 11.x and addressed the comments in #147.
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
@rp7 in #159:
So probably there's an incompatibility that I haven't figured out yet. Just wanted to post this to inform people (which are doing custom stuff like above) to be cautious.
Any chance you did figure that out? π€ If you were able to narrow it down to your
hook_query_entity_reference_alter()
implementation, then we should be able to proceed with the current MR (with caution), but if the root cause was the MR, then we'd need to rethink it.Additional information would be very valuable! ππ
- πΊπΈUnited States smustgrave
Issue summary should be using the standard issue template so it's clear for reviewers
There are 3 MRs open which is to be reviewed?
It's really annoying, especially when using Scheduler Module.
So, does anyone have any experience to share? Which MR should be considered?
Any advice , please ? Thanks- πΊπΈUnited States smustgrave
There are multiple MRs up, not super clear which is to be reviewed.
Appears to still need an issue summary update too.
- πͺπΈSpain nicobot Granada
I encountered the same issue and initially created a duplicate here: Issue #3491274 π Improve performance of queries fetching latest node revisions by optimizing self-join logic in Entity\Query\Sql\Query::prepare() Active .
After investigating, I reached the same conclusion regarding updating
\Drupal\Core\Entity\Query\Sql\Query::prepare()
with the following implementation:// Add a self-join to the base revision table if we're querying only the // latest revisions. if ($this->latestRevision && $revision_field) { $this->sqlQuery->where("base_table.$revision_field = ( SELECT MAX($revision_field) FROM $base_table WHERE $id_field = base_table.$id_field )"); }
The merge request (MR) containing my changes is available here: MR #10437.
This change has been applied to a live site and has significantly improved query performance, reducing execution time from nearly 10 seconds to 0.007 seconds when working with nodes that have over 13,000 revisions.
However, I think the implementation in the proposed MR #9523 is cleaner and more maintainable. That said, I am uncertain about the connection between the primary performance issue and the changes in ContentEntityStorageBase.php in MR #9523.
It might be worth exploring whether these changes could be addressed in a separate issue, as their relationship to the main performance problem isn't entirely clear (to me). IMHO this could help avoid blocking progress on resolving the current issue.
Thank you all
- Status changed to Needs work
8 months ago 11:10am 5 February 2025 Has anyone updated the patch for Drupal 10.4.x?
2950869-139.patch π Entity queries querying the latest revision very slow with lots of revisions Needs work fails on 10.4.1.Thanks
- π¬π§United Kingdom alexpott πͺπΊπ
I'm hiding the old 10.x and earlier branches as they are not ever going to land and focussing on the 11.x branch.
- π¬π§United Kingdom alexpott πͺπΊπ
alexpott β changed the visibility of the branch 2950869-optimizing-revision-queries to hidden.
- π¬π§United Kingdom alexpott πͺπΊπ
alexpott β changed the visibility of the branch 10.1.x to hidden.
- π¬π§United Kingdom alexpott πͺπΊπ
alexpott β changed the visibility of the branch 2950869-entity-queries-querying to hidden.
- π¬π§United Kingdom alexpott πͺπΊπ
Given we can't do the optimisation in \Drupal\Core\Entity\ContentEntityStorageBase::getLatestRevisionId() because of workspaces. We can just do the better approach in \Drupal\Core\Entity\Query\Sql\Query::prepare()
Updated the issue summary to reflect this.
- π¬π§United Kingdom alexpott πͺπΊπ
Also given this affects param conversion and is used by workspaces I get the feeling this is actually a critical - we can also see the query in the performance tests.
- π¬π§United Kingdom alexpott πͺπΊπ
Editing a node with 4000 revisions... before the GET on node/N/edit takes 976ms afterwards 226ms... we also have autosave enabled... before the posts to autosave take 730ms and afterwards they take 50ms... YES nearly the entire time on an autosave is spent doing param conversion to get the latest revision ID... lolz...
In my opinion this is RTBC and given the code is @fabianx's and only the test is mine I think I'm in a good place to RTBC this.
- π¬π§United Kingdom alexpott πͺπΊπ
I opened a follow-up for π Entity query in \Drupal\Core\Entity\ContentEntityStorageBase::getLatestRevisionId() should use a sort and limit Active
- π¬π§United Kingdom catch
Opened π Revisit latest revision / ::getActive() param conversion behaviour Active too to revisit the logic that causes this to be called so often (and also causes various other problems).
Committed/pushed to 11.x and cherry-picked to 11.2.x, thanks!
- π¬π§United Kingdom alexpott πͺπΊπ
Going to do a 10.x backport - I think this is worth it.
- π¬π§United Kingdom alexpott πͺπΊπ
I've backported the fix to 10.x by cherry-picking the 11.x commit.
- π¬π§United Kingdom catch
I've gone ahead and backported this to 10.6.x and 10.5.x, was borderline on 10.5.x but there's literally no API change or behaviour change here it's just improving the performance of a very specific database query.
- π³πΏNew Zealand quietone
mondrake reported in Slack that testing is failing on HEAD for PostgreSQL and SQLite.
I git bisected with PostgreSQL and found that this issue caused the problem.
- π¬π§United Kingdom alexpott πͺπΊπ
Opened π Fix EntityQueryTest on SQLite and Postgres Active to fix Postgres and SQLite
- π¬π§United Kingdom alexpott πͺπΊπ
@quietone thanks for the bisect. Went ahead with π Fix EntityQueryTest on SQLite and Postgres Active so we can close this one again.
Automatically closed - issue fixed for 2 weeks with no activity.
- Merge request !13268#2950869: Entity queries querying the latest revision very slow with lots of revisions (10.5.3) β (Open) created by charlliequadros