- Issue created by @berdir
- 🇨🇦Canada adriancid Montreal, Canada
Hi @berdir, thanks for this issue. I don't have much time these days, and I see you're very active in the issue queue. Do you want to become a maintainer?
- 🇨🇭Switzerland berdir Switzerland
I was actually considering that recently, but then saw that you pushed a new release. I did a bit of a review and testing around updating 1.x to 2.x and just wanted to write down my findings.
Feel free to add me, so there's another person around if a release is necessary or so, but I can't really promise much beyond that. I might or might not work on this issue (and others), depending on how much of a problem it is for us; I haven't yet tested where the limits are.
- First commit to issue fork.
- 🇫🇮Finland onnia
Hi,
My merge request does two things. First, the nodeExistsInQueue() check is replaced with an array of all queued node IDs and an isset() check; this cuts the run time from tens of minutes to seconds (rough sketch below). Second, the batch creation processes node IDs in chunks, and the chunk size can be set via the queue_chunk_size config. I also looked into the Drush queue adding with 520000 nodes: time drush node-revision-delete:queue takes 1 min. I still have to commit my fix for the Drush command.
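For illustration, a minimal sketch of that lookup; the queue name, the direct table read and the item payload here are assumptions, not the exact MR code:

```php
// Build a keyed array of node IDs that are already queued, once.
// Assumption: each queue item's data payload is the node ID, and the
// queue name below is only a placeholder.
$nids_in_queue = [];
$result = \Drupal::database()->select('queue', 'q')
  ->fields('q', ['data'])
  ->condition('name', 'node_revision_delete_queue')
  ->execute();
foreach ($result as $record) {
  $nids_in_queue[unserialize($record->data)] = TRUE;
}

// O(1) membership test per node instead of one nodeExistsInQueue() query per node.
$queue = \Drupal::queue('node_revision_delete_queue');
foreach ($candidate_nids as $nid) {
  if (!isset($nids_in_queue[$nid])) {
    $queue->createItem($nid);
    $nids_in_queue[$nid] = TRUE;
  }
}
```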
- 🇨🇭Switzerland berdir Switzerland
What's the memory usage with that amount of content? drush -vvv should report that.
My idea was something like https://git.drupalcode.org/project/tmgmt/-/blob/8.x-1.x/sources/content/..., with a sandbox and a query that progresses through it (sketched below), but that would of course be a lot slower.
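Roughly this pattern, as a sketch only; the callback name, queue name and window size are illustrative, not the module's actual API:

```php
public static function queueNodesBatch(array &$context): void {
  if (!isset($context['sandbox']['current_nid'])) {
    $context['sandbox']['current_nid'] = 0;
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['total'] = \Drupal::entityQuery('node')
      ->accessCheck(FALSE)
      ->count()
      ->execute();
  }

  // Walk the node table in small windows instead of loading every nid at once.
  $nids = \Drupal::entityQuery('node')
    ->accessCheck(FALSE)
    ->condition('nid', $context['sandbox']['current_nid'], '>')
    ->sort('nid')
    ->range(0, 50)
    ->execute();

  $queue = \Drupal::queue('node_revision_delete_queue');
  foreach ($nids as $nid) {
    $queue->createItem($nid);
    $context['sandbox']['current_nid'] = $nid;
    $context['sandbox']['progress']++;
  }

  // Batch API keeps calling this operation until 'finished' reaches 1.
  $context['finished'] = $nids
    ? $context['sandbox']['progress'] / max(1, $context['sandbox']['total'])
    : 1;
}
```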
It's a bit awkward that the batch methods are on an interface; that technically makes this an API break, and my approach would too. Something like that seems like it should be an internal implementation detail, but it's tricky to step back from that now that the module is stable.
- 🇫🇮Finland onnia
I did some memory debugging with this helper: https://blog.riff.org/2016_08_04_how_to_display_time_and_memory_use_for_drush_commands
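For reference, a plain-PHP way to capture similar numbers around the queueing call (just a sketch, not the helper from that post):

```php
$initial = memory_get_usage();

// ... run the queue population here ...

printf(
  "Final: %.2fM (+%.2fM), Peak: %.2fM\n",
  memory_get_usage() / 1048576,
  (memory_get_usage() - $initial) / 1048576,
  memory_get_peak_usage() / 1048576
);
```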
When the 587k node IDs are already in the queue and $nids_in_queue is at its largest:
Initial: malloc = 21.60M, real = 22.25M
Final: malloc = 22.58M (+0.99M), real = 49.25M (+27.00M)
Peak: malloc = 80.03M (+58.43M), real = 96.25M (+74.00M)
Yes, the memory usage peaks. About the API changes: I did not consider the breaking changes when editing the BatchInterface.
- 🇫🇮Finland onnia
If the edits to src/NodeRevisionDeleteBatchInterface.php are an issue, then the MR could be altered to create a custom method for the batch queue which processes $nids_chunk. This new queueChunk($nids_chunk) method could be used when the count of processed nids_to_add gets larger than e.g. 1000? The new method could be added here -> https://git.drupalcode.org/project/node_revision_delete/-/merge_requests... A rough sketch is below.
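A rough sketch of what that could look like; the method name and queue name are illustrative, and this is not part of the module's current API:

```php
/**
 * Queues a chunk of node IDs in one go.
 *
 * Hypothetical helper: called whenever the collected IDs reach the
 * configured chunk size (e.g. queue_chunk_size, 1000 by default here).
 */
public function queueChunk(array $nids_chunk): void {
  $queue = $this->queueFactory->get('node_revision_delete_queue');
  foreach ($nids_chunk as $nid) {
    $queue->createItem($nid);
  }
}
```

On the calling side, the collected IDs would be flushed through queueChunk() and the array reset once count($nids_to_add) reaches the configured limit.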