- Issue created by @eelkeblok
- 🇳🇱 Netherlands eelkeblok
Pushed some work in progress, not functional ATM.
- Status changed to Needs review
10 months ago 10:39am 11 March 2024
- 🇳🇱 Netherlands eelkeblok
This refactors the field scan into a batch process, scanning the fields we're interested in 1000 rows at a time.
I've combined the database queries so all columns are fetched at once. (It does now ask the field processing method whether it would like to scan the ID column as well, but that seems a small price to pay for efficiency, since the query returns quickly because the ID is not a text column.)
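A rough sketch of the idea, in Python rather than the module's PHP, with a hypothetical table and column names (`node`, `nid`, `title`, `body`); the real implementation uses Drupal's database and Batch APIs, but the shape is the same: one query per chunk that fetches the ID plus all text columns together, paging by ID 1000 rows at a time.

```python
# Hypothetical sketch of the chunked scan: instead of one query per
# column, fetch all text columns (plus the ID) for a slice of rows in
# a single query, CHUNK rows at a time.
import sqlite3

CHUNK = 1000  # the patch processes 1000 rows per batch step

def scan_chunks(conn, table, id_col, text_cols, needle):
    """Yield (id, column) pairs whose value contains `needle`."""
    cols = ", ".join([id_col] + text_cols)
    last_id = 0
    while True:
        # Keyset pagination on the ID column avoids OFFSET's linear cost.
        rows = conn.execute(
            f"SELECT {cols} FROM {table} WHERE {id_col} > ? "
            f"ORDER BY {id_col} LIMIT ?",
            (last_id, CHUNK),
        ).fetchall()
        if not rows:
            break
        for row in rows:
            last_id = row[0]
            for col, value in zip(text_cols, row[1:]):
                if value and needle in value:
                    yield row[0], col

# Demo with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany("INSERT INTO node VALUES (?, ?, ?)", [
    (1, "hello", "plain text"),
    (2, "dangerous <script>", "safe"),
    (3, "safe", "also <script> here"),
])
hits = list(scan_chunks(conn, "node", "nid", ["title", "body"], "<script>"))
print(hits)  # [(2, 'title'), (3, 'body')]
```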
The progress reporting is a bit wonky: it counts every entity type equally, as well as every field within each entity, and progress is calculated as a simple fraction of those totals. This means an entity without any scannable fields counts as heavily as an entity with many scannable fields. In practice the progress is quite choppy; it can make huge jumps when it hits a bunch of entities that have nothing of interest, and then seem stuck for a while when scanning a text field that holds a lot of data (the percentage with a decimal position we added for the individual scan progress does help there). More accurate would be to determine up front which fields are scannable and how many rows there are to scan, and then keep a grand total of scanned rows. Still, this is a huge improvement on my "site of interest", which has a lot of user-generated content.
- 🇳🇱 Netherlands eelkeblok
BTW, I don't think this is a must-have for 3.0, could easily wait for a 3.1.
-
smustgrave
committed fc6cd236 on 3.0.x
Issue #3422990 by eelkeblok: Batchify and optimize field scan (dangerous...
- 🇺🇸 United States smustgrave
Tested locally and still appears to be functional. Thanks!
- Status changed to Fixed
7 months ago 7:39pm 29 May 2024
Automatically closed - issue fixed for 2 weeks with no activity.