- Issue created by @larowlan
- 🇺🇸 United States effulgentsia
wim leers → credited effulgentsia →.
- 🇬🇧 United Kingdom longwave UK
wim leers → credited longwave →.
- 🇧🇪 Belgium wim leers Ghent 🇧🇪🇪🇺
Actually, I think that @larowlan intended one hash per component instance, not one hash per explicit input per component instance. That probably makes more sense 😄
Prior art in @effulgentsia's & @longwave's ✨ Add way to "intern" large field item values to reduce database size by 10x to 100x for sites with many entity revisions and/or languages (Active). The difference:
- #3469082 would do it for the entire component tree (i.e. N component instances' explicit inputs)
- this would do it for a single component instance (i.e. 1 component instance's explicit inputs)
It's probably still worth doing, and would tie in nicely with [later phase] When the field type for a PropShape changes, the Content Creator must be able to upgrade (Postponed).
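To make the interning idea concrete, here is a minimal sketch of per-component-instance interning. This is not how XB is implemented today; the table name `xb_component_input_intern` and the function itself are invented for illustration:

```php
<?php

use Drupal\Component\Serialization\Json;

/**
 * Interns one component instance's explicit inputs and returns their hash.
 *
 * Identical input payloads across revisions/languages are stored once in a
 * lookup table; each revision then only carries the 64-character hash.
 */
function intern_component_inputs(array $explicit_inputs): string {
  // Canonicalize top-level key order so it does not affect the hash.
  // (A real implementation would need to canonicalize recursively.)
  ksort($explicit_inputs);
  $payload = Json::encode($explicit_inputs);
  $hash = hash('sha256', $payload);

  // Upsert: the first writer stores the payload; later writers with the
  // same hash effectively do nothing because the value is identical.
  \Drupal::database()->merge('xb_component_input_intern')
    ->key('input_hash', $hash)
    ->fields(['inputs' => $payload])
    ->execute();

  return $hash;
}
```

Reads would then join the revision data to this lookup table on `input_hash`, and garbage-collecting orphaned rows would need its own story.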
- 🇬🇧 United Kingdom catch
While this is interesting to explore, I'm not sure the final implementation should be in XB itself; we have at least two other options:
1. Having this as an option for any longtext/JSON field in SQL storage: the approach would apply to long body fields too (think issue summaries with 300 revisions where the text itself was only updated 5 times).
2. Other approaches to reducing revision table size, like purging: e.g. purge all non-default revisions prior to the previous default revision (somewhat implemented in Workspaces or Workspaces Extra, IIRC). Or purge default revisions with a decay: keep the most recent ten, then purge every other revision, then purge nine out of ten revisions, based on thresholds (see the sketch at the end of this comment). This could be done by putting the entity into a queue when it's saved with a new revision; the queue would then thin out the older revisions. There is probably already a core issue for this, but I can't find it immediately.
A big reason to do #2 is that it's not always only the size of the table on disk that's the problem: once there are hundreds of thousands or millions of rows, things like indexes on revision IDs get huge too, memory requirements increase, writes can slow down, and allRevisions() queries get slower.
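For illustration, a minimal sketch of the decay policy a queue worker could apply; the function name, the band boundaries (10/50), and the thinning ratios are all invented for the example:

```php
<?php

/**
 * Picks revision IDs to purge under a simple decay policy:
 * keep the newest 10, then every other revision, then one in ten.
 */
function revisions_to_purge(array $revision_ids): array {
  // Sort newest first so index 0 is the most recent revision.
  rsort($revision_ids, SORT_NUMERIC);
  $purge = [];
  foreach ($revision_ids as $i => $vid) {
    if ($i < 10) {
      // Keep the 10 most recent revisions untouched.
      continue;
    }
    if ($i < 50) {
      // Middle band: keep every other revision.
      if (($i - 10) % 2 === 1) {
        $purge[] = $vid;
      }
      continue;
    }
    // Oldest band: keep only every 10th revision.
    if (($i - 50) % 10 !== 0) {
      $purge[] = $vid;
    }
  }
  return $purge;
}
```

A queue worker would load the entity's revision IDs, call something like this, and delete the returned revisions via the entity storage handler's deleteRevision().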