Make igbinary the default serializer if available: it saves 50% of unserialize time and memory footprint

Created on 17 November 2018, almost 7 years ago
Updated 5 February 2024, over 1 year ago

Problem/Motivation

Drupal 8 core's philosophy has always been to improve performance automatically _WITHOUT_ configuration.

Making use of igbinary gives us performance for free (similar to how APCu transparently improves performance if detected).

Proposed resolution

  1. Add an igbinary serialization component (can copy from the igbinary module)
  2. Use a factory for the serialization.phpserialize service
  3. If igbinary is present, use it; otherwise use the standard component

Because of concerns that igbinary could be switched on/off on the fly and hence lead to invalid data, Plan B is (rough sketch below the list):

  1. Detect igbinary during the container build
  2. Add a DefaultPhpSerialize class, which extends Drupal\Component\Serialization\PhpSerialize
  3. Replace the serialization.phpserialize service with the new class (only if it is still the default class, so it can still be overridden)
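
A rough, purely illustrative sketch of how the Plan B service swap could look. The IgbinaryServiceProvider name and the wiring are assumptions for this example, not actual core code:

  // Sketch of step 3: during the container build, swap the class of the
  // serialization.phpserialize service, but only if it is still the default.
  namespace Drupal\Core;

  use Drupal\Component\Serialization\PhpSerialize;
  use Drupal\Core\DependencyInjection\ContainerBuilder;
  use Drupal\Core\DependencyInjection\ServiceProviderBase;

  class IgbinaryServiceProvider extends ServiceProviderBase {

    public function alter(ContainerBuilder $container) {
      // Step 1: detect igbinary while the container is being built.
      if (!function_exists('igbinary_serialize')) {
        return;
      }
      $definition = $container->getDefinition('serialization.phpserialize');
      // Step 3: only swap the class when nobody has overridden the service.
      if ($definition->getClass() === PhpSerialize::class) {
        $definition->setClass('Drupal\Component\Serialization\DefaultPhpSerialize');
      }
    }

  }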

Remaining tasks

- Create patch
- Review
- Commit

User interface changes

- None

API changes

- None

Data model changes

- None

📌 Task

Status: Active

Version: 11.0 🔥

Component: Cache

Last updated: 5 days ago

Created by: 🇩🇪Germany Fabianx

Comments & Activities

  • 🇬🇧United Kingdom catch

    Make serializer customizable for Cache\DatabaseBackend RTBC just landed, so we're unblocked here.

    However, I'm still stuck on how we'll deal with the issues from #2/#4. Could we maybe put this in $settings and write it out in the installer? That way, if you install on an environment with igbinary enabled, you would get that serializer, and it's up to you to then make sure that other environments your site gets migrated to also have igbinary available (which is not unlike a lot of other issues with php extensions). But then existing sites would not have things changed under their feet.
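
    Purely illustrative (the setting name doesn't exist anywhere), but the installer-written flag could look like this in settings.php:

      // Hypothetical setting the installer would write out when igbinary is
      // detected; 'cache_serializer' is a made-up name for this example.
      $settings['cache_serializer'] = extension_loaded('igbinary') ? 'igbinary' : 'phpserialize';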

    But also, do we want to provide some kind of way to migrate?

    Or instead of a settings flag, should we just bring igbinary module into core - but prevent installing it/uninstalling it on existing sites at least until a migration path is worked out?

    Should we have a fallback serializer that can read PHP string serialization but only writes igbinary?

  • 🇫🇷France andypost

    If the module is enabled, then all caches should be cleared when it gets installed, or the dump will contain data serialized with it.

    So instead of a container param it could be a call to moduleExists().
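
    i.e. roughly (the igbinary.serializer service name is made up for illustration):

      // Sketch of the moduleExists() alternative instead of a container parameter.
      if (\Drupal::moduleHandler()->moduleExists('igbinary')) {
        $serializer = \Drupal::service('igbinary.serializer');
      }
      else {
        $serializer = \Drupal::service('serialization.phpserialize');
      }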

  • 🇫🇷France andypost

    BTW, when the cache is stored in APCu it can use igbinary without core support: https://www.php.net/manual/en/apcu.configuration.php#ini.apcu.serializer

    So the new question is how to deal with this option when the cache is in chained-fast...
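
    For reference, that is a php.ini directive (assuming the directive keeps the legacy apc. prefix, as the manual page above suggests):

      ; php.ini: let APCu serialize cached values with igbinary instead of serialize()
      apc.serializer = igbinary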

  • 🇨🇭Switzerland berdir Switzerland

    Looking at the igbinary module, it contains a check for whether the returned value is igbinary: https://git.drupalcode.org/project/igbinary/-/blob/2.0.x/src/Component/S...

    It won't work for other serializer implementations, but what if we introduce a new IgBinaryIfAvailableSerializer that on encode does a function exists check, and on decode checks whether the returned value is igbinary-encoded and the function exists. The problem is if it's igbinary-encoded and the function doesn't exist. We might need to extend the interface with an isValid() method or something, or allow it to throw an exception.
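
    Roughly what I have in mind, as a sketch only (the class name and the header check are assumptions here, not the contrib module's code):

      namespace Drupal\Component\Serialization;

      // Sketch: falls back to PhpSerialize when the extension is missing.
      class IgBinaryIfAvailableSerializer extends PhpSerialize {

        public static function encode($data) {
          return function_exists('igbinary_serialize')
            ? igbinary_serialize($data)
            : parent::encode($data);
        }

        public static function decode($raw) {
          // igbinary output starts with a 4-byte version header (currently
          // "\x00\x00\x00\x02"); native serialize() output never starts with NUL.
          if (is_string($raw) && strncmp($raw, "\x00\x00\x00", 3) === 0) {
            if (!function_exists('igbinary_unserialize')) {
              // The problem case described above: data is igbinary-encoded but
              // the extension is gone. Throw, or expose it via an isValid() method.
              throw new \RuntimeException('igbinary-encoded data but ext/igbinary is not loaded.');
            }
            return igbinary_unserialize($raw);
          }
          return parent::decode($raw);
        }

      }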

  • 🇨🇭Switzerland berdir Switzerland

    I wanted to review the possible benefits of using igbinary based on real world examples, including gz compression (the redis module offers that as an option based on cache data size, and the igbinary module does too as a separate serializer, but then always).

    Based on an umami demo install, I picked a few example cache entries (views_data, entity types, module list and some small ones) and compared serialize, serialize with compression level 1, igbinary, and igbinary with gz compression levels 1, 6 and 9. Both redis and igbinary default to level 1; redis has it as a configurable setting. All of that in terms of size and speed of serialize and unserialize. For speed, I ran each operation 1000 times and reported the total in ms (microtime() * 1000); absolute numbers aren't meant to be meaningful, just a baseline for relative speed. Doing it 1000x seemed useful to even out random variation; reported times seem to vary +/- 10% (views_data:en unserialize was between ~90 and ~110). Note that compression numbers always *include* the respective serialize/unserialize call.

    The script I used is attached. Results probably vary quite a bit between different systems, and I directly accessed the cache entries, so it relies on having warm caches. Use select cid, length(data) as length from cache_default order by length asc; to get a list of available cache entries and their sizes.

    compact strings on: https://gist.githubusercontent.com/Berdir/e0bbdbf3922fdc9c8ae905fd80ac2d...
    compact strings off: https://gist.githubusercontent.com/Berdir/c35f0007efba8c50cd896689290758...

    This is on DDEV PHP 8.3, igbinary 3.2.16. WSL2 on Windows 10, i9-11900K @ 3.50GHz.
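
    Not the attached script, but roughly the shape of the timing loop, in case anyone wants to reproduce this ($value is one unserialized cache item loaded from cache_default):

      // Each operation is run 1000x and the total is reported in ms;
      // compression timing includes the serialize call, as described above.
      $runs = 1000;

      $start = microtime(TRUE);
      for ($i = 0; $i < $runs; $i++) {
        $serialized = serialize($value);
      }
      printf("serialize: %.1f ms, %d bytes\n", (microtime(TRUE) - $start) * 1000, strlen($serialized));

      $start = microtime(TRUE);
      for ($i = 0; $i < $runs; $i++) {
        $ig = igbinary_serialize($value);
      }
      printf("igbinary_serialize: %.1f ms, %d bytes\n", (microtime(TRUE) - $start) * 1000, strlen($ig));

      $start = microtime(TRUE);
      for ($i = 0; $i < $runs; $i++) {
        $compressed = gzcompress(igbinary_serialize($value), 1);
      }
      printf("igbinary + gz1: %.1f ms, %d bytes\n", (microtime(TRUE) - $start) * 1000, strlen($compressed));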

    Takeaways:
    * The reduction, especially on large cache entries, is massive. views_data on igbinary is only 18% of serialize. The compact strings setting is doing the heavy lifting here; it's 60% with that turned off. Combined with gz1, it's only 4% of the serialize size (gz1 on serialize is 9%). views data is extremely repetitive. For most other cache entries, it's around 30% with igbinary and 10% combined with gz1.
    * igbinary serialize speed is almost the same, up to 10% slower on most sizes with compact strings. It's up to 30% faster with compact strings off, but the size benefit seems to outweigh this easily; see also unserialize performance next.
    * Compression is fairly expensive.
    * igbinary unserialize is not quite as much faster as the issue title claims, but it seems to be a fairly stable 6x% of unserialize() time on most sizes. It's actually slower with compact strings off, probably because there's a lot more data to work through?
    * Combined with uncompression, igbinary unserialize is about the same as unserialize(). I haven't done any comparisons with how much we save in network/communication with the cache backend, but I would assume that igbinary + uncompression having the same speed as unserialize() is a huge win when we only have to transfer 4-10% of the data from redis and can essentially store up to 10x as much in the cache (maybe 1.5x as much if already using the redis compression setting, considering the existing compression size and the overhead of hashes).
    * I didn't think too much about absolute numbers, but the compression cost increases a lot in relative terms on small cache entries: around 8x (3x for unserialize) for that 1300-length ckeditor cache and 25x (5x for unserialize) for that tiny 300-length locale cache. For comparison, empty/small/redirect render caches start around 250 length. Redis currently documents a length of 100 for the compress flag; I just picked that number fairly randomly and documented that people should do their own tests. I doubt any/many did. Anyway, I think that is clearly too low. Maybe 1000? Or higher? Maybe someone who is better at math than me could calculate the network vs cpu overhead and what value makes sense (a size-gated sketch follows after this list).
    * As expected, higher compression levels only result in minor improvements in size, with a massive cost in serialize speed (6 is 2x as slow as 1), so it doesn't make sense to go higher. unserialize on higher compression levels is actually slightly faster, but not enough to justify it I think.
    * Compression and uncompression are faster on igbinary than on serialize, probably because the input string is already way shorter. For example, views_data unserialize is 87 vs 32 overhead.
    * When I was about to submit, I realized that a mostly-string cache, specifically page cache, could also be interesting. Not for igbinary, because serialize() and igbinary are pretty much identical there, but for compression. So I added that as well (umami frontpage) and reran the script. Compression in relative numbers looks very expensive there, but that's just because serialize of a very long string is very fast. The html too gets reduced to ~18% of the size, at a cost of 35 (compress) and 17 (uncompress). That's pretty consistent in terms of size with other caches, unsurprisingly.
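
    To make the threshold idea above concrete, a minimal sketch of size-gated compression (the function name and the 1000-byte default are made up; the decode side would additionally need a flag or marker to know whether to gunzip, like the redis module uses):

      // Only compress entries above a size threshold; small entries are not
      // worth the CPU cost per the numbers above.
      function cache_encode($data, int $threshold = 1000): string {
        $encoded = igbinary_serialize($data);
        if (strlen($encoded) >= $threshold) {
          // Level 1: cheapest compression, still most of the size win.
          $encoded = gzcompress($encoded, 1);
        }
        return $encoded;
      }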

  • 🇬🇧United Kingdom catch

    6x% of unserialize() time on most sizes

    Should this be 60%?

  • 🇨🇭Switzerland berdir Switzerland

    Yes, I meant to say 60-70% with that x, edited to make that clearer.

  • 🇬🇧United Kingdom catch

    Thanks that makes sense.

    I would assume that igbinary + uncompression having the same speed as unserialize() seems like a huge win when we only have to transfer 4-20% of the data instead from redis and can store much more in the cache

    I think this is very likely to be the case for the database cache backend too.

    Also, when looking at memory issues I often see quite a lot from database queries and unserialize for large cache items. It's possible that uncompression means that cost would just get transferred elsewhere, but we might get lucky and it ends up a net reduction.

    Also I wonder if it's worth looking into gzip compressing tags?

  • 🇨🇭Switzerland berdir Switzerland

    FWIW, looking at blackfire data does show that in some cases, the cost of a gzuncompress is considerable, specifically on page cache hits:

    Not sure, but I would assume that blackfire doesn't add a huge overhead to a function call like that. That's 3.8ms. That's still with plain serialize; about to compare that with igbinary and also without compression.

    I've tried to do some testing with either blackfire or just plain ab on all combinations of serialize/igbinary/compression, but it's tricky, variations are too high between runs to really see a clear pattern. It's also not always that high I think.

  • 🇨🇭Switzerland berdir Switzerland

    As expected, variation is too high for page cache to really compare in terms of speed/IO wait. Without compression the response time was a bit lower, but it also claims to have less IO wait, which obviously doesn't make sense.

    Network is probably the most reliable metric change:
    Network: +547 kB (+419%), 131 kB → 678 kB

    Memory also went up a bit, but only 2%.

    On a dynamic page cache hit, the gzuncompress is at 2% or so, so way less visible and comparing with and without compress gives me:

    Network: -1.37 MB (-86%), 1.58 MB → 215 kB

    That's pretty neat.

    One random but completely unrelated thing that I saw pop up is Drupal\help\HelpTopicTwigLoader, which adds all modules as extension folders and does an is_dir() check on them. That's enough to account for about 5ms in Blackfire and happens even on a dynamic page cache hit, it seems. Will try to create a separate issue for that.

    Last unrelated side note: if I'm seeing this correctly, then all the performance things I've been working on in redis and core (where we've included patches so far) resulted in a reduction of Redis::hgetAll() calls from 54 to 27 in this project, and combined with the switch to igbinary, from 400kb network to 200kb. And I'm working on more, such as route preloading.

  • 🇫🇷France fgm Paris, France

    I think there is one important issue here: is the Redis under test remote from the Drupal instance, or local to it?

    In most - if not all - enterprise setups I see doing audits, the Redis/Valkey servers are always either on the DB instances or completely standalone, not on the web servers (unlike memcached). This means that the impact of bandwidth reduction is much more relevant to overall performance in those setups than it is when benchmarking on a local instance.

    These tests being run on DDEV make me suspect the measurements are for a local instance, however. Maybe it would be useful to try the same on two separate AWS/GCP/Azure instances instead? Or even a container and something like Elasticache for Redis in the same AZ, which will likely not co-locate the Redis on the same instance.

  • 🇬🇧United Kingdom catch

    Adding 🐛 Json and PHP serializers should throw exception on failure Active as related. Didn't review that issue properly yet, but from the summary it looks like we should already be catching and ignoring that exception in the cache backends.

  • 🇨🇭Switzerland berdir Switzerland

    Re #27: Yes, redis being on the same server or not can make a huge difference. FWIW, all data in comment #27 is based on tests on a regular platform.sh project. The question I'm trying to answer/provide data for in #25/#26 is whether core should use compression by default or not.

    We also have a dedicated Gen2 project on platform.sh that uses multiple servers and would be more interesting for the impact/advantage of compression in such a scenario, but I don't have enough insights/blackfire there to be able to do that kind of profiling.

    Either way, our default configuration probably shouldn't be optimized for that scenario at the cost of a less "enterprise" setup. That said, at the other end of the scale you have "classic" webhosting that often also has separate database servers.

  • 🇬🇧United Kingdom catch

    For the database cache there's also the issue that cache tables can end up holding more data than the rest of the database itself. We did https://www.drupal.org/node/2891281 but there was someone in slack actually trying to use that with what sounded like a medium-traffic site and running into lots of problems with the delete queries and similar. So if it's neutral or a very small regression with the database cache, it might be worth it anyway.

  • 🇫🇷France fgm Paris, France

    I think at some point Platform tried to maximize co-locating a project's containers on close instances, from what DamZ told me long ago. I wonder if that is the case in these experiments.

  • 🇦🇷Argentina hanoii 🇦🇷UTC-3

    I recently stumbled upon igbinary and this great thread, and I wanted to share some recent real-life improvements that really surprised me.

    This is a very VERY heavy setup (in terms of entity relationships, site building, lots of Layout Builder and custom stuff) with significant technical debt, so Drupal is doing a lot. For example: 8k block plugins, ~80 MB of RAM just for PHP deserializing this. (The site has a moderate amount of content — not tiny, but not massive either. It used to run as a multisite with 12 sites on a Platform.sh large plan, and the sites frequently stalled and timed out. I recently moved each site to its own medium plan on Platform.sh — 1 GB RAM Redis/Valkey, 1.25 GB RAM MySQL, 256 MB RAM app container with only 2 FPM processes.)

    On Redis I only keep the entity and render cache bins plus the smaller ones. I had to move page_cache and dynamic_page_cache to the DB because Redis was evicting too many keys (before igbinary; I might revisit this after things stabilize). I’m also being very lax about when I clear the page rendering–related bins (they don’t normally get cleared on every deploy).

    The figures below are from php.access.log on Platform.sh, which logs response time, RAM, and CPU. I collected these over a few days — I specifically wanted to see how it behaved on a non-weekend day. This sample is from one of the sites with the most traffic. Even after a fully cleared cache (everything, including an empty Redis service) it performed A LOT better.

    I honestly wasn’t expecting this improvement. Response times improved — less dramatically, since most of the traffic is anonymous and once pages are cached they already respond fast enough (<100 ms). But there’s still a clear shift of the distribution towards the lower end. RAM, on the other hand, shows an even greater improvement. Before igbinary I was seeing cached responses take over 150 MB of RAM. I’m not exactly sure why, but I think it was mostly due to large deserializations. CPU is less conclusive — I don’t think the peak info from the logs is a very meaningful metric. CPU will vary a lot depending on many other factors, especially in a containerized app like this.

    I’m open to any suggestions, questions, or follow-up!

    ============================================
    TABLE 1: Column 6 Analysis (Memory Used)
    ============================================
    Range               2025-08-13  2025-08-14  2025-08-15  2025-08-16  2025-08-17  2025-08-18  
    --------------------------------------------------------------------------------------------
    1-10000kb           3645        3832        3668        7047        11021       12146       
    10001-20000kb       1744        1432        1608        2753        3369        3468        
    20001-30000kb       852         729         795         1711        1884        2110        
    30001-40000kb       1088        1006        1050        694         493         1341        
    40001-50000kb       433         394         457         1572        1669        1310        
    50001-60000kb       719         711         768         508         321         594         
    60001-70000kb       693         633         685         622         458         1928        
    70001-80000kb       434         340         416         3569        2416        2627        
    80001-90000kb       254         260         285         1140        525         336         
    90001-100000kb      660         424         562         63          14          27          
    100001-125000kb     2537        2779        2974        37          4           10          
    125001-150000kb     654         630         590         7           0           1           
    150001-175000kb     317         255         287         1           0           0           
    175001-200000kb     159         119         159         3           1           2           
    200001-225000kb     727         476         528         4           0           0           
    225001-250000kb     1129        972         1156        0           0           0           
    250001-275000kb     1           0           0           0           0           0           
    275001-300000kb     0           1           0           0           0           0           
    --------------------------------------------------------------------------------------------
    TOTAL               16046       14993       15988       19731       22175       25900       
    
    ============================================
    TABLE 2: Column 4 Analysis (Time)
    ============================================
    Range               2025-08-13  2025-08-14  2025-08-15  2025-08-16  2025-08-17  2025-08-18  
    --------------------------------------------------------------------------------------------
    1-500ms             11310       10546       11097       15079       19190       22563       
    501-1000ms          550         513         571         1545        1059        2371        
    1001-1500ms         1244        1779        2324        2781        1827        797         
    1501-2000ms         523         276         225         130         42          56          
    2001-2500ms         227         171         55          53          14          27          
    2501-3000ms         203         129         39          44          10          28          
    3001-3500ms         118         103         63          25          5           14          
    3501-4000ms         247         249         434         20          5           9           
    4001-4500ms         460         663         932         10          4           4           
    4501-5000ms         347         144         133         8           3           2           
    5001-6000ms         336         101         48          8           8           14          
    6001-7000ms         149         91          13          2           3           12          
    7001-8000ms         107         74          10          4           2           1           
    8001-9000ms         112         68          10          1           0           1           
    9001-10000ms        58          28          6           1           0           0           
    10001-15000ms       47          36          19          7           2           1           
    15001-20000ms       3           4           5           4           1           0           
    20001-25000ms       2           1           2           2           0           0           
    25001-30000ms       0           2           0           2           0           0           
    30001-35000ms       1           0           1           1           0           0           
    35001-40000ms       0           0           1           0           0           0           
    40001-45000ms       1           1           0           1           0           0           
    45001-50000ms       1           1           0           1           0           0           
    50001-55000ms       0           1           0           0           0           0           
    70001-75000ms       0           1           0           0           0           0           
    75001-80000ms       0           5           0           0           0           0           
    80001-85000ms       0           5           0           0           0           0           
    85001-90000ms       0           1           0           2           0           0           
    --------------------------------------------------------------------------------------------
    TOTAL               16046       14993       15988       19731       22175       25900       
    
    ============================================
    TABLE 3: Column 8 Analysis (CPU%)
    ============================================
    Range               2025-08-13  2025-08-14  2025-08-15  2025-08-16  2025-08-17  2025-08-18  
    --------------------------------------------------------------------------------------------
    0-25%               1774        1458        1607        1689        2795        2936        
    25-50%              9180        8169        8590        11137       11742       14177       
    50-75%              4712        4911        5266        5386        7474        8543        
    75-100%             380         455         525         1519        164         244         
    --------------------------------------------------------------------------------------------
    TOTAL               16046       14993       15988       19731       22175       25900     