- Status changed to Needs review
about 1 year ago 5:06pm 14 November 2023 - π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
Per @alexpott at https://drupal.slack.com/archives/C2THUBAVA/p1699953782042549?thread_ts=...
We should consider deprecating pecl yaml support
As an additional benefit: we'd be able to use
enum
s directly in our simple config & config entities' YAML representations: https://symfony.com/blog/new-in-symfony-6-2-improved-enum-support#enums-...Thoughts?
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
A first example where this would benefit greatly: π Convert 3 LOCALE_TRANSLATION_OVERWRITE_* constants used in the UI to a backed enum Needs work .
- π¬π§United Kingdom alexpott πͺπΊπ
Also this means we can't use enums as arguments in the container. See https://git.drupalcode.org/project/drupal/-/jobs/353909 ...
- π¬π§United Kingdom alexpott πͺπΊπ
π Allow parsing and writing PHP constants and enums in YAML files Needs work is very interesting.
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
This blocks the original/better approach in π Follow-up for #3361534: config validation errors can still occur for contrib modules, disrupting contrib Active and other issues like it.
- π¬π§United Kingdom catch
We originally added support for PECL YAML due to concerns with the performance of Symfony's parser, especially when compared with parsing JSON when that was a possibly config storage format. Symfony is generally parsing YAML on build, not runtime.
This was ten years ago though, but I think we need a comparison of something YAML-heavy like router rebuilding with and without the extension to see how they compare now.
- π«π·France andypost
FYI since this 10 years PECL yaml extension has no changes and the same for libyaml which still has no support for 1.2 https://github.com/yaml/libyaml/issues/20
- π«π·France andypost
Moreover 2 unit tests are broken when yaml extension enabled
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
- https://klm.no-ip.org/blog/2021/10-06-pecls-yaml-way-faster-than-symfony... from ~2 years ago
- https://github.com/koenpunt/yaml-php-benchmark from 8 years ago is outdated, but https://github.com/koenpunt/yaml-php-benchmark/pull/2 from August 2023 updates it.
I ran the latter. This is with:
-
PHP 8.1.25, with PECL YAML 2.2.2 which uses libYAML 0.2.5</li> <li><code>symfony/yaml
6.0.>
Results of the benchmark, which parses 3 YAML files 1,000 times for each parser:
- π«π·France andypost
FYI last night the tarballs for new releases of PHP are published, I will test it today and tomorrow will roll out new images
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
First result for each of the 3 files: ~0.2 seconds for Symfony versus ~0.05 seconds for PECL YAML, for the most complex YAML file. 4 times slower.
OTOH β¦ how often does Drupal parse 1,000 YAML files? π€
Modifying the benchmark to match Drupal's needs
Let's check with a good body of YAML files in Drupal core. Those in the Standard install profile:
$ cd yaml-php-benchmark/yaml $ mv ~/core/core/profiles/standard/config/install/*.yml . $ mv ~/core/core/profiles/standard/config/optional/*.yml . $ cd .. $ ls -al yaml | wc -l 94 # apply attached patch β which changes it from benchmarking individual files to ALL the files in the "yaml" directory
Results
$ php -f ./benchmark.php PECL YAML Total: 3.8904910087585 Average: 0.0038904910087585 Symfony YAML Total: 13.704869031906 Average: 0.013704869031906
Conclusion
- There is barely any variance now, because A) no results are shown for individual files, B) 94 YAML files instead of 3.
- 3.9 vs 13.7 seconds for parsing all Standard config 1,000 times β PECL YAML is 3.5 times faster.
- β¦ but in the real world that'd mean 3.9 milliseconds vs 13.7 milliseconds
This suggests to me that assuming a fairly recent machine (in terms of CPU + SSD) and PHP >=8.1 (the minimum for Drupal 10), the real-world performance difference is negligible.
- π¬π§United Kingdom alexpott πͺπΊπ
@Wim Leers it's not just config... which is not usually parsed during a cold cache... it's services / routing / menu links etc...
- π¬π§United Kingdom catch
Yeah the ones I'm concerned about are these, particularly on more complex sites with 200-400 modules installed.
- container rebuild
- router rebuild
- library definitions
- menu links
- menu tasks
- menu actionsIf you clear caches from the performance page, then you have to go through 100% of these before you get the next paint. iirc router rebuild happens on the POST request after the form submission, and the rest will happen on the subsequent GET request, but they nearly all block everything else and each other. Give or take menus though since they might happen within bigpipe placeholder replacement.
If you clear caches on a high traffic site, then the longer these take, the more chance of stampedes where multiple requests are trying to rebuild the same cache items before any one of them has finished.
If we take a site with 300 modules installed, then assume that every module has routes and services, and 50% of modules have at least one of libraries, menus, links or actions (so 2.5 YAML files per module overall), that is:
300 + 300 + 150 = 750 YAML files, plus a handful of extras for core.services.yml, core.libraries.yml etc.
For 400 modules 400 + 400 + 200 = 1000.
For 100 modules: 100 + 100 + 50 = 250.
Then if 94 YAML files leads to a 10 millisecond performance regression, we could assume that the reasonable worst case of 1000 YAML files leads to a ~100ms performance regression - this is enough to compound a stampede situation or to create noticeable drag on UI operations like clearing caches or installing new modules.
However even 100ms is not that bad compared to issues like π block_content block derivatives do not scale to thousands of block_content entities Needs work which also kick in during the same situations.
Autowiring should eventually mitigate some of container rebuilding (or at least shift it away from YAML), since there'll be less YAML to parse. The container is the least shiftable/most blocking thing, so that is good.
Routes as attributes will also shift some things away from YAML and that's probably where the bulk of the YAML verbosity is after services.
π Add a cache prewarm API and use it to distribute cache rebuids after cache clears / during stampedes Needs work would mitigate this a bit too.
So it still might be fine but don't think it's negligible.
- π¬π§United Kingdom alexpott πͺπΊπ
Also the reason this got pushed on again was enums... but it turns out we have a solution for PECL yaml for that ...so shrug and carry on might be okay here.
- π«π·France andypost
As yaml extension now added to 8.1-8.3 images let's see how it affect pipelines
- Status changed to Postponed
about 1 year ago 2:30pm 30 November 2023 - πΊπΈUnited States smustgrave
Is this actually postponed on pipeline rtesting? Didn't see an MR or patch to review.
Not sure the right status.
- π«π·France andypost
As there's no regressions (images using yaml extension last week) the next step is π Add hook_requirements() for PECL YAML parser Needs work
- π«π·France Mars0test
Hi,
We can consider that many complex sites use CDNs or caching services to enhance performance. Without them, maintaining high performance for a complex Drupal website can be challenging.
Another question: Is it normal to perform a cache-clear on a production environment? Also, is YAML parsing the most significant issue when it occurs? ^^"
"Maybe we can try to favor the Symfony YAML parser when it comes to parsing the services.yaml file and improve our management of service declarations, which is not entirely equivalent to what we can do in Symfony 6.
- Status changed to Needs work
11 months ago 3:21pm 15 January 2024 - π¬π§United Kingdom alexpott πͺπΊπ
I think given π Support Yaml::PARSE_CUSTOM_TAGS in \Drupal\Component\Serial::decodeization\YamlSymfony:: Needs work we should look at this again. I think all the important parts of YAML parsing use a FileCache. This means that if the regular cache is not available and the APC cache is primed with these files then YAML parsing does not happening. This means that it would be possible to warm these caches prior to a cache clear on production as FileCaches are not cleared during a cache clear.
I've checked and the following stuff uses FileCache.
- Service yaml parsing
- Route yaml parsing
- Info.yaml parsing
- Anything that uses \Drupal\Component\Discovery\YamlDiscovery (so plugins like permissions, actions etc..)
Note that config does not use FileCache - but this is okay because config is not read from files anymore during runtime - it is read for the config table.
Therefore I think the use-case to support PECL Yaml is tenuous and the complexity is not worth it. Plus it makes adoption of new features from Symfony Yaml (which is under more active development) harder.
- π«π·France andypost
Absence of file cache for config is separate issue, which is annoying only installing from config
- π¬π§United Kingdom alexpott πͺπΊπ
In order to not read Yaml during multiple cache rebuild from the UI with APCu installed and using standard we need to land both:
π InfoParser should use FileCache Active
π \Drupal\Core\Asset\LibraryDiscoveryParser should use FileCache to cache the parsed yaml ActiveWrt to π Drupal\Core\Config\FileStorage::getAllCollectionNamesHelper should use some caching Needs work - I think it is wrong to use file cache for something that will only ever be read once.
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
@alexpott Very interesting. Reviewed both of those issues π
- π¬π§United Kingdom catch
Given we've run into issues more than a couple of times, I think we should go ahead here once we're confident that filecache is being used in all the places it should be.
- π¨π¦Canada Charlie ChX Negyesi πCanada
I really, really, really do not want to get involved in decision making but it seems this issue focuses on build time decoding.
However,
SerializationInterface::encode
can and is used run time. Dropping support for the vastly faster PECL emitter here is nothing short of disastrous. Please keep it. For my use case, keeping only YamlPecl::encode and throwing a not supported in YamlPecl::decode is a perfectly valid solution, cutting the maintenance burn down to practically nothing while keeping run time performance high. Thanks for your consideration. - π¬π§United Kingdom alexpott πͺπΊπ
@Ghost of Drupal Past FWIW \Drupal\Component\Serialization\Yaml::encode enforces all Yaml encoding via Symfony so if you are using either \Drupal\Component\Serialization\Yaml or \Drupal\Core\Serialization\Yaml then encoding is via the slower Symfony already. And if you want to use PECL for encoding you'd have to write your own serializer already...
- π«π·France andypost
Serialisation is weakly used even for igbinary it needs tons of hacks ala #2836091: Swap serialization in DatabaseBackendFactory β
Having parts overridable out of box is good to have, speed of parsing by pecl still beats all usetland implementations, so sensible places can use SF implementation (config export) but the rest can have ability to choose
- π¬π§United Kingdom alexpott πͺπΊπ
@andypost the drivers behind this are the we have needs for Symfony features on decode elsewhere see π Support Yaml::PARSE_CUSTOM_TAGS in \Drupal\Component\Serial::decodeization\YamlSymfony:: Needs work for example. I think that once we have file cache for all the Yaml decoding in core there is no reason to automatically swap to PECL YAML for decoding (if it is available) and if we don't do that then we can consider deprecating the YamlPecl class because we are not using it in core. The deprecation can be a follow-up issue we people can argue whether or not we should keep it.
- Status changed to Needs review
11 months ago 10:44am 18 January 2024 - π¬π§United Kingdom alexpott πͺπΊπ
Pushed an MR that disable the automatic switching to YamlPecl. FWIW we already know that this is not going to result in a massive slow down of test because of @Wim Leers excellent research above AND the fact that PECL YAML was not installed on PHP 8 for ages and when it was we didn't see a massive speed up.
Doing this issue will also allow us to close :
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Left a couple of comments.
- π¬π§United Kingdom alexpott πͺπΊπ
Thanks for the review @kim.pepper
- Status changed to Needs work
11 months ago 1:12am 19 January 2024 The Needs Review Queue Bot β tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide β to find step-by-step guides for working with issues.
- Status changed to Needs review
11 months ago 1:38am 19 January 2024 - π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
Found only nits, so keeping at . I think this kind of low-level change with broad impact should be reviewed by multiple people, so not yet RTBC'ing.
- Status changed to Needs work
11 months ago 12:34pm 19 January 2024 The Needs Review Queue Bot β tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide β to find step-by-step guides for working with issues.
- π¬π§United Kingdom longwave UK
MR needs rebasing, has some unrelated commits in it.
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Updated the IS with the decisions made in the comments.
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Updated the CR
- Status changed to Needs review
11 months ago 8:57am 22 January 2024 - π¬π§United Kingdom alexpott πͺπΊπ
Addressed @kim.pepper's review - thanks for all the issue management @kim.pepper.
- π«π·France andypost
@alexpott looks like deprecation with class_alias is shorter then common
- π«π·France andypost
As pipeline showing
\Drupal\Core\Serialization\Yaml
still has some usage in core, so maybe needs follow-up to deprecate it too- https://git.drupalcode.org/issue/drupal-3205480/-/pipelines/80716
- https://git.drupalcode.org/issue/drupal-3205480/-/pipelines/80720 - πΊπΈUnited States smustgrave
There a recommended way for testing this one?
- π¬π§United Kingdom alexpott πͺπΊπ
@smustgrave not really - Yaml reading is such an integral part of core that this code is called 1000s of time per test run. It's tested - the thing to check is the deprecations and does everything make sense.
- Status changed to RTBC
10 months ago 2:52pm 13 February 2024 - πΊπΈUnited States smustgrave
Based on that nothing appears to have broken as tests are all green
Verified the deprecation has a working/correct link to CR 3415489
First time seeing an entire class deprecated but makes sense.
I see test coverage for the deprecations too.
LGTM!
- Status changed to Needs review
10 months ago 4:44pm 13 February 2024 - π¬π§United Kingdom catch
#3205480-32: Drop PECL YAML library support in favor of only Symfony YAML β still needs an answer for me - are there any places we aren't using filecache but should be? I think @alexpott mentioned config schema in slack.
- π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
Agreed with @catch, but the concrete example of config schema discovery is already okay, I think?
\Drupal\Core\Config\Schema\ConfigSchemaDiscovery
usesconfig.storage.schema
which is\Drupal\Core\Config\ExtensionInstallStorage
\Drupal\Core\Config\ExtensionInstallStorage
extends\Drupal\Core\Config\InstallStorage
which extends\Drupal\Core\Config\FileStorage
\Drupal\Core\Config\FileStorage
has been usingFileCache
since #2473179: Use FileCache for config storage β
There's many layers here, so I'd like someone to double-check π
- π¬π§United Kingdom alexpott πͺπΊπ
Re #60 the file cache usage in
\Drupal\Core\Config\FileStorage
is only in memory storage and not APCu cache. So if we read the schema twice in a request then we'd benefit - but not after a cache clear.I think this is okay as we don't actually read schema during a cache clear or regular use of your site. But I also think we should open an issue to add schema to the APCu cache.
- π¬π§United Kingdom alexpott πͺπΊπ
Opened π Use APCu file cache for config schema Active
- Status changed to RTBC
10 months ago 10:08am 15 February 2024 - π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
AFAICT that means we can RTBC this.
I re-reviewed the entire MR. I have only one nit: the use of
assertEquals()
where we should be able to useassertSame()
. Not commit-blocking though. - Status changed to Needs work
9 months ago 11:18pm 2 March 2024 - Status changed to RTBC
9 months ago 3:24pm 3 March 2024 - π¬π§United Kingdom alexpott πͺπΊπ
Addressed @longwave's feedback as it was minor nits putting back to rtbc.
- π¬π§United Kingdom catch
This looks good now and unblocks some other issues.
Committed/pushed to 11.x and cherry-picked to 10.3.x, thanks!
- Status changed to Fixed
9 months ago 5:50pm 3 March 2024 - π§πͺBelgium wim leers Ghent π§πͺπͺπΊ
This unblocked π Allow parsing and writing PHP constants and enums in YAML files Needs work ! π
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Published the CR
- πΊπΈUnited States neclimdul Houston, TX
oh... sucks we don't have a standards compliant YAML parser anymore.
Automatically closed - issue fixed for 2 weeks with no activity.
- π¬π§United Kingdom rsaddington
Howdy,
We've been working with a customer who is experiencing issues with the new (and slower) Symfony YAML parser.
When importing large config files they are seeing things 4x slower in Drupal 10.3 vs. 10.2
Is there a way we can continue to make the old php::yaml_parser available in Drupal 10.3?
Perhaps as a config option or as a settings directive?