- Issue created by @kingdutch
- Merge request !6939Bump PHPStan to level 9 and accept a large baseline → (Open) created by kingdutch
- Status changed to Needs review
10 months ago 12:48pm 6 March 2024 - 🇬🇧United Kingdom longwave UK
I worry here that just jumping straight to level 9 will mean contributors have more work to do when submitting MRs against existing code, sometimes we have to do things for backward compatibility reasons and I think satisfying PHPStan is not always possible without breaking BC? In turn that means having to make decisions whether or not to explicitly add more things to the baseline.
- 🇮🇹Italy mondrake 🇮🇹
-1 (for now)
It will simply make impossible to contribute new code, since you'll have to develop huge workarounds to get sorted all the info which is missing from core codebase at the moment (description of iterable types, narrowing of types, etc etc).
An average contrib module can get to L5 relatively simple, L6+ is more work in satisfying PHPStan than in anything else.
- 🇳🇱Netherlands kingdutch
It will simply make impossible to contribute new code, since you'll have to develop huge workarounds to get sorted all the info which is missing from core codebase at the moment (description of iterable types, narrowing of types, etc etc).
I'd want to respectfully disagree. I definitely think there's a learning curve. We'll have to go through that as a community at some point, but if we do our lives will become easier. Better documentation on how to solve common type issues can help there and I'm willing to help write that if it can help speed up Drupal's move to strict-typing.
One habit that I see with developers is the impulse to add
/** @var */
annotations to solve typing warnings when upstream types are not yet set. That is indeed something that becomes harder to maintain. A much better way of solving that is to useassert
instead. It has the benefit of ensuring that your tests fail if what you're saying is false and PHPStan will let you know when the upstream types have improved and theassert
is no longer needed.If iterable types is truly a problem then we can disable that behaviour specifically with
checkMissingIterableValueType: false
but having the iterable types specified makes yourforeach
loops so much nicer and gives you type completion without having to do extra type annotations or asserts within your loop.An average contrib module can get to L5 relatively simple, L6+ is more work in satisfying PHPStan than in anything else.
As a counter-example, I would consider the Open Social distribution a non-trivial codebase with about 69 top-level modules (and a whole bunch of nested modules). We've been running at level 8 for three years and counting and our code has become significantly higher quality with a measurable reduction in bugs as a result.
- 🇳🇱Netherlands kingdutch
I worry here that just jumping straight to level 9 will mean contributors have more work to do when submitting MRs against existing code [...] In turn that means having to make decisions whether or not to explicitly add more things to the baseline.
I think that's a fair worry, specifically because they'll have to think about types in a way that we don't have to do now though. It's a trade-off of short-term pain for long term gain. The work they'd have to do now would otherwise be work we'd have to do at some point in the future.
The baseline is also very good at ensuring that these kinds of changes can happen locally without having to go and fix the whole file just because you're changing a single line.
sometimes we have to do things for backward compatibility reasons and I think satisfying PHPStan is not always possible without breaking BC?
I'd be curious to see examples of this and whether I can offer ways around it! In general PHPStan allows a lot of ways to add type information without actually changing the functioning of the code (e.g. through
assert
or@annotations
).What might happen is that Drupal becomes better at typing its code which will cause downstream projects to suddenly see PHPStan errors where there were none before. However, I personally don't consider that a breaking change because it's clarifying intention. As long as the PHP code keeps working without errors and doing what it promises then that would still be backwards compatible.
- 🇺🇸United States mglaman WI, USA
If we're worried about size, can't we add the baseline as an ignore in https://git.drupalcode.org/project/drupal/-/blob/11.x/.gitattributes ?
- 🇦🇺Australia alex.skrypnyk Melbourne
I really like this approach to working with the baseline config: there are 486 exclusions in the baseline config, but they are all laid out nicely in a single file. This means that those items can be "sliced and diced" into standalone tickets and tackled by the community. At the same time, all the new code will be coming at the level 9.
- 🇦🇺Australia acbramley
Could we target something slightly lower at first?
My main concern with a very large baseline is not performance, but conflicts in the baseline file - making rerolls tedious. Conflicts in baselines are sometimes hard to know how to resolve especially for non-senior developers. This usually results in having to push follow up fixes to MRs creating more noise/CI minutes/$$$
- 🇦🇺Australia mstrelan
I'm keen for something like this but agree with #10 that level 9 might be a bit ambitious. I'm thinking something more like level 4-6. For example, level 4 would have prevented 🐛 Fix @param docs for $deriver on plugin attribute classes Needs review .
- 🇦🇺Australia dpi Perth, Australia
For me, 6 is always the minimum for projects I'm introducing PHPStan to. I've found it has most value, without getting in the way. Though for new projects, I agree 9/max is fine.
9 seems nuts, especially maintaining a baseline.
When we do end up going there, I'd want us to consider bringing in
phpstan/phpstan-strict-rules
And at least having configs like:
reportUnmatchedIgnoredErrors: true checkMissingIterableValueType: false checkGenericClassInNonGenericObjectType: false
So we can have a manageable baseline.
Stepping stones
- 🇺🇸United States mglaman WI, USA
Via Slack I had mentioned we should use
checkMissingIterableValueType: false
. I'm just bummed it kills analysis for all iterables and not just arrays :/But I guess this is OK because all of the phpdoc enhancements for that live in phpstan-drupal right now. So eventually they can be moved here once the level is bumped and they're actually analyzed.
🤔 Which made me realize that Drupal core is scanning itself and having it's phpdoc overridden by phpstan-drupal for defining iterables
- 🇳🇱Netherlands kingdutch
I'm skipping #8 for now until we know what the baseline will actually look like. At that point we can decide whether it needs to be excluded from generated archives. I think it's a good proposal and an easy addition though.
Replying to #10, #11, and #12, paraphrased:
What about merge conflicts in the baseline? They're tedious
I think there's two things I could bring up to hopefully alleviate this concern.
Firstly, I think the decision to use PHP for the baseline rather than NEON will help git reduce the number of conflicts. The NEON baseline is basically a giant list of YAML which I believe git treats as plaintext which often causes it to not know which array item marker (
-
) belongs to which rule. Example excerpt from Open Social below.parameters: ignoreErrors: - message: "#^Method Drupal\\\\activity_basics\\\\Plugin\\\\ActivityAction\\\\CreateActivityAction\\:\\:create\\(\\) has no return type specified\\.$#" count: 1 path: modules/custom/activity_basics/src/Plugin/ActivityAction/CreateActivityAction.php - message: "#^Method Drupal\\\\activity_basics\\\\Plugin\\\\ActivityAction\\\\CreateActivityAction\\:\\:create\\(\\) has parameter \\$entity with no type specified\\.$#" count: 1 path: modules/custom/activity_basics/src/Plugin/ActivityAction/CreateActivityAction.php - message: "#^\\\\Drupal calls should be avoided in classes, use dependency injection instead$#" count: 1 path: modules/custom/activity_basics/src/Plugin/ActivityAction/CreateActivityAction.php
For PHP there are more clearly defined boundaries because the start and end of a rule are distinctly delineated by different tokens (the PHP open and close array syntax). An excerpt from the new PHP baseline from Drupal Core:
<?php declare(strict_types = 1); $ignoreErrors = []; $ignoreErrors[] = [ 'message' => '#^Call to function method_exists\\(\\) with \'Composer\\\\\\\\Composer\' and \'getVersion\' will always evaluate to true\\.$#', 'count' => 1, 'path' => __DIR__ . '/../composer/Composer.php', ]; $ignoreErrors[] = [ 'message' => '#^Method Drupal\\\\Composer\\\\Composer\\:\\:drupalVersionBranch\\(\\) should return string but returns string\\|null\\.$#', 'count' => 1, 'path' => __DIR__ . '/../composer/Composer.php', ]; $ignoreErrors[] = [ 'message' => '#^Parameter \\#2 \\$base_dir of method Drupal\\\\Composer\\\\Generator\\\\ComponentGenerator\\:\\:generate\\(\\) expects string, string\\|false given\\.$#', 'count' => 1, 'path' => __DIR__ . '/../composer/Composer.php', ];
This should help git better differentiate between different changes and reduce the number of merge conflicts that are actually flagged. It should also help developers better understand conflicts because they're dealing with PHP code, rather than the middle of a huge YAML list.
Secondly, however, if developers are manually fixing conflicts in the baseline file then I think we're doing a disservice to them. The baseline is the result of the changes in our code and is a file that's automatically generated by PHPStan so we really shouldn't be fixing conflicts manually. My recommendation here would be to educate developers to fix conflicts by regenerating the baseline file (during a rebase or merge commit):
git checkout <base-branch> -- phpstan-baseline.php vendor/bin/phpstan analyze --configuration=./core/phpstan.neon.dist --memory-limit=-1 --generate-baseline core/phpstan-baseline.php
PHPStan Baseline reviews are also pretty straightforward to know whether such a regeneration was successful. Any removals are improvements in code-type quality and will be caught by the code review itself, so they can be blindly accepted. Any additions need some attention and hopefully spark discussion for code improvements.
When we do end up going there, I'd want us to consider bringing in
phpstan/phpstan-strict-rules
I'm a big fan of the strict rules, but I think there's value in the higher PHPStan levels even without it. So I would propose we evaluate it in a follow-up issue :)
And at least having configs like:
reportUnmatchedIgnoredErrors: true checkMissingIterableValueType: false checkGenericClassInNonGenericObjectType: false
I can happily say that
reportUnmatchedIgnoredErrors
is enabled in PHPStan by default, so there's no need for us to add it to the config unless we want to turn it off (which we shouldn't).For
checkMissingIterableValueType
I already had a short discussion with mglaman on Slack which I think is worth repeating here:We can calculate the number of errors ignored by the baseline by counting the number of lines in the file (
wc -l core/phpstan-baseline.php
) subtracting 5 lines for setting up the array and returning it from the file and then dividing it by the 5 lines per rule that are used in the file.This brings us to 78772 ignored errors. Of those 19445 contain the sentence "in iterable type array"; roughly ~300 further errors contain missing "in iterable type" in a non-array types. I still think there is value in fixing those non-array type iterables (think about things like field item lists) and thinking about how we introduce other iterable types. So my proposal would be to manually ignore errors of the type
*in iterable type array*
then we can still fix other iterable types but at least don't need to go out and fix all the render arrays that are being pushed around immediately.Disabling
checkGenericClassInNonGenericObjectType
would remove another roughly ~500 ignored errors. Though I would argue that use of generic typing in Drupal might increase (as we add@template
annotation to indicate a class is generic over a certain type) and having that information available can make consuming such code a lot easier (because it ensures the return types don't need to bemixed
). So I'd urge keeping the generic check enabled. Disabling it would for example negate almost all benefits of the types for ✨ Implement a Result type in Drupal Core Needs review .On Slack alex.skrypnyk mentioned we probably want to have a core committer review this issue. Given the impact this issue has on the requirements for future code to be merged and reviewed, I agree. I believe the release manager tag is most appropriate (and they can escalate to "Needs committer feedback" if needed (since it's mentioned as use sparingly → ).
- 🇦🇺Australia acbramley
Firstly, I think the decision to use PHP for the baseline rather than NEON will help git reduce the number of conflicts
That would be nice, but I still need convincing 😅 - Maybe we open a separate issue to swap the baseline to use PHP as that seems much more likely to get in quickly?
My recommendation here would be to educate developers to fix conflicts by regenerating the baseline file
I agree, perhaps yet another issue to add a script to make this easier to run?
- Status changed to Needs work
10 months ago 11:08pm 7 March 2024 The Needs Review Queue Bot → tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide → to find step-by-step guides for working with issues.
- 🇬🇧United Kingdom catch
That would be nice, but I still need convincing 😅 - Maybe we open a separate issue to swap the baseline to use PHP as that seems much more likely to get in quickly?
Yes this would be an easy change and then we can see how it goes with the existing level of conflicts we have to deal with.
Secondly, however, if developers are manually fixing conflicts in the baseline file then I think we're doing a disservice to them. The baseline is the result of the changes in our code and is a file that's automatically generated by PHPStan so we really shouldn't be fixing conflicts manually. My recommendation here would be to educate developers to fix conflicts by regenerating the baseline file (during a rebase or merge commit):
My old laptop, which was not that old - it was a 6th gen carbon x1 with 16gb ram, used to max out all CPUs and memory doing a full phpstan run and go into swap. I used to have to patch core to limit the number of CPUs that phpstan could use to 3 or 4 so that it would stay within bounds, then it would be OK, but sometimes I would forget - the config is in git so couldn't just change it permanently. On my current machine that doesn't happen, but it can still take a long time to do a full run - this is just the scan, not generating a baseline. I think @Gabor has run into that probably more recently than me on completely different hardware too.
- Status changed to Postponed
10 months ago 7:43am 8 March 2024 - 🇳🇱Netherlands kingdutch
I've split out the change from Neon to PHP into 📌 Convert the PHPStan baseline from NEON to PHP Fixed
That also includes the update to the GitLab CI to output the baseline in the PHP format there. The functionality to output the baseline as artifact was previously implemented in 📌 Incorporate improvements to how contrib runs PHPStan to core RTBC
This branch has been rebased on top of 11.x and the work from 📌 Convert the PHPStan baseline from NEON to PHP Fixed has been cherry-picked in to see the pipeline status in this one. That means this issue is now postponed on the other one :)
- 🇳🇿New Zealand quietone
Mentioned this to the other release managers and catch suggested adding a tag.
- 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
I wonder if we can run this in CI against files in `git diff --diff-filter=A --name-only` i.e. require level 9 for all new files but avoid the conflict hassles of the baseline?
- 🇦🇺Australia mstrelan
I think we had that previously and it didn't really work, you need to scan the whole codebase.
- 🇳🇱Netherlands kingdutch
I wonder if we can run this in CI against files in `git diff --diff-filter=A --name-only` i.e. require level 9 for all new files but avoid the conflict hassles of the baseline?
The maintainer of PHPStan wrote a blogpost about why we shouldn't do that :D
It also makes intuitive sense because if I change the type of an interface then that's the only file that's changed. However, we know that changes in interfaces might affect all classes that implement them, so we can't rely on only analysing the interface.
- 🇺🇸United States cmlara
I want to throw this out there as a suggestion in case the baseline rebase becomes a blocker to this being accepted.
We could at least consider changing the .gitlab-ci code-quality report asset generation run of PHPStan to use L9 while the enforcement run remains L1.
This would allow us to visually see errors in GitLab and manually enforce fixing them in changed lines without otherwise failing the builds.
Not as ideal as having the MR fail however I would contend even that change would create an improvement in code quality compared to the current use of L1 only.
If we wanted to keep the L9 quality report small or see overall progress we could add a second PHPStan L9 baseline that we manually bump every few weeks.
IIRC a config file change invalidates the PHPStan cache so this would likely be trading a longer test execution time, or adding a second PHPStan Job to avoid wall time increase for higher quality code.
If we did add a second job we could consider use both reports, and possibly downgrade (using sed) the severity of errors in the L9 to info while allowing the L1 run to show Blocker to allow easily determine if it’s a “must fix” vs “should fix”
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
I think there needs to be a conversation about adding a 15 megabyte file to the core directory and at the very least decent instructions about how to not deploy this to your site.
- 🇬🇧United Kingdom longwave UK
Looking at https://phpstan.org/user-guide/rule-levels (and from past experience with work projects) I am not sure we should go past level 5 in core just yet. My reasoning is that level 6 requires us to add typehints and we are still quite far from adding typehints everywhere - everyone wants to do it but there is a lot of backward compatibility to figure out first. In the MR, 21k of the almost 60k errors are "...has no return type specified".
- 🇺🇸United States cmlara
but there is a lot of backward compatibility to figure out first. In the MR, 21k of the almost 60k errors are "...has no return type specified".
In my experience with contrib the majority of these are usually due to our implicit expectation of
@return void
when the @return block is missing in docblocks. See https://www.drupal.org/node/1354#functions → and #1510838: Define and document @return void policy → .While we have a standard now that all new code should be using strict typehints and return typehints it relies on human review to observe.
In my opinion, having a baseline for this would also help us remove these warnings. Once we know what the errors are we can write automated tooling to bulk add
@return void
to any remaining methods (in small batches as the core team will likely not want to cause a mass need to re-roll all pending MR's at once) and by just reviewing for errors (anywhere we now see "should return void but returns something") we know we have an API breach that we need to exclude for further followup.Adding
@return void
would not have BC concerns as that is already the current specified for the method, we would just be documenting it in code. - 🇬🇧United Kingdom catch
Even if @return void is OK to add
: void
in interfaces/base classes etc. is not without breaking existing classes - i.e. we need to add it in a major release after triggering deprecations for extending/implementing classes.I don't think we should add a baseline which requires adding type hints to existing code until we actually have an approach for how to do that in core, which is 🌱 [Meta] Implement strict typing in existing code Active , 📌 Enable existing interfaces to add return type hints with a deprecation message for implementors Active etc. If we don't have the approach, we're adding baseline we can never remove. Help on those issues would be appreciated.
If bulk adding @return void to methods will reduce the baseline significantly (without having to add the actual return type hint), then a spin-off issue to do that first and bring the baseline down seems sensible too - even if there's a window where some could creep in between that commit and this one, seems like an easy way to get the file size down and reduce merge conflicts over time.
- 🇳🇱Netherlands kingdutch
I made a small script (attached as txt file) that allows us to use the PHP baseline to get some insight into what type of errors we're dealing with, which can be a bit more precise than my previous estimates. I still believe that we should embrace the baseline to make sure we don't increase the count of errors we're introducing in new code (and a lower PHPStan level simply isn't going to do that).
However, I am open to seeing if we can get better insight into the contents of the baseline and see if we can resolve some issues using PHPDoc annotations in parallel. I'd also be willing to invest some time in that if that helps acceptance of a baseline size and this issue landing sooner.
Here's the output of the current iteration of the script. I'd love to know if there's a preferred way of tackling these (e.g. I could add all return types to Drupal's functions as I expect them to be relatively straightforward, but that might be a patch with 2000 changed lines.
Analysing 59257 ignored PHPStan errors ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................. 2026 functions without a return type specified. 533 methods on interfaces without a return type specified. 19064 methods on other classes without a return type specified. 169 unreachable pieces of code. 106 if-statements that will always be TRUE. 3 if-statements that will always be FALSE. 59 partial conditions always TRUE/FALSE. 346 undefined property accesses on an object. 642 undefined property access on other things. 36309 ignored errors without classification.
There was a thread in Slack about the issue raised in #24
I think there needs to be a conversation about adding a 15 megabyte file to the core directory and at the very least decent instructions about how to not deploy this to your site.
drumm mentioned "Composer does no archiving, its just automating downloading zip files from from GitHub & ftp.drupal.org, and git clone git.drupalcode.org & elsewhere". That means that the proposed solution by Mglaman of adding the following to
.gitattributes
undercore/
would ensure the file is not distributed through composer or Drupal.org archives./.gitattributes export-ignore /.phpstan-baseline.php
This does still let the baseline be downloaded when people install with composer using the
--prefer-source
flag which I would say is proper behaviour since the baseline is part of the drupal/core source. - 🇳🇱Netherlands kingdutch
Adjusted the script to gain a bit more insights at level 9 :)
The lines that say "always" or "unnecessary" is where PHPStan allows us to remove code :D
Operations on "may be NULL" or "may be FALSE" are code-paths that may manifest in bugs.Analysing 59257 ignored PHPStan errors ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................. 2026 functions without a return type specified. 533 methods on interfaces without a return type specified. 19064 methods on other classes without a return type specified. 169 unreachable pieces of code. 106 if-statements that will always be TRUE. 3 if-statements that will always be FALSE. 59 partial conditions always TRUE/FALSE. 346 undefined property accesses on an object. 642 undefined property accesses on other things. 4085 method calls on values that may be NULL. 413 method calls on values that may be FALSE. 1032 property access on values that may be NULL. 186 property access on values that may be FALSE. 1478 values returned that don't match the return type. 427 invalid types supplied to foreach. 69 binary operations resulting in error. 3644 calls to undefined method (usually value is not of specific enough type). 2260 unsupported operation on mixed type. 21 calls to undefined function/method. 72 calls to assert which isn't needed. 11 calls to is_array which are always FALSE. 33 calls to is_array which are always TRUE. 2 unnecessary calls to is_bool. 2 unnecessary calls to is_callable. 2 unnecessary calls to is_int. 13 unnecessary calls to is_null. 5 unnecessary calls to is_numeric. 4 unnecessary calls to is_object. 1 unnecessary calls to is_resource. 30 unnecessary calls to is_string. 3 unnecessary calls to is_subclass_of. 47 unnecessary calls to method_exists. 5 unnecessary calls to property_exists. 570 unnecessary PHPUnit Assert. 1285 functions with missing parameter type. 4967 method with missing parameter type. 229 other statements are always TRUE. 388 other statements are always FALSE. 654 drupalLogin calls may receive FALSE. 6964 incorrect type of parameter provided for method call. 7407 ignored errors without classification.
- 🇺🇸United States bradjones1 Digital Nomad Life
Crazy idea/question - what size is the baseline file, compressed? Given it is text and (presumably) has a ton of repetition in it, it could be put in the repo in a compressed state (still with some prevention to have it packaged) and that helps solve the repo size issue? That does make it harder to diff "at rest," but uh, yeah. Crazy idea.
- 🇳🇱Netherlands kingdutch
I don't think modifying the baseline further or moving it out of the repository is a good idea. It would make the development experience of running PHPStan locally (and updating the baseline) too cumbersome.
While I do understand that 15 MB is a lot on the scale that Drupal is operating, we should also place the size of the file in perspective. I think our primary goal should be that people who deploy Drupal to production shouldn't suffer its cost. For that we have a good proposal which I've committed to the branch (although it does need testing which I can not do against a dev branch in a fork). I believe that as long as the file is not bundled in the archives on Drupal.org or in the dist version of Composer that's on Packagist then we've achieved that goal.
It's unrealistic to try and save the same 15MB when dealing with the raw source repositories (e.g. a
git clone
or--prefer-source
composer install) because those are likely scenarios in which you want to have the baseline present. For an initial clone ofdrupal/drupal
the new baseline file would represent a 7% increase in repo size; for an inital clone ofdrupal/core
the new baseline file would represent an 8% increase in repo size (counting the existing baseline as 0-size in both cases).However that size would only be downloaded once, since it's a text file that git can efficiently update rather than having to re-download it on every pull request.
Additionally we know that this is a file where our aim is to make it smaller quickly (because we'd be addressing real issues and improving the developer experience of anyone using Drupal) and it's not a file that will stay this big indefinitely.
Fresh git clones of both drupal/drupal and drupal/core to show the size of the repo's inital clone for comparison to the increase caused by the baseline.
$ git clone git@git.drupal.org:project/drupal.git Cloning into 'drupal'... remote: Enumerating objects: 967424, done. remote: Counting objects: 100% (12915/12915), done. remote: Compressing objects: 100% (2246/2246), done. remote: Total 967424 (delta 11040), reused 10823 (delta 10660), pack-reused 954509 Receiving objects: 100% (967424/967424), 230.15 MiB | 4.31 MiB/s, done. Resolving deltas: 100% (699506/699506), done. Updating files: 100% (17353/17353), done. $ git clone git@github.com:drupal/core.git Cloning into 'core'... remote: Enumerating objects: 767457, done. remote: Counting objects: 100% (19190/19190), done. remote: Compressing objects: 100% (8354/8354), done. remote: Total 767457 (delta 10741), reused 18556 (delta 10149), pack-reused 748267 Receiving objects: 100% (767457/767457), 186.46 MiB | 13.95 MiB/s, done. Resolving deltas: 100% (542285/542285), done. Updating files: 100% (17264/17264), done.
- 🇺🇸United States mfb San Francisco
Looks like when compressed in a git packfile, this baseline compresses pretty well, down to 851KB of space
- 🇬🇧United Kingdom longwave UK
From a release manager perspective I'm generally fine with the filesize in the core development repo for the reasons stated but agree that we should remove it from
drupal/core
installs (composer and tarball) because end users simply don't need this, it's only useful for core development. - Status changed to Needs work
9 months ago 3:15pm 12 March 2024 - 🇳🇱Netherlands kingdutch
The other issue landed! Marking this as "Needs work" because it needs a rebase.
In #3426548-21: Convert the PHPStan baseline from NEON to PHP → mstrelan mentioned
Since this file is php do IDEs like PhpStorm try to analyse it? It can obviously be excluded but does that involve manual steps?
This needs some more investigation to impact of the large file. My initial response was:
I can not answer for all IDEs but only for PHPStorm. PHPStorm does seem to index the file. As far as I found there's nothing we can do against this with things we ship in the repo. I also don't know if it's a problem (i.e. in what ways it affects PHPStorm usage).
There are settings in PHPStorm that allow excluding files which are global settings that apply to all projects. So in case the analysis of the file by PHPStorm causes problems for someone it's a one-time fix.
longwave's feedback from #33 was previously addressed in the commit between #30 and #31 assuming the packaging works how we thought it did on Slack.
- Status changed to Needs review
9 months ago 5:10pm 12 March 2024 - 🇳🇱Netherlands kingdutch
I did some local experimentation with PHPStorm by opening it on the 11.x branch. This allows PHPStorm to settle on 2,23GB of memory used according to the activity monitor. I then switched to the
3426047-bump-phpstan-to-level-9
branch for this issue. The indexing time that PHPStorm needed was near instant (basically a progress bar flash) and memory changed to 2.63GB. I'm not sure how scientific this is and whether all 0.4GB of memory pressure can be attributed to the baseline or whether PHPStorm does other things when switching branches that require memory.I added ".phpstan-baseline.php" to "Exclude files" under "Settings > Directories" and retried the experiment. I opened PHPStorm on the 11.x branch which settled at 2.25 GB. This time no re-indexing seemed to be triggered and memory usage changed to 2.36GB.
- 🇺🇸United States bradjones1 Digital Nomad Life
Thanks for addressing my hairbrained ideas on the file size.
My perspective on this is tainted a bit by working within codebases that have this problem at scale - e.g., Expo. The repository is legitimately bloated to the point it's difficult to download in a reasonable time on anything other than fiber. We are not there and this is a rare situation.
Additionally we know that this is a file where our aim is to make it smaller quickly (because we'd be addressing real issues and improving the developer experience of anyone using Drupal) and it's not a file that will stay this big indefinitely.
This is true for the _current_ file, but cloning with no depth limit will still pull in all the prior revisions. Not really gonna change anything, but unless we rewrite history or people routinely clone without depth limit (could be an idea for CI if it isn't already) then this is "here to stay" once it happens.
- 🇦🇺Australia mstrelan
Thank you for the PhpStorm test. I did similar and agree that indexing after switching to the MR branch finished very quickly and no noticeable change to memory usage. I foolishly decided to open the 200 000 line file expecting everything to freeze but it opened just fine, although CPU did increase a lot, and remained elevated until I closed the file. The file is not really meant to be opened anyway.
I noticed the file has been excluded from phpstan.neon.dist and phpcs.xml.dist, but should it also be excluded from .cspell.json?.
- Status changed to Needs work
9 months ago 10:28pm 12 March 2024 The Needs Review Queue Bot → tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide → to find step-by-step guides for working with issues.
- 🇺🇸United States mglaman WI, USA
Actually, the needs review bot and open MR should tell us how often there are conflicts or other issues, right? For that concern.
Drupal core really needs gitattributes to exclude the export of the file (and more.) Then it won't be in the downloaded artifact. Maybe if someone is patching core and Composer downloads via Git they will.
- Status changed to Needs review
9 months ago 12:46pm 17 March 2024 - 🇳🇱Netherlands kingdutch
Rebased the MR on top of the 47 new commits on the 11.x branch. Also added the baseline file to CSpell as requested in #37 and #40.
I ran the same analysis script I attached in #29 to give an indication of how many new errors (according to the higher rule level) were added in those 47 commits. I've attached the analysis output and manually added (+x) increase, (-x) decrease, or (~) equal indicators for each rule.
Analysing 59398 (+141) ignored PHPStan errors ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ .................................. 2028 (+2) functions without a return type specified. 533 (~) methods on interfaces without a return type specified. 19091 (+27) methods on other classes without a return type specified. 171 (+2) unreachable pieces of code. 105 (-1) if-statements that will always be TRUE. 3 (~) if-statements that will always be FALSE. 60 (+1) partial conditions always TRUE/FALSE. 346 (~) undefined property accesses on an object. 642 (~) undefined property accesses on other things. 4110 (+25) method calls on values that may be NULL. 413 (~) method calls on values that may be FALSE. 1034 (+2) property access on values that may be NULL. 186 (~) property access on values that may be FALSE. 1487 (-1) values returned that don't match the return type. 427 (~) invalid types supplied to foreach. 69 (~) binary operations resulting in error. 3650 (+6) calls to undefined method (usually value is not of specific enough type). 2263 (+3) unsupported operation on mixed type. 19 (-2) calls to undefined function/method. 72 (~) calls to assert which isn't needed. 11 (~) calls to is_array which are always FALSE. 33 (~) calls to is_array which are always TRUE. 2 (~) unnecessary calls to is_bool. 2 (~) unnecessary calls to is_callable. 2 (~) unnecessary calls to is_int. 13 (~) unnecessary calls to is_null. 5 (~) unnecessary calls to is_numeric. 4 (~) unnecessary calls to is_object. 1 (~) unnecessary calls to is_resource. 30 (~) unnecessary calls to is_string. 3 (~) unnecessary calls to is_subclass_of. 48 (+1) unnecessary calls to method_exists. 5 (~) unnecessary calls to property_exists. 573 (+3) unnecessary PHPUnit Assert. 1287 (+2) functions with missing parameter type. 4987 (+20) method with missing parameter type. 229 (~) other statements are always TRUE. 391 (+3) other statements are always FALSE. 656 (+2) drupalLogin calls may receive FALSE. 6990 (+26) incorrect type of parameter provided for method call. 7417 (+10) ignored errors without classification.
Actually, the needs review bot and open MR should tell us how often there are conflicts or other issues, right? For that concern.
I'm not entirely sure. A merge conflict occurs when two parallel changes touch the same lines of code and git is not able to determine how to reconcile those. This MR makes + 295436/− 1079 changes to the baseline (since we include all currently known possible errors), so it essentially touches every line in the existing baseline and any other change will have a conflict with it once.
The normal number of changes in the baseline are much lower than 300k lines. With that in mind, the chance of two MRs touching the same lines of code in the large baseline go down significantly.
This is true for the _current_ file, but cloning with no depth limit will still pull in all the prior revisions. Not really gonna change anything, but unless we rewrite history or people routinely clone without depth limit (could be an idea for CI if it isn't already) then this is "here to stay" once it happens.
Unless I've missed something the only open concern (besides the potential for merge conflicts) is that because of the way git works (it captures history) the increase to the repository size with the new baseline size would be permanent for the lifetime of the project. This would affect clones which do not limit depth (by default clones fetch all history depth); we have taken steps to ensure this doesn't affect production deployments which don't use git.
I don't have a good solution for this and with access to 4G/5G and fiber in all my workplaces I think I might be slightly too privileged to have a proper say in this matter. Git has improved support for very large repos (measured in Gigabytes, so much larger than where Drupal is at) but these do require steps taken on the consumer side and are not something we can do from the repository side of things.
- Status changed to Needs work
9 months ago 12:01pm 20 March 2024 The Needs Review Queue Bot → tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide → to find step-by-step guides for working with issues.
- 🇬🇧United Kingdom alexpott 🇪🇺🌍
I've been running level 9 checks against the new recipe initiative code for quite some time and the checks are proving quite frustrating even for experienced contributors. The job we have does generate a baseline for you and everything but so many of the warning cannot be resolved because core is so non-compliant. See the recent commits to https://www.drupal.org/project/distributions_recipes/issues/3422821 → for an example of how frustrated a contributor gets. I think we need to be really careful here.
- 🇬🇧United Kingdom catch
The job we have does generate a baseline for you and everything but so many of the warning cannot be resolved because core is so non-compliant.
For me having to add entries baseline (which we'd have to add to core too unless we blocked issues like that on core refactoring) would undermine the entire point of this, it would break the pattern of not adding new entries to the baseline. At the moment we only ever do that for dependency updates (and then often immediately open follow-ups to remove them again), not for regular core MRs.
- 🇬🇧United Kingdom longwave UK
That is basically what I was trying to say back in #4. I think maybe level 5 or so is a reasonable compromise but level 9 will require adding to the baseline for any non trivial issues and that will deter contributors.
- 🇺🇸United States phenaproxima Massachusetts
Oh, dear god, please don't do this.
As @alexpott said in #43, level 9 is a nightmare. Core's current state makes it basically impossible to write even very strict code.
Here's an idea: what if we progressively got to level 9, over time? Let's clear out everything in the current baseline, then bump it to level 2 or 3, clear out everything in that baseline, and repeat until we get to level 9. That will keep the tasks manageable and relatively straightforward.
I am certain that going directly to level 9 will effectively halt core contribution until a lot of disruptive issues are fixed. Inexperienced contributors will never, ever want to go near Drupal core again.
- 🇬🇧United Kingdom longwave UK
Let's clear out everything in the current baseline, then bump it to level 2 or 3
This is the current plan anyway, see e.g. issues tagged with PHPStan-1 or 📌 [meta] Fix as many PHPStan level 2 issues as possible before bumping the rule level Active
- 🇳🇱Netherlands spokje
Let's clear out everything in the current baseline, then bump it to level 2 or 3
This is the current plan anyway,
To be fair, that plan is not going fast, or to be more precise, seems to have been grinded to a halt.
There's so much BC in play with fixing even the low hanging issues, that even determining what arethe low hanging fruits is difficult and leads to much discussion.
I myself have (almost) given up on working on PHPStan base-level issues, since they really suck the life-force out of my contributing spirit.
Having said that, although I can see some advantages in upping to level 9 and at the very least be warned when new PHPStan issues creep in, the sheer amount that would have to be committed to the baseline because of the BC-promises made, would be nullifying those advantages IMHO.
So I personally don't see advantages upping to any level above the current level 1, until we have either more persons willing to contribute on fixing the current issues with the current PHPStan level and/or have a long hard look at BC policies we currently have and if they're perhaps holding us back to much.
But of course, n=1, personal opinion and all that.
- 🇬🇧United Kingdom catch
I can see the arguments for going to a higher baseline so that new code is compliant, rather than adding to the amount to be fixed later, but this should mean that once we've generated the baseline once, it should either stay the same or go down, not get bigger.
@Spokje with bc policies do you mean type hinting existing code (which is where progress is almost zero) or more generally?
- 🇦🇺Australia alex.skrypnyk Melbourne
@Kingdutch
I found it very difficult to understand that you are proposing to enforce lvl 9 for NEW code only.
Could you please update the title of the issue and maybe put something in bold in the description to clearly state what this issue is about.
Thank you - 🇺🇸United States bradjones1 Digital Nomad Life
Re: #50, that's the point of the baseline. Any new code cannot add to the baseline.
- 🇳🇱Netherlands kingdutch
Thanks for the input everyone! And thanks alexpott and phenoxproxima for sharing your experiences with level 9 in the recipe fork.
Responding to the recipe example
If I look at the commits it's a bit difficult for me to see where exactly the friction is coming from (don't get me wrong, I do see the friction). There's two points I think I could summarize it as (and please correct me if I'm wrong): firstly a learning curve, secondly Drupal's use of arrays.
The learning curve is very understandable and I do think it's something we'll have to guide the Drupal community through. We're essentially going from a very loosely "do what you want" language to a very strictly typed language. Adding
@var
annotations or preferablyassert
statements (since they fail your test if your assumption is wrong) is a part of the journey we'll have to take. Regardless of our strategy with PHPStan levels, that's something we'll have to go through at some point. The good news is that PHPStan will tell us if asserts are wrong (and thanks to Drupal using Bleeding Edge will also tell us if@var
statements are likely wrong). That provides us with a way to fix issues at the call site ("I know this is type X") while allowing us to fix the origin of the bug (the actual type annotation) and then have PHPStan tell us which call-sites no longer need the asserts/annotations.The second thing is Drupal's use of arrays. This is the only thing that ended up in the baseline. It's probably the hardest thing to fix in Drupal regardless of levels because it'll require re-thinking how we pass around our render data and define a more well-defined structure for it. This is exactly why this issue proposes a singular exception to the strict rules of PHPStan (from the issue summary):
Exceptions added
In order to not frustrate Drupal's use of render arrays and instead allow people to slowly start adding types to those we ignore the error ... has no value type specified in iterable type array. We do not disable the type requirement for all iterable types because specifying them does have a lot of value outside of render arrays in improving the type-safety of loops in code.We do not disable generics altogether because there's a lot of places (like Drupal's Entity API which is now partially patched by PHPStan Drupal) where generics can really add value in telling downstream code what type of value exists within a container. Introducing generics without immediately validating them with PHPStan is likely to cause us even greater pain in the future, hence why they're only disable for arrays and we keep them for other classes (where assertions/@var annotations can more easily be written if needed).
Why running PHPStan on a lower level is a fallacy
It's a bit challenging that PHPStan called its configuration steps "levels" because they make people think of them like "levels in a video game" where if you beat one level, the next one becomes easier. However, that's not really the case for PHPStan. I personally see them more as "groupings of complexity" (although depending on your code-base some higher levels might be easier than lower levels).
The challenge with Drupal's current approach is that while we're slowly fixing some issues at the lower levels, we're steadily introducing new things to fix at higher levels which are just sitting there waiting. The value across the levels also isn't equal. For example we all agree that type-hints are incredibly useful but these aren't checked until level 6 (that's the same level that introduces the generics that many people stumble over). It's not until level 8 that PHPStan will help us out in preventing "calling some method on NULL" which are errors that, if they slip through tests, will case a white screen of death for a site. The only difference between level 8 and 9 is PHPStan being strict about "mixed" which is a good thing because without that rule people might see "mixed" as a solution over finding the more specific union they're probably actually interested in. PHP's
mixed
is akin to TypeScript'sany
, which the TypeScript community has long since learned is dangerous. To remedy that they've introducedunknown
which changes any from "could by anything" to "you must figure out what this is", that's basically all PHPStan's level 9 does over level 8.Some statistics
To back-up my claim that even though we're slowly chipping away at lower levels, we're introducing issues at higher PHPStan levels at a higher rate that we'll want to fix in some future, I've analysed ~2100 commits from 10.1.x until 11.x (around the end of March) for Level 1 and Level 9 of PHPStan and plotted their difference. This is based on the analysis script that I shared in this issue earlier but with all errors grouped into a specific category/explanation.
The scripts used to do this as well as the raw result data can be found in https://github.com/Kingdutch/analyse-drupal-phpstan
As we can see we're slowly chipping away at the 600 errors that are in our current baseline. Meanwhile we're introducing about 2 errors per commit on average, which dwarfs the work that we're doing at lower levels as we climb from about 55000 errors to nearly 60000 errrors.
This is to me what spurs my motivation to try and change our approach to how we tackle code quality since I believe the benefits that preventing the errors shown by PHPStan can bring us truly outweight the effort needed to prevent them.
In an attempt to make discussing the different levels of PHPStan based on data rather than my own or others' feelings I've analysed commit ec28f25d1e on the 11.x branch at level 1 through 9 and ran the results through the analysis script. The resulted categorisation is listed in a table below to show which errors and how many of them there are for the different levels of PHPStan.
Based on the table above I would love to hear the value people assign to the different categorised errors and whether it's worth preventing them. For example I value preventing null-based bugs because I've seen them as errors a lot and similarly believe that "mixed" is not a good type to use in 90% of circumstances. However if people disagree with me that might give us a more data and decision driven approach to finding which level we should land on going forward. Additionally if that's not 9 it may give us some ideas of what criteria we want to adhere to to move to higher levels in the future and how to ensure the higher levels don't grow as we fix lower level issues.
- 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
Thanks for that amazing analysis @Kingdutch
If we did accept a larger baseline and bump the level, do you think it would make sense to then take a horizontal approach instead of a vertical approach to reducing the baseline size? e.g. Working by rows in your table, instead of by columns (which is what we're doing now).
It does feel like a lot of those rows would be easy wins but also with a high return. There are definitely some rows where reward vs effort vs disruption would definitely be difficult.
- 🇳🇿New Zealand quietone
@Kingdutch, thank you for the data, it is very helpful.
Or, another idea - add a level 9 job to core's CI pipeline, but allow it to fail. That will at least allow us to know if we're introducing new errors, whilst not blocking on preexisting problems.
This seems reasonable and something to explore.
- 🇳🇱Netherlands kingdutch
If we did accept a larger baseline and bump the level, do you think it would make sense to then take a horizontal approach instead of a vertical approach to reducing the baseline size? e.g. Working by rows in your table, instead of by columns (which is what we're doing now).
It does feel like a lot of those rows would be easy wins but also with a high return. There are definitely some rows where reward vs effort vs disruption would definitely be difficult.
Yeah I think that would absolutely be the way to go. I do think there's a challenge that they may not be simple find/replace, so I'm not sure how large we can make batches. For example we may introduce a return type in the PHPDoc of an interface and then fix our classes to adhere to them (where we can introduce it in PHP). That in turn may cause PHPStan to say "Hey these type assertions you're making here are no longer needed" and we can remove those. So there's connection between the different rules.
The benefit is that the higher level of PHPStan can give us the confidence that if the types are correct at a local level (e.g. we make sure to accept
string|\Stringable
) they'll also help us out on the project level. That should be a big help to reviewers for these kinds of issues even if they're not copy/paste.Or, another idea - add a level 9 job to core's CI pipeline, but allow it to fail. That will at least allow us to know if we're introducing new errors, whilst not blocking on preexisting problems.
I saw the suggestion and like it in theory but I'm struggling with a viable way of making it work. One of the differences with PHPStan vs tools like CSpell/PHPCS is that we always want to run it on the entire codebase because changes in one file can result in behaviour changes in any other file. So with that in mind I think this route would still require us to commit and update the baseline in case of errors, but there'd be a discussion of which errors are then acceptable.
If we don't commit (or update) the baseline then it's going to be really difficult to say what part of the code caused the new errors and what errors were introduced in previous commits. I think that probably leads to more frustration.
The only way I can come up with to do this and not commit the baseline would be to generate the baseline from the base commit of the PR and then run it against HEAD of the PR, so only the change in errors would be reported over which reviewers could make a decision of whether it's acceptable or not. So basically run PHPStan twice in the CI. The downside of this is that it does make people reliant on the CI for their PHPStan work and does not allow to easily perform these checks locally (unless they also run it twice locally). A composer install between runs might be needed because updated dependencies can affect type information.
- 🇫🇷France andypost
As baseline been converted to PHP, now it can be preloaded to negate size-effect of it on performance, ref 📌 Convert the PHPStan baseline from NEON to PHP Fixed