[Meta] Implement strict typing in existing code

Created on 25 April 2019, over 5 years ago
Updated 15 July 2024, 6 months ago

PHP 7 introduces (return and scalar parameter) type declarations. In #2928856: Adopt the PSR-12 standard for PHP7 return types β†’ there is discussion going on about adopting the PSR-12 standard for using return type declarations. However, as noted by drunken monkey β†’ implementing type hints in existing code in any form would be a breaking change.

The problem comes down to the fact that introducing type-hints in code that is used will make PHP complain that the contract isn't properly implemented by extending objects (see example 1). Implementing the type hints ahead of time does not break but produces warning because the extending case may be stricter (so not drop-in compatible) than the extended object (see example 2).

Issue #2928856: Adopt the PSR-12 standard for PHP7 return types β†’ talks about implementing type declarations for new code. However, this still leaves us with a large codebase where the power of strict typing is not yet used to catch bugs ahead of time.

It isn't possible to add parameter type hints, either in extending or extended code unless it's done simultaneously, so we have to workaround this limitation one way or another:

Examples

Example 1

Adding type-hints in code that is extended without adding code to those extensions is problematic.


class Base {
    public function foo(int $x) : array {
        return [];
    }
}

class Child extends Base {
    public function foo($x) {
        return ["array"];
    }
}

$c = new Child();

echo $c->foo("bar");

Produces:
Fatal error: Declaration of Child::foo($x) must be compatible with Base::foo(int $x): array in [...][...] on line 13

Example 2

The opposite is also sort-of true, introducing type hints when the extended object has not already done so produces warning.


class Base {
    public function foo($x) {
        return [];
    }
}

class Child extends Base {
    public function foo(string $x) : string {
        return "great " . $x;
    }
}

$c = new Child();

echo $c->foo("bar");

Warning: Declaration of Child::foo(string $x): string should be compatible with Base::foo($x) in [...][...] on line 13

Example 3

Remove the parameter from the parent method altogether, use func_get_args() (possibly with runtime type checking).

The child class can then update the method signature to be Drupal 10 compatible, without breaking compatibility with Drupal 9 (where the method has been changed already).

<?php

class Base {
    public function foo() {
       $x = func_get_args()[0];
        return "great" . $x;
    }
}

class Child extends Base {
    public function foo(string $x = '') : string  {
        return "great" . $x;
    }
}

$c = new Child();

echo $c->foo("bar");

?>

Return type hints

class Base {
  function foo () {
      return 'foo';
  }
 }
 
 class Child extends Base {
    function foo() : string {
      return 'foo';
    }
}

$child = new Child();
echo $child->foo();

Child classes can add return type hints without them being present on the parent, in this case we need to find a way to indicate (code comments, change records, rector) what changes need to be added in advance before adding the return type hint to core methods.

Proposed Solution

One saving grace that we have is that our coding standards already tell us to use PHPDoc type declarations all over Drupal. This means that we could use (an improved version of) a tool such as https://github.com/dunglas/phpdoc-to-typehint. Building this tool means that it can be used to convert Drupal core (possibly in reviewable chunks as sub-tasks of this issue) as well as contrib modules. It can then also be used for any downstream dependencies that we utilise.

🌱 Plan
Status

Active

Version

11.0 πŸ”₯

Component
OtherΒ  β†’

Last updated about 1 hour ago

Created by

πŸ‡³πŸ‡±Netherlands kingdutch

Live updates comments and jobs are added and updated live.
  • Needs issue summary update

    Issue summaries save everyone time if they are kept up-to-date. See Update issue summary task instructions.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡¦πŸ‡ΊAustralia mstrelan

    Opened πŸ“Œ Fix strict type errors detected by phpstan Active which found 271 errors that need fixing.

  • πŸ‡¨πŸ‡¦Canada Liam Morland Ontario, CA πŸ‡¨πŸ‡¦

    What about adding type declarations in places where it is not a breaking change? For example, in Unicode::validateUtf8() and Xss::filter(), if you pass a non-string as the text parameter, that value will be used in a call to strlen(). This will result in the error message "Argument #1 ($string) must be of type string".

    If we added type declarations to the two functions I mention, then there would be an error message when those functions are called instead of when strlen() is called. That would make it easier to track down the code that is ultimately causing the problem.

  • πŸ‡ΊπŸ‡ΈUnited States mfb San Francisco

    @Liam Morland currently drupal and PHP allow \Stringable objects, strings, integers, floats and Booleans to be passed to these functions. NULL is also allowed, but triggers a PHP deprecated notice. So to avoid a breaking change, you'd have to allow that full set of accepted types.

  • πŸ‡ΊπŸ‡ΈUnited States phenaproxima Massachusetts

    Get a load of this: https://git.drupalcode.org/project/drupal/-/merge_requests/7196/diffs

    And check out its pipeline, which actually compiles and executes! (Although it doesn't pass tests, but it looks like the failures are due to pre-existing type-related bugs and weaknesses.)

    Thanks to contravariance, it seems we can safely add type hints to untyped method parameters in core interfaces and classes. Any subclasses' untyped parameters will be implicitly treated as mixed, which lines up with the contravariance rules. It's not possible for subclasses to have added parameter type hints, since that would violate contravariance.

    How would we want to go about it, if we wanted to do it? Maybe one big MR to add parameter type hints to all core service interfaces?

  • πŸ‡¬πŸ‡§United Kingdom catch

    Note that there's also https://www.drupal.org/about/core/policies/core-change-policies/drupal-d... β†’ for changing type hints, adding arguments, re-ordering methods. If we can just add type hints to interfaces and methods that don't have them, we should add a section for this too.

    How would we want to go about it, if we wanted to do it? Maybe one big MR to add parameter type hints to all core service interfaces?

    If we can script things so that we can apply type hints based on the phpdoc, then one big issue sounds great.

    However given that tests failed in the canary MR, it's likely we'll find incorrect docs, or wrong types being passed and etc., and have to split those out.

    There's also the question of whether we'd want to only do this in 11.x, I guess probably yes since it'll be an API change if callers are relying on loose typing.

  • πŸ‡ΊπŸ‡ΈUnited States phenaproxima Massachusetts

    would we need to tell contrib not to add the new type hints until they drop support for 10.x?

    That seems reasonably fair to me. "Drupal 11 uses strict parameter typing. If your subclasses aren't using strict typing, no worries -- your code won't break. Strict typing is optional, and will continue to be optional. BUT, if you want to use the strict typing, you must drop Drupal 10 support."

  • πŸ‡¦πŸ‡ΊAustralia mstrelan

    If we can script things so that we can apply type hints based on the phpdoc, then one big issue sounds great.

    FWIW rector has a set of existing rules for adding type declarations, for example AddParamTypeFromPropertyTypeRector for getters and setters. Unfortunately there doesn't seem to be a rule for AddParamTypeFromPhpDoc or similar, but there are a bunch of helpers in there for inspecting php docs. I think if we were to script this we should start by contributing the rule to rector.

  • πŸ‡¦πŸ‡ΊAustralia mstrelan

    I did a simple test modifying the existing \Rector\Php80\Rector\FunctionLike\MixedTypeRector and replacing checks for mixed and MixedType with string and StringType. This was successfully able to detect untyped params that had string types in their docs and add the string type declaration. We could certain rinse and repeat for primitive types like int and bool, and I'm sure with a bit more time we could make this generic across all types.

  • πŸ‡¦πŸ‡ΊAustralia mstrelan

    Opened πŸ“Œ Add string type to untyped string params Active to demonstrate #50.

  • πŸ‡³πŸ‡±Netherlands bbrala Netherlands

    Yeah the phpdoc rector was removed a while back. We might be able to get the code and revive it, but rector has had quite te changes to internals since.

  • πŸ‡ΊπŸ‡ΈUnited States mfb San Francisco

    Just read thru this issue and thought I should correct my comment in #45 - I was thrown off by the "Implement strict typing" title which made me think of strict_types, but this issue seems to be more about just adding type declarations, not strict_types. So in that context, adding a string type declaration with \Stringable objects being passed in should be fine.

  • πŸ‡¬πŸ‡§United Kingdom longwave UK

    Yep there is nothing stopping us - except actual bugs where our types are wrong - from adding argument types across core now, and contrib can follow at a later time.

    I think to avoid contrib doing anything twice though we should try to consider return types too, although they have to be done in the other order - bottom of the class hierarchy first. Core can add argument types now, we tell contrib to add both argument and return types for Drupal 11, then core can add return types everywhere in Drupal 12.

  • πŸ‡¨πŸ‡¦Canada Liam Morland Ontario, CA πŸ‡¨πŸ‡¦

    one big issue sounds great

    The concern I have with one big issue is that there will be a bunch of work done to get the patch ready and then further commits will make it out-of-date. The review will say it needs a re-roll. This process will just repeat forever. If we commit patches that add declarations to an individual class, it will get done.

    We could encourage people to add type declarations anytime they are working on a class that doesn't have them.

  • πŸ‡ΊπŸ‡ΈUnited States phenaproxima Massachusetts

    Re #55:

    there will be a bunch of work done to get the patch ready and then further commits will make it out-of-date

    Not necessarily. πŸ“Œ Add string type to untyped string params Active only changes the interfaces, which change much less frequently than concrete classes. Besides, it's scripted - that means it's easily rerolled if needed. And thirdly, it's not exactly an issue that needs to take forever because it's adding or changing APIs/features. Once tests pass, and everyone agrees on which branches to commit it to, etc., and a change record is written, it's not going to be hard to land.

  • πŸ‡³πŸ‡±Netherlands kingdutch

    While I'm happy that a lot of debate has sparked about how we can already add types to existing interfaces, the prevailing thoughts still seems to be "We should fix types first and then we can raise PHPStan".

    I personally wouldn't feel comfortable even adding the types to existing interfaces, especially not in a large MR until we've verified that the PHPDoc types for the interfaces are correct, that is something that PHPStan can help us with.

    For example in the canary MR the following change was made (unchanged PHPDoc added for needed context):

       /**
        * Creates new handler instance.
        *
        * Usually \Drupal\Core\Entity\EntityTypeManagerInterface::getHandler() is
        * preferred since that method has additional checking that the class exists
        * and has static caches.
        *
        * @param mixed $class
        *   The handler class to instantiate.
        * @param \Drupal\Core\Entity\EntityTypeInterface $definition
        *   The entity type definition.
        *
        * @return object
        *   A handler instance.
        */
    -  public function createHandlerInstance($class, EntityTypeInterface $definition = NULL);
    +  public function createHandlerInstance(string $class, EntityTypeInterface $definition = NULL);
    

    PHPStan (at a sufficient level) would complain about the mismatch between string in the code and mixed in the PHPDoc.

    The implementation of EntityTypeManager looks as follows:

        if (is_subclass_of($class, 'Drupal\Core\Entity\EntityHandlerInterface')) {
          $handler = $class::createInstance($this->container, $definition);
        }
        else {
          $handler = new $class($definition);
        }
        if (method_exists($handler, 'setModuleHandler')) {
          $handler->setModuleHandler($this->moduleHandler);
        }
        if (method_exists($handler, 'setStringTranslation')) {
          $handler->setStringTranslation($this->stringTranslation);
        }
    
        return $handler;
    

    is_subclass_of can take a class instance but also a class string. The preferable input is likely to be a class-string (since a class instance wouldn't actually be used). So a decision has to be made here on an interface level what the intended inputs are. Most likely this is a class-string for an EntityHandlerInterface. That provides EntityTypeManager with a deprecation path forward for its else statement that other classes could follow.

    Using the expressiveness of PHPStan's extended types in the PHPDoc and PHPs more limited type system in the PHPCode, unless we want to make a conscious change of how the entity type manager works, the interface probably should be changed to:

      /**
       * Creates new handler instance.
       *
       * Usually \Drupal\Core\Entity\EntityTypeManagerInterface::getHandler() is
       * preferred since that method has additional checking that the class exists
       * and has static caches.
       *
       * @param class-string<\Drupal\Core\Entity\EntityHandlerInterface> $class
       *   The handler class to instantiate.
       * @param \Drupal\Core\Entity\EntityTypeInterface $definition
       *   The entity type definition.
       *
       * @return object
       *   A handler instance.
       */
      public function createHandlerInstance(string $class, EntityTypeInterface $definition = NULL);
    

    (Even while writing this I had originally added \Drupal\Core\Entity\EntityHandlerInterface instances as valid type because the current code would work when it's provided a handler instance; however the code likely doesn't make sense in that way. That points me to the requirement for human judgement with an automated system likely encoding unwanted types that require more breaking changes later).

    I'm all in favor of starting to add better type annotations to existing interface arguments, regardless of our decision on PHPStan level :D However, I do think the above demonstrates unfortunately we probably need to do so in a non-automated fashion in reviewable chunks.

    With that said, since this is an issue close to my heart, I'm happy to volunteer as a first reviewer for anyone that sets up a PR like the above (easiest to ping me in the Drupal Slack).

  • πŸ‡¬πŸ‡§United Kingdom longwave UK

    I do agree with #57 that this needs significant manual review of each case; if we add argument types but later they turn out to be wrong (especially if they are too narrow) once they have propagated down that will be harder to fix than ensuring we get them right first time.

  • πŸ‡¬πŸ‡§United Kingdom alexpott πŸ‡ͺπŸ‡ΊπŸŒ

    With TranslatableMarkup - we nearly always want the object... we only want to do the translation if the string is printed to something a user is going to see.

  • πŸ‡¬πŸ‡§United Kingdom alexpott πŸ‡ͺπŸ‡ΊπŸŒ

    In fact this is true of anything using MarkupInterface - we nearly always want Twig to be the thing that converts it from an object to a string.

  • πŸ‡¬πŸ‡§United Kingdom longwave UK

    Yeah, and this is related to my point: if we start adding string types (because that is what the docblock says) when we should be using TranslatableMarkup, and a downstream class follows suit, then we can't widen the type on the interface later without causing errors.

  • πŸ‡¬πŸ‡§United Kingdom catch

    We should probably be type-hinting string|Stringable in those cases too rather than MarkupInterface.

  • πŸ‡¬πŸ‡§United Kingdom catch

    then we can't widen the type on the interface later without causing errors.

    It's possible to do this by removing the parameter from the interface altogether and adding it back later, but it's quite a painful process and we shouldn't set ourselves up to do it where we can avoid it.

  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10
  • πŸ‡ΊπŸ‡ΈUnited States xjm

    In πŸ› Using partially synchronized image fields causes validation errors and php warnings Needs work , contributors were adding return typehints to a new subclass that were not present on the parent. Since the subclass is new, it doesn't have any subclasses of its own to break; however, given how much Drupal allows altering everything, it is still possible that this could cause a disruption. I did an extensive evaluation of the use of the methods in contrib and determined that contrib, at least, would be unaffected by the changes in this particular issue. In general, only rare/edgecase usecases would be broken.

    I discussed this with @catch and @larowlan and we agreed that we will adopt a practice of requesting return typehints in new subclasses, and reverting the change in any rare cases that contrib breaks as a result.

  • πŸ‡ΊπŸ‡ΈUnited States xjm

    Fixing issue credit.

  • πŸ‡«πŸ‡·France andypost

    PHP 8.4 becoming stricter so after related it will be more cleaner

Production build 0.71.5 2024