Entity + Field + Property validation constraints are processed in the incorrect order

Created on 19 October 2016, almost 8 years ago
Updated 12 August 2024, about 2 months ago

Problem/Motivation

This issue was originally Comment-specific. But further investigation has revealed that the problem is not specific to comments. It's a general Entity/Field API problem. Entity constraints are validated before field constraints. See #12 and later.

Still, the Comment use case is a useful guide through the discovered problems:

While working on #2737719: EntityResource: Provide comprehensive test coverage: for every entity type, every format, every method , it took me hours to figure out what the minimal set of fields is that one must send for a Comment entity to be successfully created. See #2737719-75: EntityResource: Provide comprehensive test coverage: for every entity type, every format, every method .

In doing so, I had to do extensive debugging using Xdebug to figure out the incredibly obtuse error responses, which are categorically HTTP 500 responses instead of 422 responses. And not only that, but the message in each of those responses is absolutely worthless. It doesn't give any indication about what is wrong.

AFAICT key problems are:

  1. \Drupal\comment\CommentStatistics::update() depends on entity_type, which is neither required nor validated
  2. CommentNameConstraintValidator depends on entity_id and field_name, but runs before either of those can even possibly be validated (entity_id is required, but that's not validated before CommentNameConstraint is validated)
  3. Both of these result in failures outside of what the Entity API validation system is designed for, hence resulting in undecipherable errors in REST responses

Another use case was reported by @joachim in #59+#61, this time the problem is not "entity constraint validated before field constraint", but "field constraint validated before property constraint":

Over at Duration Field module, we have a field type with two properties:

- duration, which is a PHP duration string such as 'P1M1D'. This uses a custom data type.
- seconds, which is an integer count of seconds.

The duration data type has a constraint to check its value is a correctly-formed PHP duration string. So '1D' is not valid (must start with 'P'), 'P1X' is not valid ('X' is not a valid interval letter), 'cake' is not valid for obvious reasons. Setting it as deeply as on the data type makes sense, because the data type should enforce this wherever it's used.

I wanted to add a constraint to the field item as a whole, to enforce that the durations and the seconds values both match. In other words, prevent a field being saved with something like 'P1D' for the duration and '60' for the seconds (one day is not 60 seconds long!)

However, the constraint on the field item runs BEFORE the constraint on the duration property. So the field item constraint might run with badly-formed values for the duration property! It would therefore need to perform its OWN check on the format of the duration value, which is just repeating the work that the duration data type constraint is going to do again later on.

More generally, this problem will occur anytime you have a compound field type where:

- one or more of the field's properties properties need validation for their value alone
- the whole field item needs validation as a whole

Proposed resolution

Unknown, see comments.

In both use cases above, the observed pattern is: the more granular thing should be validated before the more coarse thing, because otherwise the data in the more coarse thing cannot be assumed to be valid, thus invalidating its validation 🤓

Remaining tasks

  1. Make Comment's entity_type base field have a validation constraint, to ensure a valid entity type is set. Test coverage must include a non-existing entity type to ensure good DX.
  2. Also ensure that \Drupal\Tests\rest\Functional\EntityResource\Comment\CommentResourceTestBase::testPostDxWithoutCriticalBaseFields() is no longer sending an Accept request header.
  3. … more, but yet TBD!

User interface changes

TBD

API changes

TBD

Data model changes

TBD

🐛 Bug report
Status

Needs work

Version

11.0 🔥

Component
Typed data 

Last updated 8 days ago

  • Maintained by
  • 🇦🇹Austria @fago
Created by

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

Live updates comments and jobs are added and updated live.
  • Triaged core major

    There is consensus among core maintainers that this is a major issue. Only core committers should add this tag.

  • Security improvements

    It makes Drupal less vulnerable to abuse or misuse. Note, this is the preferred tag, though the Security tag has a large body of issues tagged to it. Do NOT publicly disclose security vulnerabilities; contact the security team instead. Anyone (whether security team or not) can apply this tag to security improvements that do not directly present a vulnerability e.g. hardening an API to add filtering to reduce a common mistake in contributed modules.

  • Needs issue summary update

    Issue summaries save everyone time if they are kept up-to-date. See Update issue summary task instructions.

  • Usability

    Makes Drupal easier to use. Preferred over UX, D7UX, etc.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇬🇧United Kingdom jonathanshaw Stroud, UK

    It's not clear what the problem is that we are trying to solve here.

    Lower-level constraints are still being validated recursively even if high-level constraints report violations, so changing the sequence of validation will not lead to changes in what violations are reported for high-level validation. Nothing valid becomes invalid, or vice versa, if we change the order of execution.

    I can see 2 problems that are still relevant:

    The DX problem: Entity/field Validators have to be coded defensively, they cannot assume their fields/properties are themselves valid and in some sense this is not DRY. (see #61)

    The UX problem: Higher-level entity/field constraint violations are reported to users in addition to and prior to the lower-level field/property violation that is the root cause of the higher-level constraint failure; this could be seen as confusing - it is more helpful to focus the user's attention on the lower-level violation.

    Here's a proposal for addressing this without an elaborate system of explicit dependencies between constraints:

    1. Use try/catch around the call to each constraint validator, and add a violation "Could not verify '{$constraint_message'" if the validator blows up and throws an exception or error. This would mean that validators did not need to be coded defensively; they could generally assume lower-level data was valid.

    2. Only add this new kind of violation, if the child node validators do not add any violations. This would lead to the user typically only being exposed to a single violation message that correctly identified the root cause. The only drawback to this is that a poorly coded validator, which breaks not because lower-level data is invalid but because of some other unrelated bug, will not have its failure surfaced to the user until after any other lower-level violations are fixed. I think that's acceptable - bugs cause suboptimal UX by nature, we can't prevent that.

    3. Store/retrieve/display the violations in the opposite sequence to now. If we think it's often better UX for the lower-level violations to be reported first and acted on first, then we can just handle that either at the level of the violation internal storage, retrieval, or even display. We don't need to change the sequence of execution to address this.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    The problem we're trying to solve here is two-fold:

    1. all of Drupal core, for both content and config, lumps all validation constraints together in one bucket
    2. all of Drupal core, for both content and config, pays no attention to the validation sequence: it always uses the same top-down (breadth-first, see \Drupal\Core\TypedData\Validation\RecursiveContextualValidator::validateNode()) recursive validation strategy, which results in the entity-level validation constraints not being able to assume valid field values and field-level validation constraints not being able to assume valid property values

    I suspect

    are part of the solution.

    \Drupal\Tests\jsonapi\Functional\ResourceTestBase has thorough test coverage, but it does NOT assert that the validation error messages in case of a nonsensical value (for example, the string hello world being specified for a boolean "status" field or an integer "weight" field).
    That test (and the REST equivalents) are specifically testing access control and basic shape of the representation, not detailed validation.

    I have recently started encountering this problem again, but now in the world of Config, because I've been working on config validation. There, it's likely that you'll see multiple errors for the same config property path, for example:

    [
      0 => "The '' plugin does not exist.",
      1 => "This value should not be null.",
    ]
    

    It goes without saying that the latter (generated by the NotNull constraint) should result in none of the other validators running, because there literally is no point in them running. The consequence: noisy, confusing validation errors.

  • 🇭🇺Hungary mxr576 Hungary

    Wow, wonderful thread! :) I learned a lot from reading this and I am also still unsure how many problems we have and what are those... :D

    I also encountered the problem in the past that in validators you have to use defensive coding or you have to "fail silently" in Validator A because Validator B will fail too and that is enough.

    I have checked the EmailValidator from Symfony 7.1 and they also use defensive coding there.

    It goes without saying that the latter (generated by the NotNull constraint) should result in none of the other validators running, because there literally is no point in them running.

    Maybe we should just forget stacking independent validator units and instead concrete and complex requirements should be covered by a specific validator... which goes against compos ability and (again) goes against what we can see in Symfony... BUT as it was highlighted and proved with historical context in #77, we already have a Drupalism in typed data validation due to the implementation does not support validation groups (which are essentially a solution for considering a set of independent validators as one validator unit).

  • 🇭🇺Hungary mxr576 Hungary

    Maybe we should just forget stacking independent validator units and instead concrete and complex requirements should be covered by a specific validator... which goes against composability

    Well there is https://symfony.com/doc/7.1/reference/constraints/Compound.html for that in Symfony, but that would still expose all validation error independently and maybe what I would expected a "composite" would have its own dedicated validation error that would be displayed when any of the validators failed.

    Again... preferences.

Production build 0.71.5 2024