Generate JSON schema for content entity types

Created on 7 February 2019, over 5 years ago
Updated 10 May 2024, 13 days ago

Drupal boasts a feature-rich API whereby content entities may contain attached data or references to other entities in fields. These are either "base" fields that are always present for a given entity type, or bundle fields which are used only on a subset of "bundles" for that entity. The storage configuration for a field is always the same for all instances of a particular entity type.

Drupal 8's release cycle saw the introduction of decoupled/API-first concepts on top of the core entity system, e.g. the JSON:API and REST modules. In contrib, GraphQL is also popular, although it provides additional abstractions on top of the field API.

When working with Drupal entities over an API, it is very helpful to have a schema for the data structure of a particular entity. This allows clients to know, for instance, what acceptable values may be sent or received for the value and format properties of a formatted text field.
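
To make this concrete, here is a minimal sketch in Python (not Drupal's PHP API) of what a schema for a formatted text field might look like and how a client could check a payload against it. The enum values `basic_html` and `restricted_html` and the helper `check_object` are illustrative assumptions; a real client would use a full JSON Schema validator.

```python
# Hypothetical JSON Schema for a formatted text field item. The property names
# "value" and "format" come from the field; the enum values are assumed.
FORMATTED_TEXT_SCHEMA = {
    "type": "object",
    "properties": {
        "value": {"type": "string"},
        "format": {"type": "string", "enum": ["basic_html", "restricted_html"]},
    },
    "required": ["value"],
}

# Only the JSON types used by the schema above are mapped here.
_JSON_TYPES = {"string": str, "object": dict}


def check_object(schema: dict, data: dict) -> list:
    """Return a list of violations; covers only type, enum and required."""
    errors = []
    for name in schema.get("required", []):
        if name not in data:
            errors.append(f"missing required property: {name}")
    for name, rules in schema.get("properties", {}).items():
        if name not in data:
            continue
        value = data[name]
        if not isinstance(value, _JSON_TYPES[rules["type"]]):
            errors.append(f"{name}: expected {rules['type']}")
        elif "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: {value!r} is not an allowed value")
    return errors
```

With such a schema a client can reject an unknown text format before sending a request, rather than discovering the problem from a server error.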

This issue's MR enhances the core typed data, field and serialization APIs to provide JSON Schema representations of a field's properties. These field-level schemas may then be used to generate comprehensive schemas for fielded entities, e.g. as OpenAPI specs. For now we leave this level of schema generation to contrib, although it could make sense in the future to incorporate something like openapi_jsonapi into core. The openapi_jsonapi 4.x development branch currently depends on this MR and is a good window through which to review this issue's functionality.
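
As a rough illustration of how field-level schemas could be combined into an entity-level schema, consider the following Python sketch. It is not the MR's code: the function `entity_schema`, the field names and the `node--article` title are assumptions chosen for the example.

```python
def entity_schema(title: str, field_schemas: dict, required: list) -> dict:
    """Combine per-field schemas into one object schema for an entity."""
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": title,
        "type": "object",
        "properties": dict(field_schemas),
        "required": sorted(required),
    }


# Hypothetical field-level schemas, as a field normalizer might return them.
FIELD_SCHEMAS = {
    "title": {"type": "string"},
    "status": {"type": "boolean"},
    "body": {
        "type": "object",
        "properties": {"value": {"type": "string"}, "format": {"type": "string"}},
    },
}

ARTICLE_SCHEMA = entity_schema("node--article", FIELD_SCHEMAS, ["title"])
```

A contrib module producing OpenAPI specs would do more than this (references, relationships, per-bundle variation), but the composition step has the same shape.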

This issue only covers content entities. Config entities have some support over the API, e.g. in the JSON:API module, however schema discovery for them is very different given that config entities are not fieldable in the same way, and their schemas would be derived from the config schema and validation APIs rather than typed data.

Some technical notes for review

While this issue/concept was originally blocked by an inability to cache the outcome of a "supports" query on a normalizer, that was fixed in #3252872: Use CacheableSupportsMethodInterface for performance improvement in normalizers thanks in large part to changes upstream in Symfony.

As it turns out, the original solution of a new interface and method to get a schema is not possible because the resolved normalizer may not be accessed directly from the serializer. A proposed change to the visibility of Serializer::getNormalizer() was rejected upstream. The consensus alternative, which is probably more elegant anyway, is to use a new normalization "format" of json_schema to retrieve the schema, if supported.
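
The idea of treating schema as just another format can be sketched as follows. This is a simplified Python model, not Symfony's or Drupal's serializer: all class names are hypothetical, and the only point is that the caller never sees which normalizer was selected, yet can still obtain a schema by asking for the `json_schema` format.

```python
class Normalizer:
    """Base class: subclasses declare the data type and formats they handle."""

    supported_type: type = object
    formats: tuple = ()

    def supports(self, data, fmt: str) -> bool:
        return isinstance(data, self.supported_type) and fmt in self.formats

    def normalize(self, data, fmt: str):
        raise NotImplementedError


class StringJsonNormalizer(Normalizer):
    supported_type = str
    formats = ("json",)

    def normalize(self, data, fmt):
        return data


class StringSchemaNormalizer(Normalizer):
    supported_type = str
    formats = ("json_schema",)

    def normalize(self, data, fmt):
        # The data itself is ignored: only its type matters for a schema.
        return {"type": "string"}


class Serializer:
    def __init__(self, normalizers):
        self._normalizers = list(normalizers)

    def normalize(self, data, fmt: str):
        # The chosen normalizer stays hidden from the caller.
        for normalizer in self._normalizers:
            if normalizer.supports(data, fmt):
                return normalizer.normalize(data, fmt)
        raise LookupError(f"no normalizer for {type(data).__name__} as {fmt}")


serializer = Serializer([StringJsonNormalizer(), StringSchemaNormalizer()])
```

Calling `serializer.normalize("hello", "json_schema")` returns the schema, while `serializer.normalize("hello", "json")` returns the data, without any new public method on the serializer.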

Original Solution (originally authored by @gabesullice):

Add a SchematicNormalizerInterface with a ::getNormalizationSchema() method.

Initial thoughts on a method signature:

  • $type is a supported interface or class.
  • $format is the encoding format.
  • $refinements is a parameter bag of anything that is required to return a correct schema. For example, ResourceObjectNormalizer::getNormalizationSchema() would need $refinements->get('resource_type'). Best practice would be for refinements to be documented on the method and then asserted in the method. It's imperfect, but the best I can think of.
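
The signature described in the bullets above might look roughly like this. The sketch is in Python rather than PHP, the returned schema is invented for illustration, and it raises an exception instead of using an assertion so that the check also runs when assertions are disabled.

```python
from abc import ABC, abstractmethod


class SchematicNormalizer(ABC):
    """Hypothetical Python counterpart of SchematicNormalizerInterface."""

    @abstractmethod
    def get_normalization_schema(self, type_name: str, fmt: str, refinements: dict) -> dict:
        """Return a schema for `type_name` in `fmt`, narrowed by `refinements`."""


class ResourceObjectNormalizer(SchematicNormalizer):
    def get_normalization_schema(self, type_name, fmt, refinements):
        # Documented refinement: "resource_type" is required.
        if "resource_type" not in refinements:
            raise ValueError("the 'resource_type' refinement is required")
        return {
            "type": "object",
            "properties": {
                "type": {"const": refinements["resource_type"]},
                "id": {"type": "string"},
            },
            "required": ["type", "id"],
        }
```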

I think under this system, every normalizer would be required to return a complete schema. Meaning that the JsonApiDocumentTopLevelNormalizer would be responsible for returning a schema that included schema for any child resource object(s). Alternatively, we could allow normalizers to return placeholder objects and resolve them separately. That might end up as an over-engineered solution though.

Finally, I think that we would put this method on the Serializer service so that normalizers will not need to specifically know which child normalizer services will be applied.

Feature request
Status

Needs work

Version

11.0 🔥

Component
Serialization 

Last updated 13 days ago

Created by

🇫🇷France gabesullice 🇫🇷 UTC+2

Live updates comments and jobs are added and updated live.
  • Needs framework manager review

    It is used to alert the framework manager core committer(s) that an issue significantly impacts (or has the potential to impact) multiple subsystems or represents a significant change or addition in architecture or public APIs, and their signoff is needed (see the governance policy draft for more information). If an issue significantly impacts only one subsystem, use Needs subsystem maintainer review instead, and make sure the issue component is set to the correct subsystem.

  • Needs issue summary update

    Issue summaries save everyone time if they are kept up-to-date. See Update issue summary task instructions.



Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇺🇸United States bradjones1 Digital Nomad Life
  • 🇫🇷France andypost

    The blocker is closed as outdated so IS needs update

  • 🇳🇱Netherlands bbrala Netherlands

    Think this is the issue, we can use the symfony property for that I think.

    #3252872: Use CacheableSupportsMethodInterface for performance improvement in normalizers

  • 🇺🇸United States bradjones1 Digital Nomad Life

    Updating IS.

  • last update 7 months ago
    30,396 pass, 1 fail
  • 🇧🇪Belgium Wim Leers Ghent 🇧🇪🇪🇺

    👀👀👀👀

  • last update 7 months ago
    Custom Commands Failed
  • last update 7 months ago
    Custom Commands Failed
  • last update 7 months ago
    30,434 pass
  • Status changed to Needs review 7 months ago
  • 🇺🇸United States bradjones1 Digital Nomad Life

    Changing the component to serialization because this is mostly not specific to JSON:API.

    I am marking NR because I'm deep enough into this now that I would really like some maintainer/committer review to ensure this isn't totally off-base. I'm personally pretty proud of how this is implemented with the interface and the mostly-automatic integration of the base normalizer with the new Attribute.

  • last update 7 months ago
    30,434 pass
  • last update 7 months ago
    Custom Commands Failed
  • last update 7 months ago
    30,438 pass
  • last update 7 months ago
    Custom Commands Failed
  • last update 7 months ago
    Build Successful
  • last update 7 months ago
    Composer error. Unable to continue.
  • last update 7 months ago
    Build Successful
  • 🇺🇸United States bradjones1 Digital Nomad Life

    So, yeah, this now requires a change to symfony/serializer which is a bummer, and I'm not quite sure what the work-around would be. I've chosen to just include the patch for now, in hopes that as we work on this that either 1) Symfony will make a minor change to two methods from private to protected, or 2) someone smarter than I comes up with a different alternative.

    The issue is that we are adding a new method to normalizers to express information about their normalization, but the selection of said normalizer is hidden behind the serializer's ::normalize() method, where it finds the correct normalizer from all those registered. We need to be able to essentially statically-analyze the selected normalizer. We need to peer into ::getNormalizer(), which is currently private.

    There is a dirty hack to invoke private methods with reflection, but I doubt that would pass a core quality gate and makes me feel very dirty.

    Still leaving NR because I would love feedback on this and the overall approach. Not letting this slow down development, but it is a sticking point.

  • last update 7 months ago
    Build Successful
  • last update 7 months ago
    Build Successful
  • 🇺🇸United States bradjones1 Digital Nomad Life

    This should now be to the point of producing a spec-compliant schema for a resource object. There's plenty more work to be done, but I'm pretty happy with how relatively easy it was to implement the earlier generic work on schematic normalizers to JSON:API in basically an afternoon.

    Re: our need for the Symfony Serializer to change two methods from private to protected, I've received some initial feedback from two Symfony maintainers and both asked good questions for follow-up.

    The concrete bad news is that Symfony 6.4 and 7.0 are in feature-freeze, so the earliest this change would be accepted is 7.1.

    Parallels were drawn between what we're doing and api-platform, which is an official-ish reference implementation of Symfony for doing fancy API things. They are generating JSON Schema using object reflection, and I laid out how this isn't much of a parallel to the Drupal entity and field APIs. We'll see if anyone's convinced by my reasoning enough to change the method visibility.

    So optimistically, this would be blocked on Symfony 7. If this is a non-starter, there are three other options:

    1. Most radical: Ditch the Symfony serializer/normalizer because we barely use it now and abuse it when we do. This would be a major lift/BC break, but it would also alleviate a lot of headaches encountered in the D8+ lifecycle.
    2. Fork the Symfony Serializer and implement our own which implements SerializerInterface. This increases maintenance overhead and forks us further from Symfony core, which we're trying to avoid. But there is precedent, e.g. we have our own YamlLoader.
    3. (Ab)use NormalizerInterface::normalize()'s context parameter and allow normalizers to return a value object containing a schema. There is some precedent for returning a value object, as JSON:API module needs cacheability metadata and so it passes around CacheableNormalizations. This is a bit of an escape hatch if we get cornered by other options. The downside is that it makes ::normalize() even more polymorphic and magical and also implies the value you pass is "real data," instead of potentially being "just" a supported class or interface name, which is all we might have during schema generation. Then again, that parameter is already mixed and so we could just define by convention that if the special context flag is passed, the value is to be regarded the same as the one passed to ::supportsNormalization(). We could also use this approach to start and then deprecate it in favor of a more explicit method if we can ever make it publicly callable.
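
Option 3 could be sketched like this, in Python rather than PHP. The context key `return_schema`, the `SchemaValue` class and the `IntegerNormalizer` are all hypothetical names; the sketch only shows the convention of a context flag switching the return value from data to a schema-carrying value object.

```python
from dataclasses import dataclass

SCHEMA_FLAG = "return_schema"  # hypothetical context key


@dataclass(frozen=True)
class SchemaValue:
    """Value object marking a result as schema rather than normalized data."""

    schema: dict


class IntegerNormalizer:
    def normalize(self, value, fmt=None, context=None):
        context = context or {}
        if context.get(SCHEMA_FLAG):
            # By convention, `value` is then a class or type name, not data.
            return SchemaValue({"type": "integer"})
        return int(value)
```

Wrapping the schema in a dedicated type lets calling code tell the two kinds of result apart with an `isinstance` check, which partly offsets the extra polymorphism this option introduces.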
  • last update 7 months ago
    Build Successful
  • last update 7 months ago
    Build Successful
  • last update 7 months ago
    Build Successful
  • 🇳🇱Netherlands bbrala Netherlands

    First off, I'm so happy to see you working on this issue. :)

    I've been trying to wrap my head around this and am having a hard time hehe.

    One of the things we might be able to do is perhaps use the serializer to keep track? We already override the constructor for that and pass that up to the parent. Perhaps there is a way to add the required data to the normalizers there using our own interface. This also feels very tacky in a way, but could be a place where we can track that information and allow it to be pulled in the normalizer itself perhaps.

    This is purely based on looking at code; I have not tried it to see if that would work and how that would look.

    The suggestion to use another format for the json_schema in the GitHub issue and go from there is also interesting, wouldn't that work? It might not be the fastest of things to serialize again and go through all the paces to get the schema and would add another whole normalization pipeline, but it could be cached pretty well I guess. This does seem to combine pretty well with the 'describedby' member that was added in JSON:API 1.1 though, which is a link. It could then be a link like '/jsonapi/node/article?format=json_schema'.

    Basically, I have no real answer here right now. I'm not convinced we will see Symfony change the visibility right now. Hopefully my thoughts are helpful in a way.

  • Status changed to Needs work 7 months ago
  • 🇺🇸United States bradjones1 Digital Nomad Life

    Thanks @bbrala for the feedback. Sorry I couldn't make it to Lille this year to work on this in person! We got a lot done last time we pair programmed in Portland.

    "One of the things we might be able to do is perhaps use the serializer to keep track? We already override the constructor for that and pass that up to the parent. Perhaps there is a way to add the required data to the normalizers there using our own interface. This also feels very tacky in a way, but could be a place where we can track that information and allow it to be pulled in the normalizer itself perhaps.

    This is purely based on looking at code; I have not tried it to see if that would work and how that would look."

    I think I follow what you're saying, and in theory yes we could do a lot of shenanigans within the @internal portion of JSON:API module. However there are two main gotchas to this approach: 1) despite these methods not being "technically" extend-able, modules like Extras violate that with imposters and so any normalizers from modules like that, while violating the internal rule, would need to also change. And 2) this would only address JSON:API's normalizers, but not those in the core typed data API, which provides all the schema for properties.

    "The suggestion to use another format for the json_schema in the GitHub issue and go from there is also interesting, wouldn't that work?"

    I ran into your feedback while coming here to say that yes, I think this suggestion for treating JSON Schema as another "format" is going to be the path forward.

    "...it might not be the fastest of things to serialize again and go through all the paces to get the schema and would add another whole normalization pipeline, but it could be cached pretty well I guess."

    Unless we're talking about different things here, and I don't think we are, I don't see a performance hit. The approach with ::getNormalizer() would depend on the same normalizer resolving process that was optimized recently with the help of upstream changes, and this is just another path for resolving a normalizer but for a different $format. Also as it stands I don't believe that performance should be a huge issue here, as the resulting schemas could be cached and won't vary much.

    "This does seem to combine pretty well with the 'describedby' member that was added in JSON:API 1.1 though, which is a link. It could then be a link like '/jsonapi/node/article?format=json_schema'."

    That's an angle I hadn't explored yet, however I think it's really powerful and would open the door to a very low-code way for us to bring the functionality that currently lives in jsonapi_schema module into core. I almost wonder if that shouldn't be the near-term goal vs. the OpenAPI integration, though OAI is basically just formatting the same data in a slightly different way, so these are not in competition.
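
For readers unfamiliar with it, `describedby` is a link member in the JSON:API 1.1 specification that points at a description of the document, such as a JSON Schema. A minimal Python sketch of attaching such a link follows; the helper `with_describedby` and the placeholder ID are assumptions, while the schema URL is the one suggested in the quoted comment.

```python
def with_describedby(document: dict, schema_href: str) -> dict:
    """Return a copy of a JSON:API document with a `describedby` link added."""
    links = dict(document.get("links", {}))
    links["describedby"] = {"href": schema_href}
    return {**document, "links": links}


# Hypothetical document; the ID is a placeholder, not a real UUID.
DOCUMENT = {
    "data": {
        "type": "node--article",
        "id": "hypothetical-uuid",
        "attributes": {"title": "Hello"},
    },
    "links": {"self": {"href": "/jsonapi/node/article/hypothetical-uuid"}},
}

DESCRIBED = with_describedby(DOCUMENT, "/jsonapi/node/article?format=json_schema")
```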

    "Basically, I have no real answer here right now. I'm not convinced we will see Symfony change the visibility right now. Hopefully my thoughts are helpful in a way."

    Yeah, after the third maintainer basically said "no," I agree, and honestly this is how the process should work. I iterated on the initial idea (an interface for the normalizer), it turns out that won't work and is blocked on upstream cooperation anyway, but we land on a solution that might even be "better."

    I am toying with different ways of implementing this, however, that would still perhaps leverage an interface and/or trait to do most of the heavy lifting for the normalizers. I have some ideas that I'll toy around with in the MR as I refactor out of this solution and into the alternative for $format.

    Marking NW as I only really had it in NR to solicit this kind of help.

  • last update 7 months ago
    Custom Commands Failed
  • 🇺🇸United States bradjones1 Digital Nomad Life
  • last update 7 months ago
    Custom Commands Failed
  • last update 7 months ago
    30,384 pass, 8 fail
  • last update 7 months ago
    30,333 pass, 17 fail
  • last update 7 months ago
    Custom Commands Failed
  • last update 7 months ago
    30,370 pass, 12 fail
  • last update 7 months ago
    30,451 pass, 1 fail
  • Pipeline finished with Canceled
    7 months ago
    #35492
  • Pipeline finished with Success
    7 months ago
    Total: 10514s
    #35500
  • last update 7 months ago
    30,452 pass
  • Status changed to Needs review 7 months ago
  • 🇺🇸United States bradjones1 Digital Nomad Life

    Putting this back to NR as I would still love feedback from maintainers and committers.

    The MR now contains a refactor of the initial approach, which I think overall is an improvement. TL;DR, you now request schema from the normalization system by specifying the $format as json_schema.

    The only real nit I can see with this approach is that normalizers are resolved based on the data to be normalized and the format, and so there is theoretically a different universe of normalizers selected for schema vs. a specific format. That's pretty much unavoidable with this approach. In theory we could get 100% the same normalizers by specifying the format same as the eventual normalization and hinting in $context that we actually want the schema. (This is in spirit what we wanted by having access to ::getNormalizer().)

    I don't love this but don't hate it, in theory. The practical reason we can't take this approach is that normalizers which don't care about the $format (which is actually the majority of them) would need to be aware of that context, and might return a "normal" normalization instead of schema. We already have to address a similar problem in this MR, with a change to NormalizerBase::checkFormat() that special-cases the json_schema format as an exception to the "if I don't specify any formats, I serve them all" default.
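
The special case described above can be illustrated with a small Python sketch. It is not the actual NormalizerBase::checkFormat() code; the function name and signature are assumptions, and it models only the rule that a format-agnostic normalizer must explicitly opt in to schema requests.

```python
SCHEMA_FORMAT = "json_schema"


def check_format(declared_formats, requested_format):
    """Decide whether a normalizer serves `requested_format`."""
    if declared_formats:
        return requested_format in declared_formats
    # No declared formats means "serve them all", except schema requests.
    return requested_format != SCHEMA_FORMAT
```

Under this rule `check_format((), "json")` is true but `check_format((), "json_schema")` is false, so a normalizer that never considered schema cannot be selected for a schema request by accident.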

    Another question would be if the JSON:API implementation should wrap schema in CacheableNormalizations or not.

  • 🇺🇸United States bradjones1 Digital Nomad Life

    One thought about type-safety on the returned value (since we are using ::normalize() for schema as well as normalizations) could be to require it to specify the JSON Schema meta-schema it implements in $schema, which would be a quick check for the calling code to say "this is for sure a schema." I'm not sure if that's necessary, but one idea to throw out there if people are worried about non-core normalizers somehow getting tricked into returning a normalization when we really want a schema.
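
Such a check might look like the following Python sketch. The function `is_schema` is hypothetical, and the set of accepted meta-schema URIs (JSON Schema draft 2020-12 and draft-07) is an assumption about which drafts would be allowed.

```python
KNOWN_META_SCHEMAS = frozenset({
    "https://json-schema.org/draft/2020-12/schema",
    "http://json-schema.org/draft-07/schema#",
})


def is_schema(result) -> bool:
    """Return True only if `result` declares a known JSON Schema meta-schema."""
    return isinstance(result, dict) and result.get("$schema") in KNOWN_META_SCHEMAS
```

A plain normalization such as `{"value": "<p>Hi</p>"}` fails this check, so calling code can detect a normalizer that returned data when schema was requested.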

  • last update 7 months ago
    Custom Commands Failed
  • Pipeline finished with Failed
    7 months ago
    Total: 216s
    #37186
  • last update 7 months ago
    Custom Commands Failed
  • Pipeline finished with Failed
    7 months ago
    #37195
  • last update 7 months ago
    30,419 pass, 4 fail
  • Pipeline finished with Failed
    7 months ago
    Total: 606s
    #37196
  • last update 7 months ago
    30,462 pass
  • Pipeline finished with Success
    7 months ago
    Total: 609s
    #37202
  • last update 7 months ago
    30,464 pass
  • last update 7 months ago
    30,464 pass
  • Pipeline finished with Canceled
    7 months ago
    Total: 100s
    #37880
  • Pipeline finished with Success
    7 months ago
    Total: 788s
    #37886
  • Pipeline finished with Failed
    7 months ago
    Total: 598s
    #46439
  • 🇺🇸United States bradjones1 Digital Nomad Life

    Making title less technical and more accurate as to the current goals.

  • Pipeline finished with Success
    5 months ago
    #71865
  • Pipeline finished with Success
    5 months ago
    Total: 779s
    #71932
  • Pipeline finished with Success
    5 months ago
    #72861
  • Pipeline finished with Failed
    5 months ago
    Total: 633s
    #72912
  • Pipeline finished with Canceled
    5 months ago
    Total: 373s
    #73246
  • Pipeline finished with Success
    5 months ago
    Total: 636s
    #73249
  • Pipeline finished with Failed
    5 months ago
    Total: 657s
    #73254
  • Pipeline finished with Success
    5 months ago
    Total: 1454s
    #73735
  • Status changed to Needs work 4 months ago
  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • Pipeline finished with Success
    4 months ago
    Total: 604s
    #76806
  • Status changed to Needs review 4 months ago
  • 🇺🇸United States bradjones1 Digital Nomad Life

    Good bot.

  • 🇺🇸United States bradjones1 Digital Nomad Life
  • 🇺🇸United States bradjones1 Digital Nomad Life

    Config schema generation is very different and a path forward for them is not very clear yet. The recent work on validation for config entities will help unlock this in the future, however.

  • Status changed to Needs work 4 months ago
  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".


  • 🇭🇺Hungary Gábor Hojtsy Hungary

    Is there a high-level summary for this issue? I can't tell from the huge changeset what exactly is being proposed here. What's the before/after? "Content entities were serialized in BLOBs before (in which cases?) but now they are serialized in JSON?" What's the benefit? Which APIs are affected, etc.? Especially for an issue of this magnitude I think it would be important to outline these. It should help with reviews as well :)

  • 🇧🇪Belgium Wim Leers Ghent 🇧🇪🇪🇺

    +1 to what @Gábor Hojtsy said. Change records would be really helpful to understand this functionality too. 😇

  • 🇺🇸United States bradjones1 Digital Nomad Life
  • Pipeline finished with Failed
    3 months ago
    Total: 208s
    #107215
  • Status changed to Needs review 3 months ago
  • 🇺🇸United States bradjones1 Digital Nomad Life

    Draft CR added. IS updated. MR rebased. Back to NR.

  • Pipeline finished with Success
    3 months ago
    Total: 551s
    #107234
  • 🇺🇸United States bradjones1 Digital Nomad Life
  • Status changed to Needs work 3 months ago
  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".


  • Status changed to Needs review 3 months ago
  • 🇺🇸United States bradjones1 Digital Nomad Life

    Conflict was from the conversion of typed data plugins from annotations to attributes 💯

  • Pipeline finished with Failed
    3 months ago
    Total: 490s
    #114107
  • 🇺🇸United States bradjones1 Digital Nomad Life
  • Status changed to Needs work about 2 months ago
  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".


  • 🇺🇸United States bradjones1 Digital Nomad Life

    Rebased. Back to NR.

  • Status changed to Needs review 19 days ago
  • 🇺🇸United States bradjones1 Digital Nomad Life
  • Pipeline finished with Failed
    19 days ago
    Total: 164s
    #163994
  • Pipeline finished with Failed
    19 days ago
    Total: 685s
    #164017
  • Pipeline finished with Failed
    19 days ago
    Total: 757s
    #164370
  • Pipeline finished with Success
    19 days ago
    Total: 502s
    #164389
  • 🇳🇱Netherlands bbrala Netherlands

    I went through the MR and have some comments. All in all it is a good implementation of what we talked about (quite) a while back.

    Change record is available.
    The issue summary seems up to date, although its format is a little more freeform; I would prefer to move to the default setup, so I am keeping that tag.

  • Status changed to Needs work 13 days ago
  • 🇳🇱Netherlands bbrala Netherlands