[Meta, Plan] Pitch-Burgh: JSON field storage & JSON field schema support

Created on 26 June 2023, over 1 year ago
Updated 10 January 2024, 10 months ago

Problem/Motivation

This is the meta-issue for fulfillment of deliverables for the winning "Pitch-burgh" proposal shown here. It will be updated as subtasks are identified/completed.

Pitch-burgh is an initiative of the Drupal Association where the developer community voted to fund various projects designed to spark innovation in the Drupal open-source project. This proposal was among 7 finalists and was in the top 5 as voted by attendees to DrupalCon North America 2023. We are very appreciative of the support of the Drupal Association, the donors to the Pitch-burgh innovation fund, and the voters who supported this project in the competition.

Subtasks

  • Implement Generate JSON schema for content entity types Needs review , including any requisite policy decisions for core to use JSON-schema for normalized representation of field data structures. This is the core API that will enable contrib representations of these schemas. - Status: MR ready for review with test coverage.
  • Gain community/existing ecosystem maintainer consensus on issues summarized in #3258113-5: [META] Future of Schemata/Compatibility with other ecosystem modules . The goal is either new major versions of exisitng contrib schema modules or (more likely?) a new, greenfield module to represent core entity & field data exposed by JSON:API via OpenAPI 3.x. - Status: Schemata is no longer necessary due to the new core API. There are new major versions of openapi and openapi_jsonapi supporting OpenAPI 3.x. When the core support lands, we can tag a new major release.
  • Complete IS and engage community & relevant core maintainers on Add "json" as core data type to Schema and Database API Needs work and requisite child issues for Drupal to ship JSON field data type definitions for all supported database engines. Status: MR with test coverage in review.
  • Complete IS and MVP implementation of [META] Add support for JSON field queries in entity queries Active , querying JSON fields supported by core with entity query. Status: All this got rolled into the mega issue above because it was all so tightly coupled.
🌱 Plan
Status

Active

Version

11.0 🔥

Component
Entity 

Last updated 1 day ago

Created by

🇺🇸United States bradjones1 Digital Nomad Life

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @bradjones1
  • 🇳🇱Netherlands daffie

    I have talked about this pitchburg pitch @lauriii. I am the Drupal subsystem maintainer for the Database API. I have also done a pitchburg pitch on someting similar and I did not win. Let me state that I would very much like to help to get this functionality in Drupal core. All the problems that I have are not because I did not win! Again I would very much to get this functionality to land in core.

    What I understand from @lauriii is that the Drupal association is expecting that it all is done in about 6 months. For me this is totally unrealistic. Adding new functonality to low level layers in Drupal core takes a lot of time. The database layer part needs to go in before anything else and it is blocking all other work.

    Getting the database functionality to land has some problems. The first is JSON storage needs functionality from each by core supported databases. The problem for SQLite is that that functionality is added in SQLite 3.39. Only the minimum required version for SQLite in Drupal 10 is 3.26. Therefor we need to wait for Drupal 11. Drupal 11 will be released in the summer of 2024. Therefor we can start working on this in a year from now. Personally, I have never started working on JSON storage in Drupal because of this problem.

    As the maintainer of the database API, I want what lands to be long term maintainable. It needs to be good. Changing something on a low level layer is very diffecult, therefor it needs to be of a high standard. When something has landed in a release, we need to support it. We can only remove something by deprecation it and remove it in the next major version. I would much rather take some extra time and do it right.

    JSON storage needs to be queryable and indexable. When not correctly indexed the result will be a full table scan query. Those are very slow. As it is all related to each other, all database stuff should be done in a single issue. Creating JSON database fields, querying those JSON fields in the DBAL and having them be indexable. Oh, and it also needs to be documented.

  • 🇫🇮Finland lauriii Finland

    Discussed with @daffie and @longwave at length to try to provide some guidance on how to proceed here. The main issue is that as explained by @daffie, in practice 📌 Support JSON fields in Sqlite schema Closed: duplicate requires SQLite 3.39. Unfortunately we cannot bump the minimum SQLite minimum version in a minor release, so that will have to wait until Drupal 11

    Unfortunately, this means that we cannot commit a lot of the changes required for this to core until we open Drupal 11 for development, and commit the initial alpha blockers. This means that the first commits for this can't happen until at least November-December 2023.

    We could start the work in an issue that has a fork of Drupal that would be maintained by maintainers + subsystem maintainers, and review changes holistically on that issue. This would be preferred by @daffie as well because he would like to be able to provide some of those reviews.

    How does this work in terms of the timeline for DA and @bradjones1?

  • 🇫🇷France andypost

    I disagree with sqlite as blocker, we already have requirements check for json support!

    Existing and new sites which supposed to use sqlite has the extension for json installed already

    https://www.sqlite.org/json1.html

    The JSON functions and operators are built into SQLite by default, as of SQLite version 3.38.0 (2022-02-22). They can be omitted by adding the -DSQLITE_OMIT_JSON compile-time option. Prior to version 3.38.0, the JSON functions were an extension that would only be included in builds if the -DSQLITE_ENABLE_JSON1 compile-time option was included. In other words, the JSON functions went from being opt-in with SQLite version 3.37.2 and earlier to opt-out with SQLite version 3.38.0 and later.

    So the only thing changed in 3.38 https://www.sqlite.org/releaselog/3_38_4.html is that json1 extension is bundled and operators (->>) are added
    Moreover sqlite allows to create json table columns even without the extension

    I understand that it's bad idea to maintain 2 syntax for queries and json indexes are must have but let's split the job into parts.

    Just adding a database type without indexes already a win and start of adoption.
    Next is optimizations like indexes, it always has bugs and sometimes it takes more then decade to fix (like 1148856)

    For me it's a matter of documentation - if you wanna use sqlite fullpower then you need latest version

  • 🇺🇸United States dww

    Very much agree with #4 that SQLite support shouldn't be a blocker for this. Seems like it should be easy to conditionally support the feature if your SQLite install is new enough, even in 10.2.x or whatever.

  • 🇺🇸United States bradjones1 Digital Nomad Life

    I will agree-agree with #4 and #5. Basically, if you are implementing this on SQLite then you are in expert mode and can read the requirements report and Drupal should be configured to fail early if the requirements for the operation in question are not met. We have particular flexibility for this because this is all greenfield development; there's no BC consideration because there's no existing implementation to break.

  • 🇺🇸United States bradjones1 Digital Nomad Life

    Allow me to respond to #3 a little more in depth, there are some good issues here to get early agreement on:

    Discussed with @daffie and @longwave at length to try to provide some guidance on how to proceed here. The main issue is that as explained by @daffie, in practice #3373370: Support JSON fields in Sqlite schema requires SQLite 3.39. Unfortunately we cannot bump the minimum SQLite minimum version in a minor release, so that will have to wait until Drupal 11

    As touched on a little by my previous comment and also explicitly in #5, the JPTR syntax is pretty key for the functionality we're discussing here (ability to do queries on object subcomponents.) I'm not a core committer and this is an idea that certainly is up for debate, but since this is new functionality I don't think there's anything wrong with, in effect, feature-flagging the SQLite implementation of entity queries on JSON field subcomponents behind the appropriate minimum SQLite version. This is 100% greenfield functionality and if you want it, easy enough to upgrade if you're "still" on D10. The applicable core tests would run on the soon-to-be-rolled-out PHP 8.3 runner which should bundle the appropriate SQLite version - see #3365510: Create a DrupalCI Environment for PHP 8.3 .

    Unfortunately, this means that we cannot commit a lot of the changes required for this to core until we open Drupal 11 for development, and commit the initial alpha blockers. This means that the first commits for this can't happen until at least November-December 2023.

    I think this approach renders this concern mostly moot. The minimum MySQL and PGSQL drivers both support this, so you will have full test coverage there within the full spectrum of supported versions (and, let's be honest, the broadest set of databases actually implemented in production) and you'll cover SQLite in the test suite on at least 1 variant. We have the tools to roll this out in the D10 cycle. Even if the core team disagrees with all this reasoning and says it can only be done for D11... OK, fine. The timeline for this project extends well into Nov/Dec (we're still negotiating a contract, haven't even kicked off yet) and I think there is substantial and growing appetite to make this happen.

    We could start the work in an issue that has a fork of Drupal that would be maintained by maintainers + subsystem maintainers, and review changes holistically on that issue. This would be preferred by @daffie as well because he would like to be able to provide some of those reviews.

    Again, I think this is overkill. We'll have a CI runner that can run these tests on all supported drivers imminently, and I don't personally see the need for any forking complexity.

    How does this work in terms of the timeline for DA and @bradjones1?

    I'm eager for the DA to finish its contracting but again, I don't see anything here that holds this up in any significant way, if we can agree on a path forward (conditional support for this in SQLite being a minor issue, IMO.)

  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    👋 @lauriii asked me to chime in here.

    What would the API look like that #4, #5 and #6 are proposing?

    We throw unsupported operation exceptions for API usage that isn't supported because of the sqlite version?

    Feels pretty nasty, what if you're installing something via e.g. update module or project browser and the first time you know that a module you're using has this feature is a WSOD.

    Or alternatively it would require all calling code to try/catch for that.

    Or alternatively again we'd need a whole new API like Database::supports(DatabaseFeatures::JSON_STORAGE) and anyone using the new features would need to check that first. Possibly in hook_requirements in the INSTALL phase.

    Solving this problem don't seem like it should be the purview of the winning pitch, as it would eat a large amount of their time.

    So that probably does require the work to be done in a fork or similar, unless we can turn around something fast on that front.

    Can those advocating for conditional support expand a bit on how that might look?

  • 🇫🇷France andypost

    There's Database::getConnection()->hasJson() already and hook_requirements

    https://git.drupalcode.org/project/drupal/-/blob/11.x/core/modules/syste...

    So the question is minimal sqlite version we support to allow operators (-> and ->>) 📌 Support JSON fields in Sqlite schema Closed: duplicate

  • 🇺🇸United States bradjones1 Digital Nomad Life

    Thanks Lee for chiming in.

    What would the API look like that #4, #5 and #6 are proposing?

    I think it's a mix of an expanded status report message (which would now include support for the JPTR expression syntax) and warn there. In addition, I think it's totally reasonable for contrib modules to express a requirement for Drupal 10 >= the minor version this goes into; there are plenty of examples in the wild of contrib modules that already do this in their releases to incorporate new APIs as they are introduced... that's one already common layer of defense. Meanwhile these same modules can test the requirements at install time.

    We throw unsupported operation exceptions for API usage that isn't supported because of the sqlite version?

    I'll admit that my feelings on this might not reflect the same as the core team's comfort level, but... yeah. Since this is a greenfield addition and any _core_ implementation would test for support and more gracefully fail... I think failing hard and fast if contrib code doesn't properly test for support.

    Feels pretty nasty, what if you're installing something via e.g. update module or project browser and the first time you know that a module you're using has this feature is a WSOD.

    I am not an expert on project browser/in-place updating however I would imagine that the in-place updater needs to account for crashing during update... anyway?

    Or alternatively again we'd need a whole new API like Database::supports(DatabaseFeatures::JSON_STORAGE) and anyone using the new features would need to check that first. Possibly in hook_requirements in the INSTALL phase.

    This method already covers the data type, I think there would be a secondary check for the querying support.

    Solving this problem don't seem like it should be the purview of the winning pitch, as it would eat a large amount of their time.

    Agreed, however I think reasonable minds can find a way to help roll this out in the D10 cycle... IMO it's less about the sponsored development project and more about forging a policy that respects the stability concerns while also proving out how we can even more rapidly introduce new features like this in the future.

  • 🇺🇸United States bradjones1 Digital Nomad Life

    At the risk of being too optimistic, I think my findings in #3343634-23: Add "json" as core data type for Schema API render most of this conversation about requirements basically moot.

    TL;DR: The operation of -> and ->> between MySQL, SQLite and PostgreSQL are so inconsistent that we're unlikely to want to use them in constructing our SQL, and we don't need to. Particularly in the case of SQLite, the most predictable way to query JSON data is JSON_EXTRACT(), which is present from the beginning of their JSON support. No need to wring our hands about requiring any version that includes the -> operator.

    I'm kind of embarassed I didn't notice this earlier but this stuff is... complex. The linked issue's test suite really got me mucking around in the internals of each of these databases to see how different they really are. But, at their core, they all support querying by jsonpath. We'll need to still do a lot of work around what we want to support with respect to entity queries... but I think/hope this should unlock this part of the conversation.

  • 🇺🇸United States bradjones1 Digital Nomad Life
  • 🇺🇸United States bradjones1 Digital Nomad Life
  • 🇺🇸United States bradjones1 Digital Nomad Life
Production build 0.71.5 2024