PostgreSQL: Support JSON data type

Created on 16 April 2015, almost 10 years ago
Updated 27 July 2023, over 1 year ago

Problem/Motivation

I am trying to create a field type that leverages the JSON and JSONB data types in Postgres. My goal is to have a field that can store JSON objects who's properties can be indexed. This is a Postgres specific feature.

Proposed resolution

Add json:normal and jsonb:normal mapped to json and jsonb Drupal field types for pgsql driver, Schema::getFieldTypeMap().

      'json:normal' => 'json',
      'jsonb:normal' => 'jsonb',
      );
    return $map;
  }

json vs. jsonb
The json and jsonb data types accept almost identical sets of values as input. The major practical difference is one of efficiency. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.
https://www.postgresql.org/docs/current/datatype-json.html

Remaining tasks

  • Review field type map and approach

Data model changes

json and jsonb field types are available for use in schema for pgsql driver.

Release notes snippet

Basically I want to be able to paste JSON or parse a CSV with another custom module and store the object in this field with the intent of being able to query this data, treating it like a NoSQL store. Since Mongo isn't fully implemented, and would require two databases running instead of one, Postgres seems like a natural fit.

✨ Feature request
Status

Closed: duplicate

Version

11.0 πŸ”₯

Component
PostgreSQL driverΒ  β†’

Last updated 3 days ago

No maintainer
Created by

πŸ‡ΊπŸ‡ΈUnited States kevinquillen

Live updates comments and jobs are added and updated live.
  • PostgreSQL

    Particularly affects sites running on the PostgreSQL database.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica
  • πŸ‡¨πŸ‡¦Canada gapple

    I commented in the MySQL issue ✨ Support JSON fields in MySQL schema Closed: duplicate , that MySQL has a single JSON type that is most similar to Postgres' jsonb, which could possibly cause some confusion or unexpected behaviour if Drupal's json:normal field type acts differently between database engines.
    The META issue ✨ Add "json" as core data type Active might be the place to discuss the expected DB behaviour (raw vs. normalized), and whether Postgres' convention should be adopted by Drupal. (The Postgres documentation does recommend that jsonb be used as the default, unless legacy needs require the json data to not be normalized).

  • Status changed to Needs review over 1 year ago
  • last update over 1 year ago
    29,804 pass
  • πŸ‡«πŸ‡·France andypost

    re-roll and changed json:binary for jsonb

    Also it could use pgsql_type in spec to provide this variation to keep it inline with Mysql - Re #49

    Patch also removes try/catch and ref to policy issue, added type-hints

  • last update over 1 year ago
    29,804 pass
  • last update over 1 year ago
    29,782 pass, 2 fail
  • last update over 1 year ago
    Custom Commands Failed
  • πŸ‡«πŸ‡·France andypost

    I removed jsonb as it could be used providing pgsql_type and to keep the driver inline with others

    Also copied the test to test every returned datatype from πŸ“Œ Support JSON fields in Sqlite schema Closed: duplicate

    The difference is that bool returned as "true" when sqlite returns 1 (so pgsql results are typed)

  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    29,803 pass
  • πŸ‡«πŸ‡·France andypost

    fix CS

  • last update over 1 year ago
    29,802 pass, 1 fail
  • last update over 1 year ago
    29,803 pass
  • last update over 1 year ago
    29,803 pass
  • last update over 1 year ago
    29,803 pass
  • last update over 1 year ago
    29,803 pass
  • last update over 1 year ago
    Custom Commands Failed
  • πŸ‡«πŸ‡·France andypost

    simplified test

  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    Custom Commands Failed
  • last update over 1 year ago
    29,803 pass
  • πŸ‡«πŸ‡·France andypost

    fix cs

  • last update over 1 year ago
    29,803 pass
  • last update over 1 year ago
    29,803 pass
  • Status changed to RTBC over 1 year ago
  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica

    Thanks @andypost for pushing this forward an unifying on json:normal

    The PostgreSQL documentation says about json vs. jsonb:

    The json and jsonb data types accept almost identical sets of values as input. The major practical difference is one of efficiency. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.

    (https://www.postgresql.org/docs/current/datatype-json.html)

    I added that to the issue summary.

    As the implemented data type here is now the same as for the other database engines (see meta issue), I'll set this RTBC.

    But we should discuss, how to proceed with jsonb and potentially jsonpath (see https://www.postgresql.org/docs/current/datatype-json.html#DATATYPE-JSON...) for PostgreSQL.
    Should that simply be provided by a contrib module? Or as a follow-up here?

  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica

    Moving "Needs framework manager review" tag to the parent meta issue: ✨ Add "json" as core data type Active

  • Status changed to Needs work over 1 year ago
  • πŸ‡³πŸ‡±Netherlands daffie
    1. +++ b/core/modules/pgsql/src/Driver/Database/pgsql/Schema.php
      @@ -466,6 +466,8 @@ public function getFieldTypeMap() {
      +      'json:normal' => 'json',
      

      Lets use the JSONB field instead of the older JSON field.

    2. +++ b/core/modules/pgsql/tests/src/Kernel/pgsql/SchemaTest.php
      @@ -243,4 +243,57 @@ public function testPgsqlExtensionExists(): void {
      +    foreach ($test_data as $data) {
      +      $path = $data[0];
      +      $expected = $data[1];
      

      Can we rewrite this to foreach ($test_data as $path => $expected) {

    3. +++ b/core/modules/pgsql/tests/src/Kernel/pgsql/SchemaTest.php
      @@ -243,4 +243,57 @@ public function testPgsqlExtensionExists(): void {
      +    $query->addExpression('test_field -> :list -> 1', NULL, [
      +      ':list' => 'list',
      +    ]);
      

      Can we change this to: $query->addExpression('test_field -> 'list' -> 1');

    4. In the other 2 related issues the added testing has much overlap. Can we put that in a single test instead of duplecating the code.
    5. In the other 2 related issues the value of $test_data is different. Can we make the testing values the same?
  • πŸ‡«πŸ‡·France andypost

    I don't think we can use jsonb out of box because it doing normalization and removing duplicate keys (even in arrays)

    So if someone willing to use it they always can set it by pgsql_type

    Still not clear if we can provide alternative to addExpression() method

    +++ b/core/modules/pgsql/tests/src/Kernel/pgsql/SchemaTest.php
    @@ -243,4 +243,57 @@ public function testPgsqlExtensionExists(): void {
    +        $query->addExpression('[test_field] -> :path1 ->> :path2', NULL, [
    ...
    +        $query->addExpression('[test_field] ->> :path', NULL, [
    ...
    +    $query->addExpression('test_field -> :list -> 1', NULL, [
    

    ++ to move 2 arguments case out of loop to make it more clear

    but the biggest question is the last case where 1 (array index) can't be passed as argument and that's why I did hardcoding it into query

  • πŸ‡³πŸ‡±Netherlands daffie

    I don't think we can use jsonb out of box because it doing normalization and removing duplicate keys (even in arrays)

    I do not have a problem with removing duplicate keys. For me that is a plus. Doing normalization is also a plus for me. Far more important for me is the fact that you can do indexing on JSONB fields. Without an index we will be doing full table scans and that is a no-go for me.

  • πŸ‡ΊπŸ‡ΈUnited States bradjones1 Digital Nomad Life

    I would agree that indexing is a huge consideration here and is a strong +1 for jsonb. It is also more read performant, per the docs, despite being slightly slower on initial input.

    The normalization and de-duplicating array keys doesn't feel like a blocker; is there a practical example of where this would be an issue for current code?

    The point about being able to opt to the alternative implementation (json vs. jsonb) is a good one but I think whatever we go with needs to be a sensible default for the 90+% usecase. It wouldn't hurt also to add in some assertions for input data that could be triggered at least in development, e.g. if you attempt to store data that will be truncated (e.g., duplicate keys) if the default implementation is jsonb.

  • πŸ‡¨πŸ‡¦Canada gapple

    +1 for jsonb

    as I noted in #49:
    - the postgres docs recommend jsonb
    - mysql's json type doesn't preserve duplicate keys, does order object keys, and performs whitespace normalization

    I don't think removing duplicate keys should be a concern - any value in a PHP data type won't have duplicate keys, and if an input to json_decode has a duplicate key only the last value will be preserved (same behaviour as postgres & mysql).
    JSON objects are defined as being unordered, so any implementation using it shouldn't rely on order being preserved. It's probably only worth a documentation note that order preservation shouldn't be expected when storing and retrieving from json database fields.

  • Status changed to Closed: duplicate over 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States bradjones1 Digital Nomad Life

    Per #3343634-14: Add "json" as core data type for Schema API β†’ this is being consolidated into the parent issue for a single change.

    Also appears the jsonb change was made on the parent, resolving #63.

Production build 0.71.5 2024