Calculate field and component dependencies on save and store them in an easy to retrieve format

Created on 27 June 2024, 10 months ago

Problem/Motivation

📌 Prevent modules from being uninstalled if they provide field types used in an Experience Builder field Fixed fixed an issue where if a field was used in an XB field (i.e. static prop) it prevents uninstall of the module providing the field type.

However, this requires manually writing JSON expressions (JSON_EXTRACT) which is not yet well supported by all core database drivers. See https://git.drupalcode.org/project/experience_builder/-/commit/8888c2eb8...

I think it would be more robust if when saving the entity, we calculate the field (and component, because there's likely to be a similar issue with those) dependencies, and store them in a more predictable structure (like a flat JSON array or something) outside the tree structure. This would be easier to inspect manually for humans trying to debug, and should be be easier to support and more performant across database types.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

📌 Task
Status

Active

Component

Data model

Created by

🇬🇧United Kingdom catch

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @catch
  • 🇬🇧United Kingdom catch
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
  • 🇬🇧United Kingdom catch

    Note that the 'field union json' approach on JSON-based data storage proposal for component-based page building Active would probably solve this problem for field types (the field union type would be in a relational varchar column even if the values are in JSON, allowing us to check dependencies via that), but I don't think it would solve it for components, which whatever we do almost certainly need to be in a nested JSON tree structure.

  • Status changed to Postponed: needs info 9 months ago
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    @catch

    1. Would you agree that this could be a fallback solution if 📌 Ensure querying JSON nested values when parent keys are unknown is possible in all supported databases Needs work ends up not working, i.e. an escape hatch?
    2. Would you agree that 📌 Ensure querying JSON nested values when parent keys are unknown is possible in all supported databases Needs work is preferable?
  • Status changed to Active 9 months ago
  • 🇬🇧United Kingdom catch

    @Wim Leers - short answer, maybe once it's been proved to scale to tens or hundreds of thousands of complex entities with dozens of revisions each on all supported databases? But if we don't build something that does that prior to a stable (or really, beta) release, then we are looking at having to update tens of hundreds of thousands of entities and their revisions to add the denormalization after the fact, which would not be good.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Good point. 🤔

    Your thinking is: it's gonna be easier to drop these denormalizations in the future, if we end up not needing them. Is that fair?

  • 🇬🇧United Kingdom catch

    That's one reason, but also:

    Let's assume that mysql, sqlite and pgsql can all run the queries needed in 📌 Ensure querying JSON nested values when parent keys are unknown is possible in all supported databases Needs work . Even if they can, they each have different syntax for this. They also have different ways of handling indexes on JSON fields.

    The database API does not have an abstraction layer for all of this, there is Add "json" as core data type Active but I don't know if that can handle nested values where the parent is unknown per the title of the other issue. Even if it can do that in the database layer, that concept currently doesn't exist in the entity query API (which is what we should be using for query entities) so would have to be added there too somehow.

    So that's two levels of abstraction for JSON queries that currently does not exist either in core or contrib. Equally, XB should not be hardcoding JSON query syntax itself, whether mysql-only, or switching on pgsql or sqlite, and it's not a good use case for backend-overridable either - because these are entity queries not some custom storage layer.

    Additionally, XB is currently the only use-case we have for adding arbitrary nesting support for JSON fields to entity query, but once it's in there we can't take it out again easily. So if all of that has to happen before it can be proven that it will scale to a medium-sized Drupal site (let alone a massive one), then it turns out it can't, that's an incredible amount of work wasted - and then we have a new API in core that we either have to support because other people started using it, or deprecate and remove again.

    I also feel like doing all of this is a slippery slope towards providing views support for arbitrary nested JSON values and other things that have been discussed in the JSON storage issue but personally I would very much like to avoid.

  • Assigned to bhuvaneshwar
  • 🇬🇧United Kingdom catch
  • 🇮🇳India bhuvaneshwar

    @catch, I'm experiencing some permission issues with git access. However, based on my findings, I suggest the following approach:

    In the widget, we can define the class as:

    class ExpressionFieldWidget extends WidgetBase {

    /**
    * {@inheritdoc}
    */
    public function formElement(FieldItemListInterface $items, $delta, array $element, array &$form, FormStateInterface $form_state) {
    $element['value'] = [
    '#type' => 'textfield',
    '#title' => $this->t('Expression'),
    '#default_value' => isset($items[$delta]->value) ? $items[$delta]->value : NULL,
    ];

    return $element;
    }
    }

    Then, we can define the field storage:

    function experience_builder_install() {
    // Define the field storage for the expression field.
    \Drupal::service('entity_type.manager')->getStorage('field_storage_config')->create([
    'field_name' => 'field_expressions',
    'entity_type' => 'node',
    'type' => 'expression_field',
    'cardinality' => FieldStorageDefinitionInterface::CARDINALITY_UNLIMITED,
    ])->save();
    }

    For querying the new field, we can use:

    $query = \Drupal::database()->select('node__field_expressions', 'e')
    ->fields('e', ['entity_id'])
    ->condition('e.value', '%' . $field_expression . '%', 'LIKE');
    $result = $query->execute();

    By transitioning to a standard Field API field that accepts multiple values, we can avoid database-specific JSON manipulation issues and ensure the code is compatible across different database systems.

    Please let me know your thoughts. Once I get the access issue resolved, I will apply these changes if everything looks good to you.

  • 🇬🇧United Kingdom catch

    I don't really understand storing the field expressions or needing a widget - it ought to be possible to store just the field type itself that is in use, and this can be calculated and put into the field value in entity presave.

    There will also need to be a field and appropriate logic for storing the components that are in use too.

  • Issue was unassigned.
  • Assigned to effulgentsia
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    So, @catch, are you essentially proposing this?

       /**
        * {@inheritdoc}
        */
       public static function propertyDefinitions(FieldStorageDefinitionInterface $field_definition) {
         $properties['tree'] = DataDefinition::create('component_tree_structure')
           ->setLabel(new TranslatableMarkup('A component tree without props values.'))
           ->setRequired(TRUE);
     
         $properties['props'] = DataDefinition::create('component_props_values')
           ->setLabel(new TranslatableMarkup('Prop values for each component in the component tree'))
           ->setRequired(TRUE);
     
         $properties['hydrated'] = DataDefinition::create('component_tree_hydrated')
           ->setLabel(new TranslatableMarkup('The hydrated component tree: structure + props values combined — provides render tree for client side or render array for server side.'))
           ->setComputed(TRUE)
           ->setInternal(FALSE)
           ->setReadOnly(TRUE)
           ->setRequired(TRUE);
     
    +    $properties['dependencies'] = DataDefinition::create('component_tree_props_dependencies')
    +      ->setLabel(new TranslatableMarkup('All config and modules that the stored values of this XB field depend upon..'))
    +      ->setComputed(TRUE)
    +      ->setInternal(TRUE)
    +      ->setReadOnly(TRUE);
    +
         return $properties;
       }
    

    (in \Drupal\experience_builder\Plugin\Field\FieldType\ComponentTreeItem::propertyDefinitions())

  • 🇬🇧United Kingdom catch

    @Wim that will still require querying a flat JSON structure if I'm reading correctly. Better than a nested one but still has potential scaling issues. It could also be a separate unlimited cardinality field that stores a string (obviously this then means you need both fields to make XB work). But yes something like that.

  • 🇺🇸United States effulgentsia

    My initial reaction is whether the denormalization proposed by @catch would get out of sync and lead to all the problems we've had over the years with file_usage. But presumably we could add code somewhere that can rebuild it fairly easily, which I think would alleviate that?

  • 🇬🇧United Kingdom catch

    @effulgentisa file_usage is a count of usages of a file, in a custom database table, across a large number of entities. There is no way to rebuild it because the source of the usage is not recorded, that's one of the reasons we had so many problems with it.

    The idea here is when saving a single entity, to save the dependency information to an additional field (or field property) on the entity itself. There is no way it could get out of sync unless someone ran direct updates against the field tables.

    If you look at config dependencies:

    Taking core/profiles/standard/config/install/field.storage.node.field_tags.yml

    langcode: en
    status: true
    dependencies:
      module:
        - node
        - taxonomy
    id: node.field_tags
    field_name: field_tags
    entity_type: node
    type: entity_reference
    settings:
      target_type: taxonomy_term
    module: core
    locked: false
    cardinality: -1
    translatable: true
    indexes: {  }
    persist_with_no_fields: false
    custom_storage: false
    

    Probably you could determine that this config depends on taxonomy, because the target_type is taxonomy_term, and that it depends on node, because the entity_type is node, and these could be traced back to the modules that provide those entity types, instead of having a dependencies array. But the dependencies key is an inherent property of the config entity, and it's updated every time it's saved, give or take manual editing of files and config import.

    Note that if 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active happens, then the component would already be stored as a string in a multi-valued field. And then depending on 🌱 [META] Configuration management: define needed config entity types Active and similar issues (there might be a more relevant one I can't find), assuming the dependencies of the component are defined in config, not dynamically in the content itself, then all of the dependency information could be deduced from the component.

    e.g. - try to delete a field, check config dependencies of components, check whether content uses the component.

    Then this issue would have been done implicitly, without any explicit denormalization or duplication of data - but those issues do not have defined solutions or agreement yet, and the data loss and database cross-compatibility issues are still unresolved, so I think this issue would be a small step forwards if those are not being prioritised yet.

  • Assigned to wim leers
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Let's do this as a stepping stone towards 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active .

    I'm still collecting my thoughts, but there's at least 3 use cases already identified for adding more field properties (i.e. expanding ComponentTreeItem::::propertyDefinitions()) and hence DB columns:

    1. the dependencies implied by a component tree and its inputs
      👉 (this issue)
    2. multi-valued XB field with a unique exposed_slot property per delta, to target a specific exposed slot in a ContentTemplate config entity, i.e. for the per-content entity unique creative freedom while the overall component tree is guaranteed to be consistent by the ContentTemplate
      👉 being worked on at Content templates, part 3b: store exposed slot subtrees on individual entities Active , as part of 🌱 [META] 7. Content type templates — aka "default layouts" — clarify the tree+props data model Active
    3. (surfaced yesterday by @lauriii at DDD and discussed at length with @pdureau and others at DDD) the fact that some SDC props may need to be considered either translatable or not (see 📌 Allow Experience Builder fields to support Asymmetric and Symmetric translations Needs review ) — for example an image in some SDC might be considered translatable or not — details TBD
      👉 issue to still follow

    (Not linking any of those issues nor crediting those people, because that's what that follow-up will be for. I'm just writing it down as context, and as a reminder for myself when I do write that next week.)

    I will implement basically what I wrote in #15, but @catch was concerned in #16 that would still require a flat JSON structure.
    AFAICT he thought that because it'd store both config and module dependencies:

      config:
        - a
        - b
      module:
       - x
    

    But that's easily transformed to:

    config:a config:b module:x
    

    … at which point it's a flat list that is trivially queryable :)

    Given what @catch wrote in #18, I think that my thinking has arrived at the same point: we basically want for each row for an XB field revision a flat list of dependencies, that is functionally equivalent to what a config entity's dependencies stores. IOW: we want the equivalent of \Drupal\Core\Config\Entity\ConfigEntityInterface::calculateDependencies(), but for the XB field, to allow efficiently (and with 100% confidence — unlike file_usage) determining which XB fields (and even which revisions of XB fields) are impacted by a potential change in available modules/themes/config.

    IOW:

    so I think this issue would be a small step forwards if those are not being prioritised yet.

    💯

    P.S.: I'm no DB nor SQL expert, so perhaps it seems super naïve. That's fine! I just want to restart this conversation, @catch! 😄

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    wim leers changed the visibility of the branch 3457504-0-x to hidden.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Working PoC live, with plenty of @todos.

    Thoughts, @catch? 😊

  • 🇬🇧United Kingdom catch

    I will implement basically what I wrote in #15, but @catch was concerned in #16 that would still require a flat JSON structure.
    AFAICT he thought that because it'd store both config and module dependencies:

      config:
        - a
        - b
      module:
       - x
    

    But that's easily transformed to:

    config:a config:b module:x
    … at which point it's a flat list that is trivially queryable :)

    hmm this wasn't actually what I was concerned about, my concern was about trying to store multiple values in one column. I did not think about imploding the array into a literal string, but pretty sure this also will have scalability issues. How it would compare to a JSON query I don't know.

    The problem with the implode() is that MySQL LIKE queries can only use indexes from left to right until they hit a wildcard, so if you have an index on a string field, you can do a query like:

    SELECT * FROM bar WHERE baz LIKE 'something%' and that is fine.

    But if you need to get rows matching an arbitrary part of the string, like config:b in the example above the query is this:

    SELECT * FROM bar WHERE baz LIKE '%config:b%'

    And this does not use an index. Random blog post from google with more detail: https://planetscale.com/learn/courses/mysql-for-developers/indexes/index...

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    I just pushed a sequence of commits:

    1. 100% of prop expressions now implement calculateDependencies()
    2. Then I started using that to implement DynamicPropSource::calculateDependencies() with essentially zero effort, with test coverage to prove it works correctly.
    3. … at which point it was trivial to impose ::calculateDependencies() for all 4 of XB's "prop sources", by adding it to abstract class PropSourceBase, implementing it took only a few dozen LoC
    4. The above enabled me to drastically simplify ComponentInputs::calculateDependencies(): it now just forwards the responsibility of calculating dependencies to each prop source, and the (Static|Dynamic)PropSources largely forward that responsibility again to the low-level "prop expressions

    The result (assuming the test coverage is expanded): simple code, and the responsibility for ensuring correct dependencies is pushed down to the individual code that is actually interacting with those dependencies!

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Next steps:

    1. review by @larowlan
    2. update Pattern::calculateDependencies() etc to leverage this new infrastructure
    3. update FieldTypeUninstallValidator(Test) to leverage this new infrastructure
    4. add missing low-level test coverage
  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    I think the next steps here are to perform some EXPLAIN queries against this data in JSON and string format and see whether we can query it efficiently. If there isn't an efficient way to query the multi-value nature of the field then I think a separate table makes more sense. SqlContentEntityStorage wraps hook_entity_insert and hook_entity_update in a transaction so we can ensure writes to that table are atomic. We can keep the logic @wim leers has written in the current MR but mark the fields as computed and use those hooks to update the dependencies table. We can make use of the new OO hooks to put the update/insert hook logic in a service and write tests for it. The service an make use of things like just-in-time table creation like we have for other services in core (i.e. no need to have a hook_schema)

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    @catch in #23:

    And this does not use an index.

    🤯 WTAF?! That is absurd!

    Thanks for super explicitly confirming my P.S. from #19:

    P.S.: I'm no DB nor SQL expert, so perhaps it seems super naïve. That's fine! I just want to restart this conversation, @catch! 😄

    … which you illustrated very well now 😅

    Which brings me to @larowlan's #26:

    I think the next steps here are to perform some EXPLAIN queries against this data in JSON and string format and see whether we can query it efficiently.

    I think @catch is saying the answer to that is "no". 😭

    I think a separate table makes more sense.

    … really? That's making things harder again. I'd really like to the data that belongs together, stored together. That'll make debugging much simpler.

    @catch, any ideas/suggestions? 🙏

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Overhauled issue summary to outline the full plan in a fair amount of detail.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Solved a long-standing @todo in multiple config entity types by executing part of the plan! :D

    This now also explicitly surfaces what @larowlan predicted in his reviews: ContentTemplateValidationTest::testEntityIsValid() now fails with MissingHostEntityException.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    This would also unblock Usage info for code components with unpublished changes Postponed !

  • 🇬🇧United Kingdom catch

    In the original issue summary you wrote: store them in a more predictable structure (like a flat JSON array

    So with LIKE '%foo%'; I know it performs badly. I do not know how well SQL queries on flat JSON arrays perform, so that would need benchmarks with sqlite, MySQL, and PostgreSQL to check, ideally with hundreds of thousands of rows of realistic-ish data.

    However: 'just an unlimited cardinality field API field) outside the tree structure' is 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active as far as I'm concerned, unless that issue never happens in any form and it would have to be its own special field while everything else is in the JSON blob.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Thanks, that's very helpful! I'm just baffled that this is the reality in 2025 — where to get the simple thing you first kinda need to make it more complicated 😅

    So, to make absolutely sure we're on the same page, you think that/propose:

  • 🇬🇧United Kingdom catch

    Side note:

    I'm just baffled that this is the reality in 2025 — where to get the simple thing you first kinda need to make it more complicated

    The thing that makes this all easy is 'MongoDB', and there is [meta] Add database driver for MongoDB to Core as experimental Active , but we can't actually require MongoDB in core because we don't even support it yet let alone a migration path or widespread hosting support. I guess that is 'to make things easy you need to make them more complicated' too.

    I'm not entirely sure what #36 is proposing, so I'll try to summarise what I think, there are aspects I'm not sure about too so trying to also make those clear. Numbering to make it easier to reference:

    #1

    you think that/propose:

    this issue should lead by example and NOT store additional props, rather than waiting for 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active

    I can't answer that, because I don't know why this issue is being worked on before 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active or when that issue will be prioritised.

    Is this issue blocking that one? Is that issue blocked on other ones? Is this issue blocking other urgent issues? There is really not much visibility for me into why various issues get worked on or not at different times with Experience Builder. Both of these issues have been open for some time now. For me, if this issue can't be fixed properly without 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active then there are two or three choices - implement a workaround that will definitely have to be thrown away later and redone, or postpone this issue on that one. I realise there's a lot going into this issue that's not the storage format and querying of that storage so there might be an in-between option too. I'm also not entirely sure how this issue ties into configuration for entity templates/displays (as opposed to xb entity data) so maybe dependency calculation for config could be done as an interim step before it's done for content?

    #2.

    We're likely to need a flat JSON structure for field values in 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active (to support multiple cardinality component props), so that issue is not completely getting rid of the need for JSON storage, just the nesting. However the assumption in that issue and others is that querying the prop values is going to be very rare or not happen at all, so JSON in that case is being used as a serialization format, not for listing queries/dependency calculations.

    This means that at some point, XB is going to run up against Add "json" as core data type Active , but it might not run up against the performance of JSON conditions, or only in specific situations like site-specific views filters where someone made a choice that turned out to be wrong. Whereas dependency calculation is in the critical path, albeit not high frequency.

    So if we think that each 'component row' can express it's dependencies in a one or two single varchar column strings then it is probably worth avoiding the JSON condition performance question to the last possible point (and avoiding it entirely for dependency calculation). Sites that need to query XB component props with ultra-fast performance on large data sets will eventually be able to run on MongoDb (or punt something to search API views with solr). But any site running XB will need to do dependency calculations and those shouldn't take multiple seconds to look up.

    #3. From a pure storage/query performance perspective, a flat JSON structure might work out OK. The problem is we don't know, and it's the first time it would be used at scale in Drupal anywhere, and Add "json" as core data type Active is not done yet to make it easy to test. If we researched it now, it would give us information about how queryable the flat JSON structure is for prop data too, which would be useful information. So I don't think effort working on it would be wasted necessarily , it's just a question of do we (e.g. you :halo:) want to spend that effort given the unknowns. What I'm pushing for here is to understand the various trade-offs as early as possible to avoid nasty surprises later.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    #37:

    #1

    I can't answer that, because I don't know why this issue is being worked on before

    Because I view this is a step that unblocks other things, whereas #3477428 is a HUGE change.

    So, I thought that this could be a nice step towards #3477428 — quoting yourself from #18: .

    Is this issue blocking that one? Is that issue blocked on other ones? Is this issue blocking other urgent issues? There is really not much visibility for me into why various issues get worked on or not at different times with Experience Builder.

    I can promise you it's super hectic inside our team too. 😬 It's constantly changing based on shifting priorities not decided by me; if it were up to me we'd be finishing one layer before building the next one on top of it. Alas.

    I'm working on a plan/meta issue for stabilizing the data model: 🌱 [META] Production-ready data storage Active . I see I didn't make that the parent of this issue yet — fixed. That meta issue culminates in 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active . Why not start with it? Because the current data storage doesn't yet do all the things we need to do, and nor does #3477428. It's safer to first evolve what we have to do all the things we need to do rather than starting with the enormous change that is #3477428.

    Both of these issues have been open for some time now.

    Because the goal until recently has been "add features, show potential", not "stabilize". We just started stabilizing, towards 🌱 Milestone 1.0.0-beta1: Start creating non-throwaway sites Active .

    I'm also not entirely sure how this issue ties into configuration for entity templates/displays (as opposed to xb field data)

    Because XB's key data structure, "the component tree" is shared between content and config. They're stored differently (2 JSON blobs in a config YAML or 2 JSON blobs in a DB row), but they're internally, in memory, while executing logic, exactly the same. Search for the string ComponentTreeMeetRequirements and you'll find the same validation constraint used in both content and config! Look at the validation logic for that constraint, and you'll see that both end up working with a ComponentTreeItem object — yes, a field item instance. While certainly unusual, this ensures that we don't need to write equivalent (validation) logic twice.

    That's why the code changes in this issue seem to be about the XB field type solely, but they also impact config entities: because e.g. Pattern::getComponentTree() also just returns … a ComponentTree, conjured/instantiated from the two JSON blobs stored in config, and then we can simply use the same logic to determine the full set of dependencies. Because regardless of how component trees are stored, they ALL end up having dependencies on config, modules, etc.

    #2

    Can I summarize that as: flat JSON structure more so for ease of access, not for querying itself?

    #2

    I think this means I can summarize it as "let's try and find out".

    f we researched it now, it would give us information about how queryable the flat JSON structure is for prop data too, which would be useful information. So I don't think effort working on it would be wasted necessarily

    GREAT! 😄 That's exactly what I want us to do here, then. 👍

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Per #38.3.

  • 🇬🇧United Kingdom catch

    Can I summarize that as: flat JSON structure more so for ease of access, not for querying itself?

    I think so but also if e.g. dependencies get stored in the same row as the flat json eventually, then it will result in easier querying of 'stuff in the row'. Should also be easier to query flat JSON than nested JSON - e.g. not having to do 📌 Ensure querying JSON nested values when parent keys are unknown is possible in all supported databases Needs work .

    #2 I think this means I can summarize it as "let's try and find out".

    Yeah that seems OK.

    Something I didn't spot in the MR yet was an uninstall validator or similar to use the new component dependency information (which makes sense given figuring out the storage format) but just to check what's in this issue vs. followups. e.g. if content depends on a component defined by module x, then there would need to be a ModuleUninstallValidator that checks the information being stored here and prevents uninstall. Same for themes, same for deletion of config-defined components. Is that all in scope here?

    There will need to be similar logic for config entities too, validators or onDependencyRemoval() etc.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    but also if e.g. dependencies get stored in the same row as the flat json eventually, then it will result in easier querying of 'stuff in the row'

    My thoughts exactly! :D

    Something I didn't spot in the MR yet was an uninstall validator or similar to use the new component dependency information

    That is listed in the proposed solution:

    1. update FieldTypeUninstallValidator(Test) to leverage this new infrastructure

    Same for themes, same for deletion of config-defined components. Is that all in scope here?

    I think not; proving it for one thing is sufficient for this issue IMHO, because then at least the quite complex code that currently exists as a PoC should become simpler. That's enough proof for this MR I think.

    There will need to be similar logic for config entities too, validators or onDependencyRemoval() etc.

    Yep. And we have some issues for ::onDependencyRemoval() already:

    1. 📌 Handle update and delete of Block component config entities Active
    2. 🐛 Cannot delete JS components due to component depending on them Active
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    In #19 I wrote about which additional data we know we'll need to store in XB fields , and that I'd create a follow-up issue to capture all that.

    I did that, in this new meta issue: 🌱 [META] Production-ready data storage Active . Please read the overview, and the proposed resolution, especially the part.

  • 🇬🇧United Kingdom catch

    There is one more issue with the flat JSON arrays which is mentioned in #8 but not in more recent comments.

    Although a query on a JSON array can use an index, unlike LIKE %foo%, you still need to create the index, and core does not support this yet.

    Similarly, core does not support a condition syntax for querying JSON. All of the databases core supports handle JSON indexes completely differently, so it is not easy to workaround this at all.

    Support for both of these is added in Add "json" as core data type Active . So that core issue is likely to become a stable blocker for Experience Builder if this is the direction chosen.

    As an API addition, that only leaves about a month before the Drupal 11.2 rc. I've just been told in a slack channel that XB 1.0 stable release is intended for DrupalCon Vienna in October. A beta or rc release could depend on Drupal 11.3 which will be released in December, but a stable release can only depend on 11.2. That is a very short timeframe to finish JSON index and condition support. I do not know why that deadline has been chosen when issues like this are up in the air and would very strongly advise against committing to it any more than has already been done, vs. leaving some flexibility. Having to do major surgery to the database schema after a stable release would add a tonne of complexity and risk.

  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    #43 makes me think a separate table and computed properties is the easier path

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Although a query on a JSON array can use an index, unlike LIKE %foo%, you still need to create the index, and core does not support this yet.

    Similarly, core does not support a condition syntax for querying JSON.

    Argh! 😬

    To keep the multi-DB-table-joined-against-fields complexity out of this issue/MR (unless @catch or @larowlan have a nice reference/sample to look at? 🤞), let's:

    1. keep this issue scoped to storing space-separated strings (with slow performance because no indexes) as additional columns
    2. a subsequent issue to convert it to a joined-table approach (with good performance because indexes)

    The intent for me in this issue was to really focus on getting the necessary information stored and be made queryable. Performance is crucial, but we don't need to do everything in a single MR.

    P.S.: @catch, I'm so grateful to have you in this issue!!! ❤️🙏🙏

  • 🇬🇧United Kingdom catch

    multi-DB-table-joined-against-fields complexity unless @catch or @larowlan have a nice reference/sample to look at?

    The closest model to this I can think of is the entity_usage module, it has a table, with the entity type, entity id, revision id, language + target type and target entity id (and a count in case it's referenced twice from the same place).

    It is not quite the same thing, but I think if the result here is a standalone table, then it doesn't need a computed field or join, it can just be a thing to put the data into, run queries against, and then get a list/count of entity IDs.

    Schema is here, although if I was implementing this in a service I'd use the lazy table creation pattern rather than hook_install() these days.

    https://git.drupalcode.org/project/entity_usage/-/blob/8.x-2.x/entity_us...

    However, I also think that splitting this issue into two is a good idea because there's a lot in here that's not the data model. And if we're splitting it into two, then would it makes sense to try to get a bit further on JSON-based data storage proposal for component-based page building Active which might get rid of the need for the custom table altogether? Might be able to skip it then.

  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Walked @penyaskito through this issue, next steps for this specific MR, and the big picture at 🌱 [META] Production-ready data storage Active .
    In having him start on

    1. update FieldTypeUninstallValidator(Test) to leverage this new infrastructure

    we basically immediately got stuck, because the necessary information is missing. So AFAICT there's no way around what I wrote in #25:

    P.S.: IMHO core's config dependency infrastructure is missing a type in addition to config, content, module and theme: plugin. A module still being present is a very weak guarantee for a plugin being present. I think XB should lead the way here, but I'm curious what @larowlan and others think.

    What I wrote was too rough/inaccurate though! 😅 The Component config entity has a PluginExists constraint for the field types XB's "prop shape matching" assigned to it:

    experience_builder.generated_field_explicit_input_ux:
    <snip>
              field_type:
                type: string
                label: 'Default field type'
                constraints:
                  PluginExists:
                    manager: plugin.manager.field.field_type
                    interface: '\Drupal\Core\Field\FieldItemInterface'
    

    So that already prevents uninstallation of a field type. But … that chosen field type can change in the Component config entity, and then existing revisions will not have been updated (we can't go in and update millions of rows). So for component trees in content entities, we need the plugin information to be explicitly queryable. See detailed comment.

    Otherwise we won't be able to remove

          // @todo The "$.*.*.expression" wildcard key matching works in Mariadb but not Sqlite.
          //   We need figure out a way to make this work in Sqlite and PGSQL in https://drupal.org/i/3452756.
          $select->where("JSON_EXTRACT($column_name, '$.*.*.expression') LIKE '%$field_expression%'");
    

    which is the goal here.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Note that even long-term plugin dependencies likely make sense, because they allow for better error messages that are more actionable for end users. It's likely to help with when e.g. a block plugin or an SDC disappears from a module (that is quoted from 🌱 [META] Robust component instance error handling Active ).

  • 🇬🇧United Kingdom catch

    So that already prevents uninstallation of a field type. But … that chosen field type can change in the Component config entity, and then existing revisions will not have been updated (we can't go in and update millions of rows).

    Just to confirm. Is this something that can currently happen in Experience Builder that this issue will prevent via validation?

    Or is this something that is supposed to happen that this issue won't prevent via validation, but must account for?

    If it's the latter, then I'm a bit confused what would happen when you restore one of these revisions because presumably it would mis-match?

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Alright, extracting the "store in an efficiently queryable manner" part into 📌 [PP-1] Evaluate storing XB field type's "deps_*" columns in separate table Active .

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    #50: it's something that can currently happen if you implement hook_storage_prop_shape_alter() or if XB decides to change its default decisions in \Drupal\experience_builder\JsonSchemaInterpreter\SdcPropJsonSchemaType::computeStorablePropShape().

    Uninstallation of the module providing the field type that is:

    • currently referenced by a Component config entity is already validated, thanks to config dependencies.
    • currently in use in historically created component instances using a field type NOT matching the current Component config entity is already validated, thanks to \Drupal\experience_builder\FieldTypeUninstallValidator

    when you restore one of these revisions because presumably it would mis-match?

    They don't need to match. The component tree is entirely self-sufficient. It contains all the metadata for reconstructing a field item object out of thin air.

    For example, imagine a event SDC, which has a duration prop (shape: an object with two key-value pairs, named start and stop, both type: string, format: date). Then for that component instance, the following StaticPropSource is stored in the component tree:

    {"sourceType":"static:field_item:daterange","value":{"value":"2020-04-16T00:00","end_value":"2024-07-10T10:24"},"expression":"ℹ︎daterange␟{start↠value,stop↠end_value}"}
    

    From there, XB can reconstruct the field type, field storage settings, instance settings, (default) widget, etc.

    IOW: restoring old revisions would indeed bring back whatever field type was used at the time. That could mean that in theory you're restoring a revision using a field type that doesn't exist anymore. So we'll either need to prevent such a revision from being restored, or we'll let 🌱 [META] Robust component instance error handling Active handle that one component instance gracefully failing to render.

  • 🇬🇧United Kingdom catch

    They don't need to match. The component tree is entirely self-sufficient. It contains all the metadata for reconstructing a field item object out of thin air.

    For example, imagine a event SDC, which has a duration prop (shape: an object with two key-value pairs, named start and stop, both type: string, format: date). Then for that component instance, the following StaticPropSource is stored in the component tree:

    So if you restore the revision, then it is renderable by experience builder because there is sufficient metadata to render it, without the component configuration.

    But what if you then want to edit the entity and save it again - is there also sufficient information for that? And is that 'allowed'? (e.g. saving new revisions with what would not be allowed for new content?).

    IOW: restoring old revisions would indeed bring back whatever field type was used at the time. That could mean that in theory you're restoring a revision using a field type that doesn't exist anymore

    But isn't that what should be prevented via validation?

    I mean it's possible that a module would remove a field type from its code base without any kind of deprecation or upgrade path, but that's the fault of the module then and needs some kind of missing plugin error message (as can happen with Views etc.), but it shouldn't be possible to uninstall a field type.

  • 🇪🇸Spain penyaskito Seville 💃, Spain 🇪🇸, UTC+2 🇪🇺

    Fix adding deps_plugin to IS.

Production build 0.71.5 2024