Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values

Created on 27 September 2024, 1 day ago

Overview

Currently, the XB field type is a single-item field with two columns: tree and props, each defined as a JSON structure.

📌 "[PP-1] Consider not storing the ComponentTreeStructure data type as a JSON blob" (Active) proposes to store the tree relationally rather than as a JSON value.

Option #3 in the proposed resolution of "JSON-based data storage proposal for component-based page building" (Active) proposes to associate a field_union type with each component instance's static prop values. Code outside of XB could then use the information in the corresponding field_union config entities to implement Views integrations, migration tooling, search index configuration, other special view modes, etc.

This issue proposes to combine the de-jsonifying of the tree and the addition of the field_union type reference into a single refactor.

Proposed resolution

  • Change the field type from single-item to multi-valued. Each item would be for a single component instance.
  • Order (the delta column of) the items the same as how they appear in the left sidebar's Layers panel. This corresponds to how the component instances are ordered in the HTML when the page is rendered except where components render their slots in a different order than the order in which those slots are defined in the SDC's YAML.
  • Define the following columns (properties) in the field type:
    • instance_id (string)
    • component (string): Reference to the component config entity that defines the component that this is an instance of.
    • parent (string): The instance ID of the parent component in the tree. NULL for component instances that are at the top level of the tree.
    • slot (string): The parent's slot that this component is in. NULL for component instances that are at the top level of the tree or are in the default/unnamed slot of their parent (if in the future we decide to add support for default/unnamed slots).
    • data_sources (json): The sourceType and expression portions of what's currently in the props column (prior to this proposed refactoring) for this component instance. For example:
      {
        "prop1": {
          "sourceType": "dynamic",
          "expression": "ℹ︎␜entity:node:article␝title␞␟value"
        },
        "prop2": {
          "sourceType": "static:field_item:string",
          "expression": "ℹ︎string␟value"
        }
      }
      
    • static_values (json): The value portion of what's currently in the props column (prior to this proposed refactoring) for this component instance. For example, given the above example of data_sources, this could be:
      {
        "prop2": "Hello, world!"
      }
      

      The above example uses XB's current optimization of omitting the column/property name within the sourceType's field type if it's the sole property being used and it corresponds to the field type's mainPropertyName(). Given this issue's proposal to add a field_union reference (see below), we should evaluate if it would be better for people using that reference if we always explicitly included the column/property name, in which case the above would be:

      {
        "prop2": {
          "value": "Hello, world!"
        }
      }
      
    • field_union (string):
      Reference to the field_union config entity that defines the union of field types for this component. Alternatively, we could omit this from here and instead add the field_union reference to the component config entity. Since the component instance references the component config entity, this would just then be one more hop to get to the field_union config entity, but denormalizing the field_union reference into here might help with querying since Drupal doesn't have great support for JOINing on config entities (though that support could be improved if Drupal core refactored the config table to store as JSON instead of serialized PHP).


      Whether the field_union reference is in this field type directly, or only indirectly via component, it can be NULL in cases where static_values can't be, or wouldn't benefit from being, conformed to a field_union definition. Note that a field_union can be a union of fields that are of any type, including a JSON field, so the "wouldn't benefit from being" is more likely to be the case than "can't be". An example of this might be the values for block settings that we either can't or choose not to define field_union types for.
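As a rough illustration of how the proposed columns fit together, here is a minimal sketch in Python (standing in for the eventual PHP field type). The names ComponentItem, build_tree and flatten, the instance IDs, and the "example:*" component IDs are all hypothetical and not part of XB. It assumes that list position plays the role of the delta column and that items are stored in Layers-panel (depth-first) order.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ComponentItem:
    """One field item (delta) per component instance, using the proposed columns."""
    instance_id: str
    component: str
    parent: Optional[str] = None  # None (NULL) for top-level instances
    slot: Optional[str] = None    # None (NULL) for top-level instances
    data_sources: dict = field(default_factory=dict)
    static_values: dict = field(default_factory=dict)


def build_tree(items):
    """Group instance IDs by (parent, slot), preserving delta order."""
    children = {}
    for item in items:  # list position stands in for the delta column
        children.setdefault((item.parent, item.slot), []).append(item.instance_id)
    return children


def flatten(children, parent=None):
    """Depth-first walk that returns instance IDs in Layers-panel order."""
    order = []
    for (item_parent, _slot), instance_ids in children.items():
        if item_parent != parent:
            continue
        for instance_id in instance_ids:
            order.append(instance_id)
            order.extend(flatten(children, instance_id))
    return order


# Hypothetical page: a two-column layout with one child per slot, then a footer.
items = [
    ComponentItem("a1", "example:two_column"),
    ComponentItem(
        "b2", "example:heading", parent="a1", slot="first",
        data_sources={"prop2": {"sourceType": "static:field_item:string",
                                "expression": "ℹ︎string␟value"}},
        static_values={"prop2": "Hello, world!"},
    ),
    ComponentItem("c3", "example:button", parent="a1", slot="second"),
    ComponentItem("d4", "example:footer"),
]

tree = build_tree(items)
layers_order = flatten(tree)
```

Under these assumptions the tree can be rebuilt from the flat rows alone, and walking it depth-first gives back the stored delta order, so no separate JSON tree column is needed to recover the hierarchy.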

User interface changes

None

Risks

I ran this proposal by @Wim Leers before writing it up, and he pointed out that there could be nodes with hundreds of component instances on them, so this proposal creates the possibility of a multi-valued field item with more items in it than we're typically used to in Drupal, and we don't know what performance or memory issues FieldItemList and its related PHP objects will encounter at that scale. This is something we'll need to keep our eyes on, but I think there are two things that mitigate the risk:

  • I hope people don't actually put hundreds of component instances on a node. That wouldn't lead to a good authoring experience. A good design system, even if it includes some small components (atoms), should also include larger components (molecules, organisms) that content authors work with, so that content authors aren't in practice putting every atom one-by-one on a page.
  • If we need to, we can implement a list_class for the XB field type that's more suitable to very large lists than FieldItemList is.

Issue metadata

  • Type: 📌 Task
  • Status: Active
  • Version: 0.0
  • Component: Data model
  • Created by: 🇺🇸United States effulgentsia


Comments & Activities

  • Issue created by @effulgentsia
  • 🇺🇸United States effulgentsia

    Crediting @joachim, who proposed de-jsonifying the tree back in #3440578-51: JSON-based data storage proposal for component-based page building.

  • 🇺🇸United States effulgentsia

    Alternatively, we could omit [the field_union column] and instead add the field_union reference to the component config entity.

    A big advantage of this would be that it would allow the field_union module to be an optional dependency. XB could add the field_union config entities, and reference them from the component config entities, when the field_union module is enabled, and not do that when that module is not enabled, without that affecting the schema of the XB field type.

    denormalizing the field_union reference into [the XB field type] might help with querying since Drupal doesn't have great support for JOINing on config entities (though that support could be improved if Drupal core refactored the config table to store as JSON instead of serialized PHP)

    Given the advantage above of letting the field_union module be an optional dependency if we keep the field type normalized and only access an item's field_union config entity via its component config entity, I recommend doing that, and solving the querying use case by changing core's config table from serialized PHP to json.

  • 🇺🇸United States effulgentsia

    Added a Caveats section to the issue summary.

  • 🇬🇧United Kingdom catch

    Thanks for writing this up! I'm still digesting the proposed schema. Two minor things:

    so this proposal creates the possibility of a multi-valued field item with more items in it than we're typically used to in Drupal

    So I haven't actually used paragraphs, but I assume there's a single paragraph reference field that can have dozens if not hundreds of deltas in it referencing the paragraph entities. If so then we already have an equivalent example in the wild (except no extra entities for every row here). Also having written the last couple of sentences, I wonder if this starts to make an actual data migration from paragraphs more feasible.

    I recommend doing that, and solving the querying use case by changing core's config table from serialized PHP to json.

    We started discussing that in one of the JSON database support issues; it would allow us to remove the key value config stuff (which supports some limited querying now).

    However, I think we could work around not having that yet, just by running extra queries. E.g. if we want to find out whether a field type is used, we can get a list of field unions that use it, then a list of components that use those field unions, then run an IN(). Given the main current use case for that sort of querying is auditing, it should be OK.
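The extra-queries workaround could look roughly like the following sketch. Python and sqlite3 stand in for Drupal's PHP and database layer here. The table name xb_field_data, its columns, the field union names, and the "example:*" component IDs are hypothetical. The two dicts stand in for config entities that would be loaded from config storage, with each component referencing its field_union.

```python
import sqlite3

# Hypothetical stand-ins for config entities (not queryable via SQL JOINs).
field_unions = {
    "union_hero": ["string", "image"],
    "union_teaser": ["string", "link"],
    "union_quote": ["text_long"],
}
components = {  # component config entity ID -> field_union it references
    "example:hero": "union_hero",
    "example:teaser": "union_teaser",
    "example:quote": "union_quote",
}


def components_using_field_type(field_type):
    """Steps 1 and 2: field unions that use the type, then their components."""
    unions = {name for name, types in field_unions.items() if field_type in types}
    return sorted(c for c, union in components.items() if union in unions)


def instances_using_field_type(conn, field_type):
    """Step 3: run an IN() against the field table with the component IDs."""
    component_ids = components_using_field_type(field_type)
    if not component_ids:
        return []
    placeholders = ", ".join("?" for _ in component_ids)
    cursor = conn.execute(
        "SELECT entity_id, instance_id FROM xb_field_data "
        f"WHERE component IN ({placeholders}) ORDER BY entity_id, delta",
        component_ids,
    )
    return cursor.fetchall()


# Hypothetical field data: one row per component instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xb_field_data "
    "(entity_id INTEGER, delta INTEGER, instance_id TEXT, component TEXT)"
)
conn.executemany(
    "INSERT INTO xb_field_data VALUES (?, ?, ?, ?)",
    [
        (1, 0, "a1", "example:hero"),
        (1, 1, "b2", "example:quote"),
        (2, 0, "c3", "example:teaser"),
    ],
)

link_usage = instances_using_field_type(conn, "link")
string_usage = instances_using_field_type(conn, "string")
```

Only the last step touches the database. The first two run over config in application code, which should be acceptable for an occasional audit but is not meant for hot paths.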

  • 🇺🇸United States effulgentsia

    XB could add the field_union config entities, and reference them from the component config entities, when the field_union module is enabled, and not do that when that module is not enabled

    I realized after writing this that the reason this is true is because the component config entities have essentially all of the information that would need to be in the field_union config entity, which is what would let us generate the field_union config entity at any time that we needed to.

    Given that, I wonder if making the field_union module a hard dependency wouldn't actually be that bad. It would let us take a bunch of stuff out of the component config entity and instead move that information to the field_union config entity.

  • 🇪🇸Spain Carlitus

    Hi, I just wanted to comment on this:

    I hope people don't actually put hundreds of component instances on a node. That wouldn't lead to a good authoring experience. A good design system, even if it includes some small components (atoms), should also include larger components (molecules, organisms) that content authors work with, so that content authors aren't in practice putting every atom one-by-one on a page.

    We use a lot of low-level elements on a page, so we have a lot of freedom. Yes, we also have some elements like molecules, but we usually do that with templates that we can then modify. And these templates are groups of single atoms.

    And a landing page, for example, can be very, very, very long.

    So actually the hundreds of components that @Wim Leers was talking about can be real in a lot of cases.

  • 🇬🇧United Kingdom catch

    It would let us take a bunch of stuff out of the component config entity and instead move that information to the field_union config entity.

    In general that sounds like a great idea, it would mean the component config entity only needs to hold the things that are unique to the concept.

    I had wondered whether we actually need two config entity types at all - i.e. could field union directly use a component config entity type instead of using its own, or could XB directly use field unions without an extra entity type in-between, but... no idea whether that would even be desirable even if it's possible.

  • 🇫🇮Finland lauriii

    @catch thinks it's needed to support #3462219: [META] Support alternative renderings of prop data added for the 'full' view mode such as for search indexing or newsletters, but @lauriii thinks those use cases could be solved in better ways by XB directly.

    Are there use cases outside of the use cases that have been already identified that this would help with? So far I've not heard compelling reasons to do this. I'm pretty strongly -1 to supporting the workflow proposed in 🌱 "[META] Support alternative renderings of prop data added for the 'full' view mode such as for search indexing or newsletters" (Active) out of the box because, at least as I understand it, it would result in an extremely convoluted UX. As a fairly technical user, I'm having a hard time imagining working with several lists of components and figuring out myself how to build anything meaningful out of it. I believe there should be an easier way of managing the challenges related to search indexing.

    Unless we can define what value we get out of this, I don't see why we would prioritize working on this over other work, especially because it sounds like there's risk associated with introducing this. If I understand correctly, this also means that there's additional complexity going forward because we support multiple data models out of the box (one for config, one for content).

    I hope people don't actually put hundreds of component instances on a node.

    I checked a sample front page I had built on another page builder and I had 129 components/elements on that page. This was still a fairly simple page using a mix of atoms and organisms. I would have to do some more research to define what a reasonable upper bound would be, but it seems that the architecture should definitely be able to handle at least some hundreds of components.

    Change the field type from single-item to multi-valued. Each item would be for a single component instance.

    If we move from a JSON structure to a multi-valued field (where each delta represents a component), how do we handle scenarios where there are overrides on top of the desktop breakpoint (e.g. for the mobile breakpoint)? This is requirement #20 from the original product requirements for Experience Builder.

    An example scenario would be that I want a larger margin and padding on desktop than on mobile, and I want to display a block recommending installing an app on mobile.

    How would this be represented in this data model? Would this still all be stored in the single list or would we have separate lists for different breakpoints?
