JSON-based data storage proposal for component-based page building

Issue created by @effulgentsia
Comment about 1 year ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
This sounds great, amazing work
I also think there's an upgrade path for the existing layouts - as you mention regions are slots, blocks are components

Nice one
Comment about 1 year ago →
🇺🇸United States kevinquillen
I agree with most of this as we are taking a closer look at Layout Builder, but on this point:

There's also the question of should paragraphs or layout builder inline blocks even be their own entities to begin with. They were modeled that way in order to make them fieldable, and fields can only be added to entities, but modeling them as entities comes with its own baggage (performance issues from them having their own CRUD hooks, highly nested database transactions when saving, having them show up as separate JSON:API resources from the node or needing special code to avoid that, etc.).

Yes, I think they should. If they aren't entities, you lose a lot of the power Drupal provides (adding fields, alteration, forms, events, controlling form display, etc.), which is a major pain point when using any other kind of Drupal page builder that doesn't have that. Whether they should be blocks is another question. Should Blocks evolve toward something like https://www.drupal.org/project/storage?
Comment about 1 year ago →
🇬🇧United Kingdom catch
@kevinquillen take a look at https://www.drupal.org/project/field_union, it allows a 'compound field' to be defined, which is a multi-column field made out of individual field types (say text and boolean, or text and media, or user entity reference and term entity reference). The issue summary doesn't say it explicitly, but I'd assume the idea would be to do that here too (except stored in JSON).

@effulgentsia

For the "components" field, instead of hashing the entire JSON object, hash each component instance separately, so that what's stored in the components field is a JSON object containing each component instance ID mapped to a hash of the JSON-encoded props for just that component instance. This way, if a page has 50 component instances, and a given revision only changes the props on one of them, the hash/lookup for each of the unchanged 49 gets to be reused.

I'm wondering - is it really that bad if the component field is stored as a large JSON string?

The problem with paragraphs and blocks is less the amount of data in the database, but that when you save the entity form, you have to then update potentially 40-60 other entities all of which have their own field tables and hook invocations etc.

If it's a single JSON field, then it's just one blob in the database and none of that happens. It also means that on load, it's a single lookup instead of... however we'd get the content back out of the db from the hashes.

It does mean more duplication of data but that doesn't seem any worse than field values are now. I guess it would mean the one table would get huge, as opposed to lots of smaller field tables that get big but are split up.

Also wondering how this would handle say a text being deleted in a revision, but then re-added with the same content two revisions later?
Comment about 1 year ago →
🇺🇸United States kevinquillen
it allows a 'compound field' to be defined, which is a multi-column field made out of individual field types (say text and boolean, or text and media, or user entity reference and term entity reference). The issue summary doesn't say it explicitly, but I'd assume the idea would be to do that here too (except stored in JSON).

Okay sure. At first it sounded like scrapping Drupal for JSON only definitions, storing only JSON/HTML/strings, and losing the in between power of Drupal APIs. I'm fine with that. Then they are not coupled directly to blocks, paragraphs, or similar approaches. They could be used directly in LB and free of those constraints, able to do what OP proposes.
Comment about 1 year ago →
🇬🇧United Kingdom longwave UK
a short hash (for example, the first 8 characters of a hash)

This feels like premature optimisation, and truncating a hash makes me think about the birthday paradox; unless someone can convince me otherwise with both collision statistics and performance numbers, I think we should use a full SHA1 or equivalent here.
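
For a rough sense of the collision statistics asked for here, the standard birthday approximation can be evaluated directly. This is only a sketch: it assumes the 8 characters are hex digits (32 bits) and that hash values are uniformly distributed, and the item counts are illustrative rather than measured from any site.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Birthday approximation: chance of at least one collision among
    num_items uniformly random hashes that are hash_bits bits wide."""
    space = 2.0 ** hash_bits
    # 1 - exp(-n(n-1) / (2N)); expm1 keeps precision for tiny probabilities.
    return -math.expm1(-num_items * (num_items - 1) / (2.0 * space))


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        short = collision_probability(n, 32)    # 8 hex characters (assumed)
        full = collision_probability(n, 160)    # full SHA-1
        print(f"{n:>9} distinct values: 32-bit {short:.6f}, 160-bit {full:.3e}")
```

Under these assumptions, a 32-bit prefix crosses 50% collision odds at roughly 77,000 distinct stored values, while a full 160-bit hash stays negligible at any realistic table size.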
Comment about 1 year ago →
🇺🇸United States effulgentsia
is it really that bad if the component field is stored as a large JSON string?

We have customers at Acquia with >50GB databases, mostly filled with duplicate (or nearly duplicate) paragraph revisions.

Part of the problem is that every revision stores records for every language, even though any given revision only edits one translation. That's not unique to paragraphs, it's just that with paragraphs you have so many more entities than you do with nodes alone.

For example, say for one node you have:

50 components

50 revisions per language

10 languages

400 bytes per component

That's 10MB per node. So if your site has 1,000 nodes like that, that would be 10GB.

Whereas, if we can reduce that last one to 20 bytes per component: an 8 byte short hash plus 12 bytes of overhead (quotes, commas, plus also needing to store the component instance ID), that can reduce the storage by 20x.

I think we should use a full SHA1 or equivalent here.

Yeah, given a 12 byte (or who knows, maybe even more) overhead, the difference between an 8 character short hash and a 28 character (base64 160bit) SHA1 hash ends up being only a 2x difference (20 vs 40 total bytes), so perhaps that's worth it if we think SHA1 has sufficient collision resistance for this purpose.

However, the other thing we can do is use a dynamic length hash. For example, start with just the first 8 characters, check the lookup table, and if that key isn't already there, or is there and has the same JSON value that we want to store, then great just use that. But if the lookup table already contains a different JSON for that same short hash, then for the new JSON, try again with more characters (e.g., increase from 8 to 16 and repeat, or maybe just go immediately all the way to a full SHA1 or an even longer and more collision-resistant hash). That way, >99% of the data can be stored with just the short hash, and only occasional data ends up needing the longer hash.
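
The dynamic-length idea above can be sketched as follows. This is an illustration under assumptions, not a proposed implementation: the lookup table is modeled as a plain dict, the function name `store_props` and the length steps (8, 16, 40 hex characters) are hypothetical, and SHA-1 in hex is used only because it was the hash mentioned in this thread.

```python
import hashlib
import json


def store_props(lookup: dict[str, str], props: dict,
                lengths: tuple[int, ...] = (8, 16, 40)) -> str:
    """Store JSON-encoded props in the lookup table and return the
    shortest hash prefix that identifies them unambiguously."""
    # Canonical encoding so identical props always hash identically.
    encoded = json.dumps(props, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha1(encoded.encode("utf-8")).hexdigest()  # 40 hex chars
    for length in lengths:
        key = digest[:length]
        existing = lookup.get(key)
        if existing is None:
            lookup[key] = encoded  # key is free: claim it
            return key
        if existing == encoded:
            return key             # same JSON already stored: reuse it
        # A different JSON owns this prefix: retry with more characters.
    raise RuntimeError("collision on the full-length hash")


if __name__ == "__main__":
    table: dict[str, str] = {}
    print(store_props(table, {"heading": "Hello", "level": 2}))
    print(store_props(table, {"heading": "Hello", "level": 2}))  # same key again
```

Because the prefixes are always tried shortest first, storing the same props again resolves to the same key, so unchanged component instances keep pointing at one row.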
Comment about 1 year ago →
🇬🇧United Kingdom catch
Part of the problem is that every revision stores records for every language, even though any given revision only edits one translation at a time. That's not unique to paragraphs, it's just that with paragraphs you have so many more entities than you do with nodes alone.

This is the same 'problem' for normal relational SQL field storage like the body field though. Do those sites really need to keep every single revision that was ever created on their site? Maybe they could prune old ones, and have smaller databases.

What I'm concerned about here is:
1. What is the effect of the hash lookups on entity load and save performance?
2. What is the potential for data integrity issues (hashes pointing to nowhere etc.)?

For example say I want to delete some old revisions, how am I going to know when I can delete the content related to a hash, do I start having to query all the remaining revisions to check?
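
To make the deletion and integrity questions concrete, here is a minimal sketch of the checks they imply. It assumes, hypothetically, that each revision stores a mapping of component instance ID to hash and that the props live in a separate lookup table keyed by hash; the function names are illustrative only.

```python
def orphaned_hashes(lookup: dict[str, str],
                    remaining_revisions: list[dict[str, str]]) -> set[str]:
    """Lookup keys that no remaining revision references (safe to delete)."""
    still_used: set[str] = set()
    for components in remaining_revisions:
        still_used.update(components.values())
    return set(lookup) - still_used


def dangling_references(lookup: dict[str, str],
                        remaining_revisions: list[dict[str, str]]) -> set[str]:
    """Hashes a revision points at that are missing from the lookup table."""
    referenced: set[str] = set()
    for components in remaining_revisions:
        referenced.update(components.values())
    return referenced - set(lookup)
```

Without a maintained reference count, both checks have to visit every remaining revision, which is the cost this question points at.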
Comment about 1 year ago →
🇺🇸United States effulgentsia
That's 10MB per node

That was off by 10x. I only scaled by number of languages instead of by the square of that. But with languages, they affect both the number of records per revision, and also the number of total revisions (since often when you update the content for one language, you want to translate that update to your other languages, which requires a new revision for each language). I updated #8 accordingly.
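
The arithmetic behind this correction can be written out as below. This is one reading of the explanation above: the example list multiplies out to 10MB when languages are counted once, and to 100MB when they scale both the records per revision and the total number of revisions. The function name and the `languages_squared` flag are illustrative.

```python
COMPONENTS = 50
REVISIONS_PER_LANGUAGE = 50
LANGUAGES = 10


def node_bytes(bytes_per_component: int, languages_squared: bool) -> int:
    """Estimated storage for one node across all of its revisions."""
    total_revisions = REVISIONS_PER_LANGUAGE * LANGUAGES
    # Each revision stores a record for every language, not just the edited one.
    records_per_revision = LANGUAGES if languages_squared else 1
    return (COMPONENTS * total_revisions * records_per_revision
            * bytes_per_component)


if __name__ == "__main__":
    linear = node_bytes(400, languages_squared=False)
    squared = node_bytes(400, languages_squared=True)
    hashed = node_bytes(20, languages_squared=True)
    print(f"languages counted once:  {linear / 1e6:.0f} MB per node")
    print(f"languages counted twice: {squared / 1e6:.0f} MB per node")
    print(f"with 20-byte references: {hashed / 1e6:.0f} MB per node "
          f"({squared // hashed}x smaller)")
```

The 20x reduction from replacing 400 bytes with a 20-byte reference is the same under either reading, since it only changes the per-component factor.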
Comment about 1 year ago →
🇺🇸United States effulgentsia
This is the same 'problem' for normal relational SQL field storage like the body field though.

In theory, yes, having a 20KB value for your body field would require the same storage and scale with languages and revisions the same way as having 50 paragraphs with 400 bytes each. In practice, I've never had a coworker tell me about a huge database they encountered for a customer where the cause of that size was many nodes/revisions of large values for the body field (or any other text field). It's almost always due to paragraphs. Either in general there are a lot more sites building big pages out of many paragraphs rather than a long body field, or perhaps it's just more common for that kind of page building to iterate through more revisions than longform articles do.
Comment about 1 year ago →
🇬🇧United Kingdom catch
fwiw I have a site with some very long body fields (not page building, just... a lot of text), it doesn't have content translation, and the database is around 5gb.

Either way, I think we should be exploring #2770417: Revision garbage collection to deal with the 'lots of revisions' issue - a solution that will work for any field storage, rather than pre-optimizing one field storage for dozens of revisions in dozens of languages, at least unless there are cast-iron answers to the questions in #9 and anything else that could come up with sharing content between revisions like that.
Comment about 1 year ago →
🇬🇧United Kingdom longwave UK
An alternative is perhaps some kind of reverse-delta mechanism, where we store the full version of the current/latest revision and only the diffs that are required to generate all previous revisions. This will be slower, but given that looking at old revisions is a rare operation, maybe the tradeoff is worth it.
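
A minimal sketch of such a reverse-delta store is below. It assumes each revision is a flat mapping of component instance ID to props, and the class and function names are hypothetical; a real implementation would need a proper nested diff and persistent storage.

```python
def reverse_delta(newer: dict, older: dict) -> dict:
    """Describe how to turn `newer` back into `older`."""
    return {
        # Keys whose older value differs from, or is absent in, the newer one.
        "set": {k: v for k, v in older.items()
                if k not in newer or newer[k] != v},
        # Keys that only exist in the newer revision.
        "unset": [k for k in newer if k not in older],
    }


def apply_delta(doc: dict, delta: dict) -> dict:
    """Return a copy of `doc` with one reverse delta applied."""
    restored = {k: v for k, v in doc.items() if k not in delta["unset"]}
    restored.update(delta["set"])
    return restored


class ReverseDeltaHistory:
    """Keeps the latest revision in full and older ones as reverse deltas."""

    def __init__(self, initial: dict):
        self.latest = dict(initial)
        self.deltas: list[dict] = []  # deltas[i] turns revision i+1 into i

    def save(self, new_revision: dict) -> None:
        self.deltas.append(reverse_delta(new_revision, self.latest))
        self.latest = dict(new_revision)

    def load(self, revision_index: int) -> dict:
        doc = dict(self.latest)
        # Walk backwards from the latest revision to the requested one.
        for delta in reversed(self.deltas[revision_index:]):
            doc = apply_delta(doc, delta)
        return doc
```

Loading the latest revision applies no deltas at all, while loading revision 0 applies every stored delta, which matches the tradeoff described: cheap current reads, slower historical ones.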
Comment about 1 year ago →
🇬🇧United Kingdom alexpott 🇪🇺🌍
Discussed a bit with catch - especially the revision data issue. We both agreed that however we resolve the revision data issue (either compression or pruning) that it should be a separate and more generic solution than baking it into the component-based page building solution.
Comment about 1 year ago →
🇬🇧United Kingdom catch
I wrote up a rough comparison of the various approaches to page building including my understanding of this one (but before Alex B posted the full issue here), wasn't sure where to put it, so putting it here in a comment.

What are they all trying to do:

Content editors want to build content pages which are combinations of different components.

Site builders want to provide content builders with exactly the right amount of flexibility for the above - library of components that can be chosen from and content added to.

We (Drupal) want site builders to be able to do this in a way that has both good UX for them and content editors, but also is scalable and maintainable on small, medium and large/massive sites.

Paragraphs

Pre-layout builder. The default UX does not handle full page building, but layout_paragraphs makes it a full page building experience (without layout_builder), lb_paragraphs exists but with no usage.

Inheritor of 'field collection' model from Drupal 7 and earlier.

Components are 'paragraphs'
Each component type is an entity bundle with fields / view modes etc.
Page is built from multiple entities of different bundles via reference fields on the main entity.

Pros:
The library of components works for both site builders and content creators

Cons:
Doesn't have working integration with layout builder (unless lb_paragraphs does everything)
Increases the number of entities on a site by an order of magnitude, with severe performance and scalability implications for listing queries, search, and even page rendering to an extent.
Makes things difficult/impossible for workflow/translation publishing systems since there is not one 'entity' to deal with but arbitrary amounts.

Layout builder + content blocks

Block_content + layout builder (+ https://www.drupal.org/project/lb_plus)

Components are blocks
Each component type is a bundle with fields / view modes etc.
Page is built from multiple entities of different bundles via reference fields on the main entity, which can be combined with configured fields on the entity.

Pros:
Fundamental architecture is already in core
The library of components works for both site builders and content creators
A good solution where you have a small number of custom blocks in a 'landing page' situation.
Most improvements we would make here are also improvements for layout builder in general.

Cons:
Newer than paragraphs and was not developed into a full solution in core.
LB Plus is new and usage is much lower than layout_paragraphs (this is not necessarily a con if we change that).
Pretty much the same problems as paragraphs if used at scale for lots of content.

Layout builder + Field union:

A field union is defined in a config entity, and then stored as a multi-column Drupal field on the main entity, not a reference field.

Field union can include arbitrary combinations of any field type Drupal supports, so same 'visible' data options as paragraphs or block content, but without the nested entity in between. https://www.drupal.org/project/field_union

Each 'component' would need to be a different field on the entity bundle (i.e. one 'image gallery' field, one 'hero' field). Layout builder would then allow you to add multiple values of those fields, and the layout specifies the field and delta in presentation. You could then interleave e.g. image galleries and heroes by rendering only one delta at a time in the layout.

Pros:
No extra entities, so storage is all together and more efficient. This fixes the scalability, performance, and translation/workflow issues of both paragraphs and block content.
Is an extension rather than replacement of layout builder + content blocks, at least for a long time.

Cons:
Needs finishing off, and layout builder integration on top after that.
Can this handle components that need multi-value fields like image galleries? Although if it can't, a mixture of field_union and block content bundles would still work. If block content entities are only used for image gallery-type situations and field_union for everything else that fixes all the paragraph drawbacks except possibly extra things for site builders to know about.

Layout builder + Alex B's JSON combo field idea (working title):

This is very similar to field_union with two differences.

You would define 'field unions' as config entities in their own right - i.e. the component-level combination of fields, a bit like bundle/field config but without the entity type.

Then the field itself is a multi-value field with an 'allowed field unions' config. Content editors can then select the 'field union type' for each delta. So delta 1 hero, delta 2 image gallery, delta 3 text field.

This would rely on JSON storage (only for the field, not the entity), because it relies on storing arbitrary combinations of field data for a single field, which we can't do in SQL.

In layout builder, you could probably have identical UX to the field union idea, the difference is in the site builder UX and the underlying storage - both config and entity storage. Without concrete implementations it's hard to know whether one would be 'better' than the other, they might just end up 'different'.

Pros:
All the same pros as field_union.
Might be more flexible in some situations (?)
Specifically, it probably makes it much easier to handle the situation where you want a multi-value field within a component, like an image gallery, since JSON (and blocks or paragraphs) could handle that but multi-column SQL not so much.

Cons:
Doesn't exist in any form.
Relies on JSON field storage which also doesn't quite exist yet at least in a battle-tested way.

Appendix 1:

Some sites want part of the content to be in an admin defined layout (not just a template to be modified, but centrally updated), and some to be flexible.

- Fixed header with specific fields
- Flexible layout underneath with some arbitrary amounts of components on different pages

(think news where there always has to be an image and standfirst in the same place, or ecommerce where there always has to be a price and add to cart widget).

You could also want header -> flexible -> footer.

Some sites are using layout builder for the content header and paragraphs for the flexible content.

This could probably be done with 'stacked layouts' - i.e. define different layouts that sit one after the other fulfilling the different roles. Would be a new 'mode' compared to the current default/override.
Comment about 1 year ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
I did some investigation of the builder.io data model as part of decoupled LB, here's a sample payload for a page for those interested.

I also looked at Puck, here's the payload it uses for https://demo.puckeditor.com/

Renamed to .txt file for d.o upload.

For me the mix of styling info in the data is a bit odd, I could reason about that being stored as presets (like manage display) so you could change them across the site, having to edit each page to change styling feels like the same issue as HTML as a storage format.
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
@kevinquillen in #4: agreed with @catch's feedback in #5.

@catch in #5:

RE: field_union: the use case here is more complex than what field_union can provide though. For any particular SDC (say one with 5 props) we could create a union field to allow storing a value for each of those 5 SDC props. But … we need to store values for not just a single component, but for an arbitrary number of components. This too could still be achieved using a union field-esque approach, if the set of components is identical for each entity:node:article, each entity:taxonomy:term:tags or each whatever. But that is also not true: the intent is to allow content creators to perhaps start from the same initial set of components for each such entity, but to (optionally) allow adding additional arbitrary components.
RE: JSON size: I tentatively agree. The duplication of data results in simplicity. Plus, denormalizing it all is impossible for the reasons cited in my previous point. So AFAICT the concern is more duplication of data across revisions of the same entity. That is a much smaller problem already, and can in part be addressed by reducing revision history (a good thing to do even before all this) and by efficiently storing JSON, presumably by compressing it? Can we then still do queries into the JSON? https://mariadb.com/kb/en/storage-engine-independent-column-compression/ is sparse with information.
RE: JSON size: note that in this JSON we'd only be storing values for static props — props whose values are fetched dynamically from the host entity or a referenced entity through some expression/token would not end up in that JSON. The purpose is to encourage as much as possible to use Drupal's rich structured data model, and hence to only use static props for unstructured data. That'd make it smaller than at first sight. Second, unstructured data already is less likely to be repetitive, so less compressible.

@longwave in #7: I want to agree with you and instinctively I do, but #8 makes a pretty strong argument in favor of this. Curious about your thoughts!

@larowlan in #16:

100% agreed that when storing this information, this representation is odd. But if this format is preferred by the JS UI, then that's fine as long as we can transform to/from.
I think you're implying that in these examples, 100% of the style information is present 100% of the time. We definitely wouldn't want to do that. We should only store overrides compared to the baseline component, and even the JS UI-facing representation must always indicate what is an override vs not, because the UI would need to convey that to the user too.
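
The "store only overrides" point can be sketched like this. The structure is an assumption for illustration: style values are modeled as a flat dict, and the function names and the `overridden` flag are hypothetical, not part of any existing API.

```python
def extract_overrides(baseline: dict, instance: dict) -> dict:
    """Keep only the values that differ from the baseline component."""
    return {k: v for k, v in instance.items()
            if k not in baseline or baseline[k] != v}


def resolve_for_ui(baseline: dict, overrides: dict) -> dict[str, dict]:
    """Merge baseline and overrides, flagging which values are overrides
    so the UI can convey that to the user."""
    merged = {k: {"value": v, "overridden": False}
              for k, v in baseline.items()}
    for k, v in overrides.items():
        merged[k] = {"value": v, "overridden": True}
    return merged


if __name__ == "__main__":
    baseline = {"color": "blue", "padding": "1rem"}
    stored = extract_overrides(baseline, {"color": "red", "padding": "1rem"})
    print(stored)                          # only the changed value is stored
    print(resolve_for_ui(baseline, stored))
```

With this split, changing the baseline updates every instance that did not override the value, which avoids the "edit each page to change styling" problem raised in #16.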
Comment about 1 year ago →
🇬🇧United Kingdom catch
This too could still be achieved using a union field-esque approach, if the set of components is identical for each entity:node:article, each entity:taxonomy:term:tags or each whatever. But that is also not true: the intent is to allow content creators to perhaps start from the same initial set of components for each such entity, but to (optionally) allow adding additional arbitrary components.

I am thinking of something like (regardless of storage model):

There is an 'allowed components' setting on bundles, which then might be further refined by permissions (+ field access etc.).

When you are working on any individual entity, you have a choice of the allowed components.

--

For the JSON approach, it's just a UI setting probably (in config somewhere) which limits the available options.

For field_union, each 'allowed component' would have to be a different field union field on the bundle. But obviously you can add new fields over time for new component types you want to allow. Some of those fields would be completely empty on a lot of individual entities, but that's already the case with lots of fields on lots of entities on lots of Drupal sites now.

Where field_union would struggle is if a component is an image gallery with an arbitrary number of images, in that case I think it would have to go back to a reference to an intermediate entity (block?) with an image gallery bundle, because each image is its own delta-within-a-delta then.

@longwave in #7: I want to agree with you and instinctively I do, but #8 makes a pretty strong argument in favor of this. Curious about your thoughts!

@alexpott and @catch in #14: yay, glad to read this issue doesn't need to concern itself with revisions, which I agree is a pre-existing and broader challenge :)

#14 is agreeing with #7, so if you're agreeing with me and @alexpott, you are also agreeing with @longwave, but then you say #8 is a strong argument - who are you agreeing/disagreeing with here?
Comment about 1 year ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
100% of the style information is present 100% of the time. We definitely wouldn't want to do that. We should only store overrides compared to the baseline component,

Yep, exactly this - glad we agree 👌
Comment about 1 year ago →
🇺🇸United States kevinquillen
https://www.drupal.org/about/core/blog/working-toward-an-experience-builder

https://dri.es/evolving-drupal-layout-builder-to-an-experience-builder
Comment about 1 year ago →
🇺🇸United States jmolivas El Centro, CA
Since field_union was mentioned, how about custom_field?
https://www.drupal.org/project/custom_field

Has anyone taken a look at Sanity's specification for Portable Text?
https://github.com/portabletext/portabletext
Comment about 1 year ago →
🇺🇸United States apmsooner
Where field_union would also struggle a bit is if a component is something like an image gallery with an arbitrary number of images, in that case I think it would have to go back to a reference to an intermediate entity (block?) with an image gallery bundle, because each image is its own delta-within-a-delta then.

I think field_union is very similar to https://www.drupal.org/project/custom_field in this sense. A field_union field, like any other field, can be multi-valued. An image gallery would just be a multi-valued field_union consisting of an image field and whatever other sub-fields are set. The values would all be stored in a single field table with deltas, just like any other field. Think of a multi-value address field as an example: same storage mechanism, just with flexible field types to choose the columns. Furthermore, I think we need to stray from thinking everything needs to be an entity. In the case of an image gallery, for instance, the normal consensus is probably to build a component referencing a bunch of media items, because media has been so heavily promoted as the go-to for images now. Image galleries, though, are typically specific to the content they're attached to, so the images will likely never be reused. Having additional entity wrappers in this case is unnecessary IMO and adds to the bloat.

Paragraphs are getting unfairly thrown under the bus here I think. The entity_reference_revisions module is the underlying module, which also supports blocks and other entity types. This is ultimately a content modeling decision: do I need per-field-instance revisions, or can I live with a normal entity reference field that updates globally wherever it's used? Paragraphs just currently has the best UX IMO for multi-value page components, which is why I think most people use it. And compared to other CMSs, if you need per-instance reference revisions, paragraphs is a unique offering from Drupal that those systems don't typically support. Now, additional nested paragraphs within paragraphs is where it starts getting really ridiculous.

As for overall bloat, and entity_reference_revisions aside, the field API IMO is the ultimate source of bloat. Storing a simple string value for instance as its own table has a lot of extra unnecessary data on top of it. Any field, just like entity_reference_revisions instances, has a corresponding revision field that gets inserts on parent entity save whether something changed or not. Field API has always provided less experienced developers a way to model content types in a very flexible way, so I'm not knocking it, but the number of joins involved in querying all these field tables would never be an ideal way of storing data on enterprise-level sites with a lot of content. Savvy developers will make custom field types that optimize data storage, but until now, with custom_field and field_union coming soon, there has been no standardized way of doing this without a lot of custom code.

Ultimately I'm happy to see this discussion happening, because actual content vs. settings has typically been mixed together as fields on entities, and this JSON-based data storage idea can be used in a lot of areas aside from just layout builder.
Comment about 1 year ago →
🇬🇧United Kingdom catch
An image gallery would just be a multi-valued field_union consisting of an image field and whatever other sub-fields are set.

That works if you have one image gallery with multiple images. What happens if you want two image galleries?
Comment about 1 year ago →
🇷🇴Romania amateescu
That works if you have one image gallery with multiple images. What happens if you want two image galleries?

At the storage level, the first gallery can use deltas 0 to 3 (contains 4 items), and the second gallery can use deltas 4 to 6 (contains 3 items). Then at the `TypedData::get()` level we can assign each delta item to its specific component (gallery).
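
The delta-range mapping described here can be sketched as follows. This does not use Drupal's TypedData API; it is a plain illustration that assumes a hypothetical per-component item count is stored somewhere alongside the flat, delta-ordered list of field items.

```python
def group_deltas(items: list, component_sizes: dict[str, int]) -> dict[str, list]:
    """Slice a flat, delta-ordered item list into per-component lists.

    component_sizes maps each component (e.g. a gallery) to its number of
    items, in the order the components occupy the delta range.
    """
    if sum(component_sizes.values()) != len(items):
        raise ValueError("component sizes do not cover the stored deltas")
    grouped: dict[str, list] = {}
    start = 0
    for component_id, size in component_sizes.items():
        grouped[component_id] = items[start:start + size]
        start += size
    return grouped


if __name__ == "__main__":
    stored = [f"image-{delta}" for delta in range(7)]  # deltas 0 to 6
    print(group_deltas(stored, {"gallery_1": 4, "gallery_2": 3}))
```

Note that adding an item to the first gallery shifts every later delta, so the size map and the item list have to be updated together in the same save.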
Comment about 1 year ago →
🇺🇸United States apmsooner
That works if you have one image gallery with multiple images. What happens if you want two image galleries?

I would treat these as 2 separate field_union|custom_field types with likely the same setup. 2 image galleries attached to a single entity doesn't sound like a typical thing though.
Comment about 1 year ago →
🇺🇸United States apmsooner
Question @catch: technically, could fields be completely detached from an entity? I don't even know if that's possible, but if you could have named field instances with no reference to any particular entity, then I "suppose" it's something reusable that could be referenced anywhere. Maybe that's what this D7 module was trying to do: https://www.drupal.org/project/field_reference
Comment about 1 year ago →
🇬🇧United Kingdom AaronMcHale Edinburgh, Scotland
the field API IMO is the ultimate source of bloat. Storing a simple string value for instance as its own table has a lot of extra unnecessary data on top of it.

That's something which has always bothered me, why is it that base fields can quite happily just be columns on the same table, but each field created through the field UI must have its own table. (I guess that's why they're called base field heh)
Comment about 1 year ago →
🇺🇸United States ctrladel North Carolina, USA
Overall in favor of moving layout builder field storage from serialized PHP to JSON. Being able to easily read field values and query them would be great. Since the layout builder value is already exported to YAML for display mode configs, it seems like it should be an easy jump to store it as JSON in the database.

An additional approach to the ones mentioned in #15:

Plain old plugin blocks
In this approach, each component is represented as a plugin block in layout builder, using the block's form to author the block within layout builder, and is then stored in the layout builder field along with the block's plugin ID. This avoids the revisioning and proliferation of entities that come with paragraphs or content blocks. The downside is that the block and authoring form need to be defined in code (currently) with Form API. Since we have to use Form API, we lose access to an interface to configure the form/display, and you can't use many contrib modules that only provide fields and not form elements.

I've had pretty good success with this approach using an in-house SDC-like schema definition system and generating forms from the schema using RJSF.

That works if you have one image gallery with multiple images. What happens if you want two image galleries?

With a component-first approach you'd define two components: an Image Gallery component to store multiple images, and a Two Gallery component with two slots, where each slot would contain an Image Gallery.

Other thoughts:
Perhaps a bit off topic for this issue, but since the discussion is happening here, I feel like there could be room for a whole new component entity type in addition to config and content entities. The defining characteristics of a component entity would be:

- built-in support for props, aka fields
- built-in support for slots, aka drop zones/sections/regions that contain other components
- built-in support to template/layout props and slots
- provides a way to restrict slots to only allow certain components
- is stored as a singular JSON blob
- either one or both of:
  - can gracefully handle when a saved JSON blob does not match the entity's prop/slot structure
  - leverages a working entity usage system so update hooks can reliably update JSON blobs when the component structure changes
- isn't a standalone authorable entity but is instead assumed to live within a config or content entity
- provides a way to transform prop values before passing them to a template (a reference field stores the entity ID, but in the component you always need a value from the entity, like the title)
Comment about 1 year ago →
🇺🇸United States mglaman WI, USA
I think this is a great idea. I had one piece of feedback for the implementation, then just some notes as I went through all of the comments.

Represent the component instance tree as two single-valued fields

The proposal uses JSON-based storage and stores the structure and data in two fields. However, I'd prefer one field with two properties. In the end it's the same: two columns on the table. But we can mark the properties as internal, with a computed property that is the combined value for building the page. This makes it API-friendly for frontend consumption.

Regarding storage optimization, it should be split out to not add complexity to this storage. Whether old revisions are garbage collected or their contents dumped and encrypted to disk and loaded if ever needed (I worked on this before), it should be solved elsewhere. We aren't making the existing problem worse by following normal paradigms. This means we can more easily solve the problem for the impacted areas. I'm on the same stance as Wim in #17.

Relies on JSON field storage which also doesn't quite exist yet at least in a battle-tested way.

RE #15, you mean battle-tested in Drupal core? Because folks have been using JSON fields with Drupal in production for some time.

Paragraphs are getting unfairly thrown under the bus here I think.

RE #22. I don't think it's Paragraphs specifically. It's just the module that most easily highlights storage complications with revisions.

That's something which has always bothered me, why is it that base fields can quite happily just be columns on the same table, but each field created through the field UI must have its own table. (I guess that's why they're called base field heh)

That's because they're bundle fields and may or may not be used. I think it'd be worth opening an issue (or maybe one exists) proposing that any bundle field with a cardinality of one can go into the shared table storage rather than a dedicated table, because base fields already move to a dedicated table when cardinality is > 1.
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
Let's get a PoC of this proposal working. Without the hashing parts for now, for the reasons @catch outlined in #9 and reiterated in #12, as suggested by @alexpott in #14. Let's tackle #2770417: Revision garbage collection and/or compression → instead.

I am building a PoC that does not use anything like field_union as @catch suggested in #15 (search for the line in his comment). There's no "field union config entities" at all, not tied to the component level or anywhere else. (See my comment #17. To expand on the last sentence of that bullet: the same component can have its 3 props A, B and C fulfilled by either 3 statically defined values that would need one field union, or 2 + 1 dynamically retrieved one, where there's 3 possible such pairs. That's far too many field union config entities to be practical. @catch did touch on this in #18, where he mentions 👍)

@catch in #18: RE: my confusing comment 🫣 — all I meant to say was that it's a problem we'd have to deal with eventually, but I don't think we should deal with it here.

@apmsooner in #22:

That's simple to say as long as one doesn't need to worry about translatability, configurability or extensibility.
This is true, and is why I think field_union is a mismatch for Drupal's Experience Builder (announcement links in #20).

@amateescu in #24: I like how simple that sounds! 😊 But then I realized that tracking the deltas of each of these fields would become fairly complex/brittle to keep "routing"/mapping individual deltas to distinct components (destinations), unless we'd be able to use arbitrary keys as deltas (then we could just put UUIDs in there). But \Drupal\Core\Field\FieldItemList::setValue() enforces the use of numeric keys.

#27: because base fields are guaranteed to exist for all bundles of an entity type, right down to the code level. Fields defined in configuration may or may not continue to exist, and may or may not exist for all bundles of an entity type. There is only a single table per field for all bundles of an entity type though — so there could've been even more!

@ctrlADel in #28: — something along these lines is what we're thinking of doing for https://www.drupal.org/project/experience_builder → , details TBD. But before we get to that point, we need to get the even lower level data flow and storage fundamentals figured out first. This issue covers a subset of those fundamentals. That's also where what we discussed at DrupalCon Portland and which you opened 📌 Introduce an example set of representative SDC components; transition from "component list" to "component tree" Fixed for is very valuable: a set of more complex components that nest components to ensure that we get those things working too.

@mglaman in #29: I had the same thought! 😄👍
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
Comment about 1 year ago →
gabesullice
Another proposal for composed fields…

Let's conceptually separate tree structure from field composition. They're really separate concerns. But if we fix field composition, the tree problem becomes simpler. Let's set aside tree structure for now and focus on composed fields.

The difficulty with many of the existing solutions is that they have to retrofit themselves into the field module's framework as it exists, which has no concept or affordances for field composition.

A core-driven approach doesn't need to suffer from that missing abstraction.

The field module could do all the necessary bookkeeping to make it possible and it wouldn't be very complex with regard to the database (all the real complexity would be in the field_ui module).

Let's say you want to compose an image field and a formatted text field to create a "Picture" field that stores an image and a caption (imagine media wouldn't work because the caption is page-specific and different than alt text).

The table schema for each subfield would be indistinguishable from a typical image or text field, but instead of naming their tables node_field_image and node_field_caption, they'd be named node_field_picture_image and node_field_picture_caption, respectively, where field_picture is the composed field's machine name and image and caption are the subfield machine names.

If the field module and entity system were aware of this composition, all they would need to do is JOIN against both tables and then weave the field deltas together in PHP, instead of JOINing against a single table per field as they do today.

We would need a new ComposedFieldItemList class to access a ComposedFieldItem with magic methods to get at $node->field_picture->image->uri or $node->field_picture->caption->value.
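As a rough illustration of "weaving the field deltas together", here is a small Python sketch (the proposal itself talks about PHP; this is only a language-neutral model). The row shapes, the `weave` helper and the sample values are hypothetical.

```python
from typing import Any

# Hypothetical rows as they might come back from the two subfield tables.
image_rows = [
    {"entity_id": 1, "delta": 0, "uri": "public://a.jpg"},
    {"entity_id": 1, "delta": 1, "uri": "public://b.jpg"},
]
caption_rows = [
    {"entity_id": 1, "delta": 1, "value": "Second caption"},
    {"entity_id": 1, "delta": 0, "value": "First caption"},
]


def weave(subfields: dict[str, list[dict[str, Any]]]) -> list[dict[str, dict[str, Any]]]:
    """Combine rows from several subfield tables into composed items, ordered by delta."""
    by_delta: dict[int, dict[str, dict[str, Any]]] = {}
    for name, rows in subfields.items():
        for row in rows:
            # Keep only the value columns; the key columns are shared bookkeeping.
            values = {k: v for k, v in row.items() if k not in ("entity_id", "delta")}
            by_delta.setdefault(row["delta"], {})[name] = values
    return [by_delta[delta] for delta in sorted(by_delta)]


pictures = weave({"image": image_rows, "caption": caption_rows})
# pictures[0]["image"]["uri"] mirrors $node->field_picture->image->uri for delta 0.
```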

That missing bookkeeping in the field module is essentially what Paragraphs is using the entity table schema for and why entity reference revisions is necessary. Or why field_union is forced to weave columns together in the field table schema.

Since storage is easy, the trick would be making an intuitive UI for building composed fields.

As for the field ecosystem, the problem with this proposal is that field formatters and widgets expect to receive their own field item lists.

To solve that, existing field types would need to become "composable" by implementing FieldItemWidget and FieldItemFormatters (note the added "Item").

How does that simplify tree structures?

It would mean effulgentsia's components don't need to store any field data.

The layout part of his proposal would remain the same, but the components would point to field deltas or block configuration.

If you're a site builder, you'd add fields for every allowable page element. But that would now include composed fields, so if you wanted a "Picture" component (with image and caption) you'd add a composed field for it.

Behind the scenes, every picture would be stored in the two picture subfield tables explained above and the JSON field would only store a reference to the field machine name and delta in its component column.

The picture field order would match the order of a left depth first search of the tree to match the top to bottom order of how they'd appear in HTML.

Field data doesn't need to be nested, only layouts do.

As for SDC, this still works nicely because one could define a component for the composed field and reuse that field across many bundles. Heck, across entity types if the composed field is configured separately à la Paragraph types.
Comment about 1 year ago →
gabesullice
Behind the scenes, every picture would be stored in the two picture subfield tables explained above and the JSON field would only store a reference to the field machine name and delta in its component column.

The picture field order would match the order of a left depth first search of the tree to match the top to bottom order of how they'd appear in HTML.

To expand on this, imagine that we have two components represented as two fields. A Picture element composed of an image and text field for a caption. And a Text element for narrative content.

If they're placed into a nested layout in an arbitrary order, it might look something like this if the Text elements are represented in blue and the Picture elements are represented in red.

That layout forms a tree like so:

In the database, there would be 4 tables:

node_field_text

node_field_picture_image

node_field_picture_caption

node_json_tree

The node_json_tree table would store a JSON object representing the tree from the diagram above. A node in that tree would have a reference like field_text:1 or field_picture:3. Note that it would not have a reference like node_field_picture_caption.

When loading the node, the system would JOIN from the node table to all of the tables above, but the field system would put rows from the composed field tables "together" based on their delta.

The render system would recursively descend the tree stored in a row in the JSON table, effectively calling $node->get('field_text')->get(1) or $node->get('field_picture')->get(3) depending on whether the tree element is red or blue or numbered 1 or 3.
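A minimal Python sketch of that recursive descent, assuming a tree whose nodes are either sections with `children` or leaves with a `reference` such as `field_text:1`. The node shape, the `fields` lookup and the sample values are assumptions; the attached JSON may well differ.

```python
from typing import Any

# Hypothetical loaded field values, keyed by field name and then by delta.
fields = {
    "field_text": {1: "Intro text", 2: "Closing text"},
    "field_picture": {1: {"image": "a.jpg", "caption": "A"}},
}

# Hypothetical layout tree as it might be stored in the JSON column.
tree = {
    "type": "section",
    "children": [
        {"type": "field", "reference": "field_text:1"},
        {
            "type": "section",
            "children": [
                {"type": "field", "reference": "field_picture:1"},
                {"type": "field", "reference": "field_text:2"},
            ],
        },
    ],
}


def render_order(node: dict[str, Any]) -> list[Any]:
    """Depth-first, left-to-right walk that resolves each reference to its field item."""
    if node["type"] == "field":
        field_name, delta = node["reference"].rsplit(":", 1)
        return [fields[field_name][int(delta)]]
    items: list[Any] = []
    for child in node["children"]:
        items.extend(render_order(child))
    return items


rendered = render_order(tree)
```

The resulting list is in the same top-to-bottom order the items would appear in the HTML.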
Comment about 1 year ago →
gabesullice
I realized that those diagrams conflate the layout containers with fields. So here are two more technically "truthful" diagrams. I also attached a JSON representation of the tree field data → .

The box layout:
Comment about 1 year ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
FWIW @gabesullice this

The table schema for each subfield would be indistinguishable from a typical image or text field, but instead of naming their tables node_field_image and node_field_caption, they'd be named node_field_picture_image and node_field_picture_caption, respectively, where field_picture is the composed field's machine name and image and caption are the subfield machine names.

If the field module and entity system were aware of this composition all they would need to do is JOIN against both tables and then weave the field deltas together in PHP, instead of JOINing against a single table per field as they do today.

We would need a new ComposedFieldItemList class to access a ComposedFieldItem with magic methods to get at $node->field_picture->image->uri or $node->field_picture->caption->value.

Is basically what field_union module is doing - see the project page for examples interacting with the API or use the current branch in https://www.drupal.org/project/field_union/issues/3011353 📌 Add UI for adding unions (field_union_ui.module) Needs work for a working-ish UI.

https://www.youtube.com/watch?v=ZorUUuC8oxc goes in to the data model in more detail (For both field union and custom field).
Comment about 1 year ago →
🇳🇿New Zealand john pitcairn
The above would be great for straightforward views access to component fields, since they're just regular fields on the entity. That's been a real content model pain point with inline blocks in layouts (ie where you can't just use a simple field).

It would presumably make content moderation "just work" for multi-field components too.

How might we handle a "make this reusable" action, when the editor wants to reuse a field-based component they added to this layout on another layout somewhere? That's a very common requirement.
Comment about 1 year ago →
gabesullice
FWIW @gabesullice this … Is basically what field_union module is doing

🤦 I'm sorry. That was faulty memory. I swear I saw a module that was reading field schemas and weaving them together into a combined schema, but I should have done more fact checking.

Regardless, that's awesome news!
Comment about 1 year ago →
🇬🇧United Kingdom catch
OK, I was struggling to see the difference between @gabesullice's proposal and field_union, but I'm not familiar enough with the internals of field_union to know how close it actually was.

So for me the benefit of the @gabesullice/field_union approach:

1. It prevents having to re-implement logic like 📌 Prevent modules from being uninstalled if they provide field types used in an Experience Builder field Fixed .

2. Easier to expose field values to views via existing entity/field integration.

3. field_union has pure structured data use-cases outside of page building - say a track name + duration field on a recording content type.

Disadvantages:

1. If field values are updated programmatically, what would happen if they go out of sync with the JSON layout?
2. Similarly what happens if you delete delta 2 of a multi-value field, would you need to re-delta the other fields and their position in the layout, or would it leave a gap?
Comment about 1 year ago →
gabesullice
1. If field values are updated programmatically, what would happen if they go out of sync with the JSON layout?

I'm not quite sure what you meant by out of sync, but I think you were asking what happens if the field items are reordered?

If so, that feels like it should be treated as an undefined behavior. It reminds me of deleting an inline block programmatically. The result is a message that says "Broken or missing block". Not sure what to do here but it feels like a data integrity issue.

2. Similarly what happens if you delete delta 2 of a multi-value field, would you need to re-delta the other fields and their position in the layout, or would it leave a gap?

I think there should be a field validation constraint that prevents dangling references on the JSON field. It may need to be an entity validation constraint to have access to both fields.
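To sketch what such a constraint would check, here is a small Python model (not a Drupal validation constraint). The tree shape, the `field_deltas` argument and both helper names are hypothetical.

```python
from typing import Any


def collect_references(node: dict[str, Any]) -> list[str]:
    """Gather every field reference in the layout tree, in document order."""
    if node["type"] == "field":
        return [node["reference"]]
    references: list[str] = []
    for child in node.get("children", []):
        references.extend(collect_references(child))
    return references


def dangling_references(tree: dict[str, Any], field_deltas: dict[str, set[int]]) -> list[str]:
    """Return references in the layout that point at no existing field item."""
    dangling = []
    for reference in collect_references(tree):
        name, delta = reference.rsplit(":", 1)
        if int(delta) not in field_deltas.get(name, set()):
            dangling.append(reference)
    return dangling


layout = {
    "type": "section",
    "children": [
        {"type": "field", "reference": "field_text:0"},
        {"type": "field", "reference": "field_text:2"},
        {"type": "field", "reference": "field_picture:0"},
    ],
}
# field_text has deltas 0 and 1; field_picture does not exist on this entity.
violations = dangling_references(layout, {"field_text": {0, 1}})
```

Because the check needs both the layout and the other fields' deltas, it illustrates why an entity-level constraint may be the better fit.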

If a field is referenced by the layout field, maybe that field should be uneditable outside of the visual editor. That way the layout field can be computed whenever the editor's form is submitted.

3. As discussed above, it will struggle to support multi-value field within a delta (e.g. two image slider components on one page, where the image field itself is multiple as well as the slider component being multiple) - although this is fairly extreme edge case and there might be workarounds.

I think this might be a faulty example/requirement.

You wouldn't create an image slider field union, you'd create an image slide (no "r") field and place them in a custom section type for a slider. E.g. to support two sliders, you'd have two slider sections. The first would be filled with deltas 0-3 and the second would be filled with deltas 4-7.

Maybe we need a pair of rules like:

Field unions are for cardinality-locked data. I.e. every subfield has the same cardinality and the same number of items.

Inline block content is for enforced field sets that are not cardinality-locked. I.e. if you have a component with a single image field but unlimited links then use a custom block type.

Note that rule 2 would not preclude you from using field unions in the custom block type. E.g. if you wanted those links to store a link and an icon choice, the block type would have an image field and a "Download link" field (that is a union of a link and an options_text subfields).
Comment about 1 year ago →
🇬🇧United Kingdom catch
Most of #39 makes lots of sense to me, but not sure about this:

If a field is referenced by the layout field, maybe that field should be uneditable outside of the visual editor. That way the layout field can be computed whenever the editor's form is submitted.

This would preclude REST/JSON:API and could be tricky for translations or other programmatic updates.
Comment about 1 year ago →
gabesullice
This would preclude REST/JSON:API and could be tricky for translations or other programmatic updates.

Warning ⚠️ this is a more free flowing brainstormy thought than a well reasoned suggestion/answer…

Perhaps the design tension you're feeling comes from wanting to directly update the storage model (field by field) vs. updating a single mutable representation (one big blob that combines everything into a nested object)

It's easy to update the blob and keep everything in sync, but inconvenient to store/query. Conversely, it's inconvenient and error prone to update a bunch of interdependent field tables, but easy to store & query.

What if we separate how it's updated from how it's stored?

We could invent a format that can be mutated and saved as a whole? On save, some smart system would translate the changes and apply them to the database using entities and fields as the underlying storage model.

It's not so crazy, that is exactly what would be happening in a visual editor using AJAX or a hidden form input. I.e. the editor would be transferring a "representation" in the application/x-www-form-urlencoded format, then the smart form system would read it and store the changes by making updates via the field API.

Could we come up with another mutable abstract representation that can be updated programmatically? Maybe a JSON doc or a PHP object with mutation methods similar to the Layout Builder's appendComponent. Then when the representation is saved, its state would be translated to field storage?

So this:

If a field is referenced by the layout field, maybe that field should be uneditable outside of the visual editor. That way the layout field can be computed whenever the editor's form is submitted.

…would become: "uneditable via the field API (there be dragons) but editable via the mutable representation"

And then the visual editor and REST et al. could use the same mutable abstraction and it would be the only safe/supported way to update data used in a layout?

tl;dr; wonky idea: use a single object to make changes, pass the updated object to a smart system, let that system atomically apply the changes using the field API so things can't drift out of sync.
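A minimal Python sketch of that idea, assuming an in-memory dict stands in for field storage. The class `LayoutDocument`, its `append_component` and `save` methods, and the storage shape are all hypothetical; a real implementation would go through the field API inside a database transaction.

```python
import copy
from typing import Any


class LayoutDocument:
    """Hypothetical mutable representation: the only supported way to edit layout data."""

    def __init__(self, storage: dict[str, Any]):
        self._storage = storage
        # Work on a private copy so nothing is visible until save() succeeds.
        self._draft = copy.deepcopy(storage)

    def append_component(self, field_name: str, value: Any) -> str:
        """Add a field item and its layout reference together, so they cannot drift."""
        items = self._draft["fields"].setdefault(field_name, [])
        items.append(value)
        reference = f"{field_name}:{len(items) - 1}"
        self._draft["layout"].append(reference)
        return reference

    def save(self) -> None:
        # Validate the whole draft before writing anything.
        for reference in self._draft["layout"]:
            name, delta = reference.rsplit(":", 1)
            if int(delta) >= len(self._draft["fields"].get(name, [])):
                raise ValueError(f"Dangling reference: {reference}")
        # Swap in the validated draft as one step (stand-in for a transaction).
        self._storage.clear()
        self._storage.update(copy.deepcopy(self._draft))


storage = {"fields": {"field_text": ["Hello"]}, "layout": ["field_text:0"]}
doc = LayoutDocument(storage)
ref = doc.append_component("field_text", "World")
layout_before_save = list(storage["layout"])  # still the old state
doc.save()
```

The visual editor and a REST client could both drive the same object, which is the point of the suggestion.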
Comment about 1 year ago →
🇬🇧United Kingdom joachim
We don't necessarily need to use JSON.

We could have a component field whose storage consists of a table for each data type it contains:

- node__field_component__text
- node__field_component__image
- node__field_component__url

The value columns of each of those tables then look like those of normal standalone fields.

However, we add columns for parentage and sibling delta, so:

- entity_id, bundle, delta - same as standalone fields
- parentage - indicates where in the tree of components this item is. No idea if we need a single parent here or a tree hierarchy string
- sibling delta - position of this item relative to its siblings
Comment about 1 year ago →
gabesullice
This would preclude REST/JSON:API and could be tricky for translations or other programmatic updates.

wonky idea: use a single object to make changes, pass the updated object to a smart system, let that system atomically apply the changes using the field API so things can't drift out of sync.

Making the implicit explicit:

I'm suggesting that a new PATCHable resource at something like /node/article/{uuid}/layout would be the jsonapi module's way of supporting edits to fields referenced by the layout field, so that the "smart system" would do the job of actually keeping all the fields in sync. That's opposed to forcing JSON:API clients to update 5 fields and the layout field correctly.

@joachim, what if each of the sections are a field item row using a similar number scheme? The value column of the sections field table would still be JSON though. Something like:

Section field item row 2:

[
  { "type": "field", "reference": "field_text:1" },
  { "type": "field", "reference": "sections:3" }
]

Section field item row 3:

[
  { "type": "field", "reference": "field_picture:1" },
  { "type": "field", "reference": "field_text:2" },
  { "type": "field", "reference": "field_picture:2" }
]
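
For illustration, here is a small Python sketch of how the two section rows above could be expanded into a flat render order. The `sections` dict and the `flatten` helper are hypothetical; the entries themselves are copied from the example, including the `sections:3` reference.

```python
# The two section rows from the example, keyed by their delta.
sections = {
    2: [
        {"type": "field", "reference": "field_text:1"},
        {"type": "field", "reference": "sections:3"},
    ],
    3: [
        {"type": "field", "reference": "field_picture:1"},
        {"type": "field", "reference": "field_text:2"},
        {"type": "field", "reference": "field_picture:2"},
    ],
}


def flatten(delta: int, seen: frozenset = frozenset()) -> list[str]:
    """Expand a section row into field references, following nested section rows."""
    if delta in seen:
        raise ValueError(f"Section {delta} contains itself")
    order: list[str] = []
    for entry in sections[delta]:
        name, ref_delta = entry["reference"].rsplit(":", 1)
        if name == "sections":
            # A reference to another section row: descend into it.
            order.extend(flatten(int(ref_delta), seen | {delta}))
        else:
            order.append(entry["reference"])
    return order


render_order = flatten(2)
```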

Comment about 1 year ago →
🇺🇸United States kevinquillen
One note, and I believe this may be addressed by Gabe's comments and examples, but the ordering of data is important (in storage). This is one sticky area with Layout Builder component data at the moment, particularly in regards to headless or decoupled sites.
Comment about 1 year ago →
🇩🇪Germany Anybody Porta Westfalica
Great proposal here and fascinating to see the nice ideas behind.

As I saw some similarities here to the concept and the evolution that, for example, Layout Paragraphs went through from the earlier Entity Reference Layout → module, might it make sense to contact @justin2pin → , the creator of Layout Paragraphs → , for some feedback and his learnings?

I think that might be really valuable feedback, if he'd be interested. BTW he's a really friendly and engaged part of the Drupal community. Still I don't know if this conflicts with ATEN's ongoing work at https://www.drupal.org/project/mercury_editor → but I don't think that's the way he thinks.

BTW it might also be interesting to ask maintainers of other Paragraphs alternatives for feedback and their learnings regarding the data structure at the right time?

https://www.drupal.org/project/component_builder →

https://www.drupal.org/project/paragraphs_blokkli →

https://www.drupal.org/project/bricks →

https://www.drupal.org/docs/extending-drupal/contributed-modules/compari... →

Ignore my comment, if you already did that or don't think it's helpful to ask them.

PS: This is huge for Drupal!
Comment about 1 year ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
Copying in comms from slack thread

Some thoughts from me

config entity that represents a component

for each component the site builder can say for each prop one of the following;

Option 1 Fixed value (set in config entity)

Option 2 Dynamic value taken from base entity - mapped in config entity, site builder picks widget and formatter

Option 3 Arbitrary value use this widget, use this formatter, mapped in config entity *may not be needed, read on

In the edit form for content editors the first option above doesn't show

Option two (Dynamic) shows but stores in content entity field, with reference in tree

Option three stores arbitrary typed data in the tree *may not be needed, read on

With this model, the config entity has a config dependency on any fields it maps to, as well as widget/formatters (module dependencies) so we can manage dependencies on fields etc

It also has an entity-type and bundle implied by way of the fields it uses. This limits available things in the component list, can't use a component that relies on fields that aren't there
The config entity can also record 'slots I work in' ala layout builder restrictions
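A rough Python model of such a component config entity, to show how the options, the dependencies and the bundle restriction could hang together. Everything here is hypothetical: the class and method names, the `"fixed"`/`"dynamic"` labels for Options 1 and 2, and the use of `field.field.{entity_type}.{bundle}.{field_name}` as the dependency name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PropSource:
    kind: str                         # "fixed" (Option 1) or "dynamic" (Option 2)
    value: Optional[str] = None       # used when kind == "fixed"
    field_name: Optional[str] = None  # used when kind == "dynamic"


@dataclass
class ComponentConfig:
    id: str
    entity_type: str
    bundle: str
    props: dict[str, PropSource]

    def field_dependencies(self) -> list[str]:
        """Config names of the fields this component maps to."""
        return sorted(
            f"field.field.{self.entity_type}.{self.bundle}.{source.field_name}"
            for source in self.props.values()
            if source.kind == "dynamic"
        )

    def editor_props(self) -> list[str]:
        """Props shown on the content editor's form: fixed values are hidden."""
        return [name for name, source in self.props.items() if source.kind != "fixed"]

    def usable_on(self, available_fields: set[str]) -> bool:
        """A component is only offered where every field it relies on exists."""
        return all(
            source.field_name in available_fields
            for source in self.props.values()
            if source.kind == "dynamic"
        )


hero = ComponentConfig(
    id="hero_title",
    entity_type="node",
    bundle="article",
    props={
        "heading": PropSource("dynamic", field_name="field_subtitle"),
        "image": PropSource("dynamic", field_name="field_image"),
        "theme": PropSource("fixed", value="dark"),
    },
)
```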

I think that also lets you change the mappings later - want the 'hero title' component to use a different field for the image, change it in the content entity, no need to update the tree

I discussed this with @catch and he asked do we even need to distinguish between Option 2 and Option 3 - can we not just dynamically add fields to the content entity and remove option 3 - always storing values in fields on the entity.

@catch and I brainstormed that a bit - the example I used was this

let's say you have a tabs component, in the second tab, it uses a two-col layout, the left col has a slider in it, in that you have cards, each card has an image

In that scenario I guess you could pick out common shapes and make fields on the fly for them - eg, in that scenario it might spot the following:

image field on the card => media ref

title field on the card, title field on the tabs, title field on columns => string field

body field on column two, teaser field on card => text field

So that would be 3 fields and then the props could say 'tab 2 title is delta 3 in the title field', 'card image 3 is delta 3 in the media field'
This would mean you need to limit components per bundle - but that is something we'd probably want anyway - LB restrictions is already per bundle. So if you said 'this component can be used on articles' we would analyse the mapping and go off and make sure the required fields exist. (note this sounds a lot like what @gabesullice is proposing in #44)
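The "pick out common shapes and make fields on the fly" step could be modeled roughly like this in Python. The prop names, the shape labels and the generated field names are all made up for illustration.

```python
# Hypothetical (prop, shape) pairs discovered in one component tree.
props = [
    ("tabs.title", "string"),
    ("column.title", "string"),
    ("card.title", "string"),
    ("card.image", "media_ref"),
    ("column_two.body", "text"),
    ("card.teaser", "text"),
]


def allocate(prop_shapes: list[tuple[str, str]]) -> dict[str, tuple[str, int]]:
    """Map each prop to (field name, delta), sharing one field per shape."""
    next_delta: dict[str, int] = {}
    mapping: dict[str, tuple[str, int]] = {}
    for prop, shape in prop_shapes:
        delta = next_delta.get(shape, 0)
        mapping[prop] = (f"field_{shape}", delta)
        next_delta[shape] = delta + 1
    return mapping


mapping = allocate(props)
fields_needed = sorted({field_name for field_name, _ in mapping.values()})
```

Six props collapse into three fields here, and each prop only records which delta of which field it reads from.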

My concern is I don't think we should be asking content-editors to 'pick the field you want to populate this prop from'. I think site-builders should make that decision ahead of time. If there's multiple options, they should make multiple components (config entities). I think asking content-editors to think about data-structures and fields is like asking them to think about formatters like LB field blocks, which is something we avoid on client projects. ***

I think content editors should just be filling in fields using widgets the site-builder decided make sense for a prop in an SDC, no different to what they do now for LB/Content forms/Paragraphs

With this in place the data model could store references to the config entity in place of 'type:field' in @gabesullice's example in #44

Then if we extend this further and add a 'type' to the component config entity - we could in theory write an adapter for inline blocks (layout builder) and paragraphs. These could be like the source plugins we have on media-types. The adapters could be a plugin and we could put a single plugin collection on the component config entity. We could interact with this to trigger rendering and prop evaluation.

*** @lauriii indicated he'd like to do user-research to confirm this
Comment about 1 year ago →
🇳🇿New Zealand john pitcairn
My concern is I don't think we should be asking content-editors to 'pick the field you want to populate this prop from'. I think site-builders should make that decision ahead of time.

Strongly agree with this.

Not sure about needing to define multiple components if there is a choice of prop fields. I can see that becoming tedious.

I can see a need for allowing the site builder to configure a component that would present, say, a select menu to the editor for choosing between a couple of predefined source fields to populate a prop. The site builder would be responsible for providing good option labels and help text to support the editor.

Not MVP, maybe something for contrib if the extension point is there.
Comment about 1 year ago →
🇸🇪Sweden johnwebdev
I’m curious how Field union or similar would/could work when business wants a multi value for a field in the item. E.g add tags to the Picture image example
Comment about 1 year ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
Re #49 it won't
Comment about 1 year ago →
🇬🇧United Kingdom joachim
When we're talking about config entities that represent components, are these fixed sets of fields, or are they templates?

Because I can imagine a use case where a content editor wants, say, a hero image component, or a slider component, but then wants to tweak it later on, maybe by adding a subheading or extra text.

If we consider the config entities to be templates which create the collection of subfields but then lose their connection, that sort of thing becomes possible.

To expand on what I was saying about using field-style tables instead of JSON, this is the sort of structures I had in mind.

Suppose our components field has these elements:

- Layout options
- [ Hero
  - Layout options
  - Title
  - Subheading
  - Hero image ]
- text
- [ Accordion
  - Layout options
  - [ Slide
    - Layout options
    - slide image 1
    - link 1 ]
  - [ Slide
    - Layout options
    - image 2
    - link 2 ]
  - [ Slide
    - Layout options
    - image 3
    - link 3 ] ]

Each set of siblings always has a layout field as its first item. That holds the layout information as JSON.

This field at this point would have the following tables storing its data:

Tables:

- node__field_component__image
- node__field_component__text
- node__field_component__link
- node__field_component__layout

If subfields of other types were added later, more tables would be created. E.g. a date field.

The table content for the text and image tables would look like this:

Table content:

- node__field_component__text

bundle, deleted, entity_id, revision_id, langcode, component_parents, component_delta, value
article, 0, 1, 1, en, 1, 1, Title
article, 0, 1, 1, en, 1, 2, Subheading

- node__field_component__image

bundle, deleted, entity_id, revision_id, langcode, component_parents, component_delta, target_id
article, 0, 1, 1, en, 1, 3, Hero image
article, 0, 1, 1, en, 3:1, 1, slide image 1
article, 0, 1, 1, en, 3:2, 1, slide image 2
article, 0, 1, 1, en, 3:3, 1, slide image 3

So all the images, all the text, etc in the component are in the same table. The `component_parents` column says where each image is in the tree and component_delta says where it is in relation to its siblings.
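As a sketch of how the tree could be read back from those tables, here is a small Python model that groups rows from several tables by `component_parents` and orders siblings by `component_delta`. The helper name is hypothetical, and the image rows use a `value` key in place of `target_id` to keep the example short.

```python
from collections import defaultdict

# Rows reduced to the columns that matter for ordering (sample data from above).
text_rows = [
    {"component_parents": "1", "component_delta": 2, "value": "Subheading"},
    {"component_parents": "1", "component_delta": 1, "value": "Title"},
]
image_rows = [
    {"component_parents": "1", "component_delta": 3, "value": "Hero image"},
    {"component_parents": "3:1", "component_delta": 1, "value": "slide image 1"},
    {"component_parents": "3:2", "component_delta": 1, "value": "slide image 2"},
    {"component_parents": "3:3", "component_delta": 1, "value": "slide image 3"},
]


def group_by_parent(*tables: list[dict]) -> dict[tuple[int, ...], list[str]]:
    """Merge rows from all type tables and order siblings under each parent path."""
    grouped: dict[tuple[int, ...], list[tuple[int, str]]] = defaultdict(list)
    for rows in tables:
        for row in rows:
            path = tuple(int(part) for part in row["component_parents"].split(":"))
            grouped[path].append((row["component_delta"], row["value"]))
    return {
        path: [value for _, value in sorted(items, key=lambda item: item[0])]
        for path, items in sorted(grouped.items())
    }


tree = group_by_parent(text_rows, image_rows)
```

Siblings of different types (Title, Subheading, Hero image) end up interleaved correctly even though they live in different tables.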
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
#32: Yay, @gabesullice is in the house! 😄

Let's conceptually separate tree structure from field composition.

We've been doing that already. With "field composition" you AFAICT are referring to the Field Union-esque functionality that has been referred to above.

As mentioned in #30, I built a PoC that doesn't use any of that. The PoC was since finished — see https://git.drupalcode.org/project/experience_builder/-/merge_requests/15

It would mean effulgentsia's components don't need to store any field data.

AFAICT this would make things more complex, not less. For a single entity that uses 10 components (SDCs), with 4 required props per component (i.e. 40 props to populate), and with 50% of those populated by structured data on the host entity (20 props populated) and 50% by statically assigned values (i.e. stored in "fields" not part of the host entity type/bundle), that would mean 20 field tables to join. To gain … what exactly? 😅
It'd also mean that answering the question "where does the data for this entity live?" would have an incredibly complex answer — debugging sure would be more complex.

Related negative consequence questions:

Wouldn't this result in an explosion of the number of: 1) FieldStorageConfig + FieldConfig config entities, 2) DB tables? I'm less concerned about the latter, but the former is IIRC a real scalability problem — sites with lots of config run into this, and IIRC today that's only sites with lots of languages (i.e. config translations), but in the world you describe it would presumably be any site with a large number of components, which would be all sites?
Wouldn't this encourage ad-hoc data model building, where the components used determine the data model? i.e. Wouldn't this undermine the Drupal-as-a-data-modeling-and-structured-data-repository principle? Because:

First, only the components dictated at the entity bundle level (e.g. all article nodes) would be guaranteed to be present, and could be exposed via Views/JSON:API/GraphQL/…: components placed per-entity (or overridden per-entity) would not be available, even if some of that component data might contain critical information.
Second, it'd mean the majority of field names would be meaningless, i.e. non-semantic, because names are generated based on the component name rather than consciously chosen by the site builder.

In my interpretation of @effulgentsia's proposal, components must make one of two choices for each prop:

either reuse existing structured data of the host entity (i.e. a prop of a field that the Site Builder made part of the article node type)
or NOT use structured data (i.e. assign a static value, that is not part of the data model)

Result: if you want your "component prop data" to show up in Views/JSON:API/GraphQL/…, you MUST add a new field (semantically named, as is the case today) to your article node type, and then map/assign/use (terminology TBD) the data in that field to your component prop ("reuse existing structured data").

the components would point to field deltas or block configuration.

Pointing to field deltas is a data consistency risk, because field deltas are required to be numerical, and even consecutive: they're enforced to be 0, 1, 2, …. Multiple different components (as well as multiple instances of the same component) may use the same field type to populate their props. Components may be reordered. And they may be removed. The risk: Any small mistake in logic could result in the deltas getting out of sync, and the field system will unforgivingly renumber the deltas to be 0…n 👻 Renumbering is the high-risk piece (i.e. the weak link) in this approach.
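The renumbering risk can be shown with a few lines of Python. This is a toy model, not Drupal code: a plain list stands in for a field item list, and the reference format `field_text:N` is borrowed from the examples earlier in the thread.

```python
from typing import Optional

field_text = ["Intro", "Quote", "Outro"]  # deltas 0, 1, 2
layout = ["field_text:0", "field_text:1", "field_text:2"]

# Remove the middle item the way a field item list would:
# the remaining items are renumbered to 0..n without gaps.
del field_text[1]


def resolve(reference: str, items: list[str]) -> Optional[str]:
    """Look up the item a layout reference points at, or None if it dangles."""
    _, delta = reference.rsplit(":", 1)
    index = int(delta)
    return items[index] if index < len(items) else None


resolved = [resolve(reference, field_text) for reference in layout]
```

After the deletion, `field_text:1` silently resolves to "Outro" instead of "Quote", and `field_text:2` no longer points at anything, unless every reference in the layout is rewritten in the same operation.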

#33 + #34: Aha, so you imagine one field table per component prop, so the same field type would not be reused across components. That answers a question I had for #32.

(I'm very confused what the meaning of the 4 digits is in each of the diagrams though: what are the blue 1/2/3 and the red 1/2? 🤔 I read and re-read the diagrams + comments multiple times, I still don't understand. 😅)

I see that @catch already raised equivalent concerns in #38. Great. I do not fully agree with the advantages listed though:

That's a tiny piece of logic that is trivial in the grand scheme of all things Experience Builder. You're right it'd be a benefit, but it'd be a tiny one.
I see how it's maybe technically easier, because it's the same low-level mechanisms. But as explained above: with fields named after components, plus multiple instances of the same component existing on one entity, plus few guarantees of which components will actually exist for a given bundle, plus Views heavily being centered on structured data and hence semantically named fields … I don't see how this will actually be an advantage?
Yep — Field Union is great for composing new fields for structured data. That you could then map/assign/use into component props! 🤓

@gabesullice in #41 + #43: None of that complexity would need to exist in the JSON-field-based approach above. The XB UI inevitably will be written in JS, and will inevitably use JSON data structures. There won't be a need for complex transformations like the one you describe.

@kevinquillen in #45: ordering is indeed crucial. The initial test for the storage layer (see below) explicitly tests that. I'm surprised that it's a "sticky area" for Layout Builder. Can you elaborate? 🙏

@Anybody in #46: I believe @effulgentsia looked at just about everything in drafting this proposal. Nonetheless, thanks for the suggestions; @lauriii also pointed to https://www.drupal.org/project/paragraphs_blokkli, for how it manages editor state.

@larowlan in #47: thanks for bringing it all together! You're right that this issue only covers the low-level data model + storage piece, and not how it could (and should for XB per @lauriii's product vision) interact with config entities.

Work has started on the first of the necessary config entities. See 🌱 [META] Configuration management: define needed config entity types Active for the overview (although time constraints have not yet allowed @lauriii or I to expand that issue summary based on @lauriii's product requirements), but especially for the discussion(s) on that issue 🤓

Option 2 and Option 3 - can we not just dynamically add fields to the content entity and remove option 3 - always storing values in fields on the entity.

See my arguments wrt consciously constructing a content type's (entity type bundle's) data model with semantical naming. BUT! That being said: YES, when modifying the default layout/template/component tree (name TBD!) for a content type (entity type bundle), the UX SHOULD nudge the Site Builder persona towards creating additional fields! And initial versions of infrastructure to power that already have been proven to work (see below, and ✨ [MR Only] Edit any component prop, powered by a new FieldForComponentSuggester service, which will power the JS UI Fixed ).

My concern is I don't think we should be asking content-editors to 'pick the field you want to populate this prop from'. I think site-builders should make that decision ahead of time.

💯Agreed! What I just wrote above is perfectly in line with that 😄

I think that for the Content Creator persona, if they're allowed to place additional components, then for those additional components' props, we'd only provide two choices: either reuse existing structured data, or statically assign values. In the latter case we indeed should not ask them which field type/widget they want; that'd be a far too clunky UX.

I think content editors should just be filling in fields using widgets the site-builder decided make sense for a prop in an SDC, no different to what they do now for LB/Content forms/Paragraphs

+1 — in the screencasts below, I'm just showing how far we can get today, without such a component config entity existing. In that video, it's being surfaced to the Content Creator persona, because the necessary config entity doesn't exist yet (but that's being worked on by Felix in 📌 "Developer-created components": mark which SDCs should be exposed in XB Fixed , although that issue is just laying the foundations, what we're talking about here would be one of a number of follow-ups). The choice of which field type to use for a prop should be decided by the Site Builder persona, so that the Content Creator persona never has to deal with it. The video just proves that we can provide a good UX, even for the Site Builder persona 😊

@John Pitcairn in #47: +1 for that being possible but not MVP, and quite possibly left to contrib (although that's @lauriii's call as product manager).

Obviously there are many possible ways to go about implementing this. I went with a direction based on talking to @effulgentsia. It's totally possible we'll need to change its course if we run into a show-stopping blocker. That's why I wrote "PoC" in #30. I'm not principally opposed to any of these proposals, but I see more challenges in them than in what the current PoC gets us, and the importance of data modeling is not diminished in any of them.

Most importantly: I also still think the trajectory XB is currently on is the simplest path to get something to work. As long as that remains true, I'll keep XB on the current trajectory. If the need for a significant pivot at the data storage level arises in the future: fine! Then we'll have a working starting point with test coverage, to refactor towards a different data storage approach. XB's most important feature must be the amazing UX. Getting to amazing UX with an imperfect data storage is better than the other way around.

https://git.drupalcode.org/project/experience_builder/-/merge_requests/15 works and looks like this:

I took it a step further in ✨ [MR Only] Edit any component prop, powered by a new FieldForComponentSuggester service, which will power the JS UI Fixed 's MR (https://git.drupalcode.org/project/experience_builder/-/merge_requests/20), which looks like this:

See the 2.5-minute screencast I shared in the #experience-builder Slack channel.

Also see the end-to-end test EndToEndDemoIntegrationTest.

Perhaps best of all: a GIF 🎥

— see ✨ [MR Only] Edit any component prop, powered by a new FieldForComponentSuggester service, which will power the JS UI Fixed (already being reviewed by @larowlan!)

I'm working on a diagram next, to visualize the mental map I constructed based on months of conversations with @lauriii, @effulgentsia and others. Think https://www.drupal.org/docs/8/api/render-api/the-drupal-8-render-pipeline, but for XB 🤓
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
Tagging the obvious 😅
Comment about 1 year ago →
🇬🇧United Kingdom catch
Wouldn't this result in an explosion of the number of: 1) FieldStorageConfig + FieldConfig config entities, 2) DB tables? I'm less concerned about the latter, but the former is IIRC a real scalability problem — sites with lots of config run into this,

At the moment it is primarily a scalability issue due to 🐛 Reduce the number of field blocks created for entities (possibly to zero) Fixed which would be fully fixed by ✨ Add the notion of a 'configured layout builder block' to solve a number of content-editor and performance pain points Active (no more block derivers). That's something we should be fixing as part of this project so a non-issue by the time any of this gets into core. Also field_union would (I think) hugely reduce the number of config entities (one config entity per field_union supplying multiple props vs. multiple fields on a block content entity or paragraph).

There are other places where many hundreds/thousands of fields become a problem - views data is the main one. But they are all smaller problems and ones I don't think would be made dramatically worse at all.

Yep — Field Union is great for composing new fields for structured data. That you could then map/assign/use into component props!

What I don't understand from #52 is:

1. Why is it necessary to support both mapping structured data and also the 'static' (which to me seems like it should actually be 'dynamic'??) mode of (semi-)arbitrary content directly in the tree?

Or in other words if XB is going to handle structured data from fields, including field_union, mapped to component props, values editable from the XB UI and etc. then any problems associated with that have to be solved anyway, and what does the static props storage version get us on top of that?

Is it only this?:

See my arguments wrt consciously constructing a content type's (entity type bundle's) data model with semantical naming.
Comment about 1 year ago →
🇺🇸United States kevinquillen
#52: in a decoupled scenario, you'd want the data read out over JSON:API or GraphQL in the order it was saved (vs. making the frontend figure that out over the wire). Decoupled/headless use is probably an edge case (but there has been some interest). Right now it seems that LB stores ordering as 'weight' and it's re-assembled on render (or perhaps that's a byproduct of it not being JSON:API ready). This was discovered here:

https://www.drupal.org/project/jsonapi_include_lb/issues/3374355 ✨ Rework module to work without a separate computed field Needs review

If new storage model eliminates that issue, all good.
Comment about 1 year ago →
🇷🇴Romania amateescu
@Wim Leers, re #52:

Renumbering is the high-risk piece (i.e. the weak link) in this approach.

Note that \Drupal\Core\Field\FieldItemList::setValue() is not set in stone, field types can specify a custom ItemList class where that method can be overridden to support the use-case of assigning various field deltas to different components. Also, the other place that does renumbering is \Drupal\Core\TypedData\Plugin\DataType\ItemList::rekey(), and that can be overridden too :)
Comment about 1 year ago →
🇬🇧United Kingdom catch
@Wim

Wouldn't this result in an explosion of the number of: 1) FieldStorageConfig + FieldConfig config entities, 2) DB tables? I'm less concerned about the latter, but the former is IIRC a real scalability problem — sites with lots of config run into this, and IIRC today that's only sites with lots of languages (i.e. config translations), but in the world you describe it would presumably be any site with a large number of components, which would be all sites?
Wouldn't this encourage ad-hoc data model building, where the components used determine the data model? i.e. Wouldn't this undermine the Drupal-as-a-data-modeling-and-structured-data-repository principle?

To add to #54, I don't think it would at all with a field_union approach.

For example, let's say you have a component which takes an image, title and entity reference - the idea is to feature content elsewhere on the site but using an image and title that is specific to the layout and wouldn't necessarily match what's on the referenced article at all.

This could be a 'featured entity card' field_union, with entity ref, image, alternate text, and title columns. The field name is semantic, the column names can just describe what they do 'content_target_id, media_target_id, alt_text, title' or whatever.

The actual field on the entity then is only field_featured_entity_card. No massive explosion of field config entities and field names at all.
Comment about 1 year ago →
🇺🇸United States effulgentsia
@Wim Leers asked me to chime in here. There are a lot of great comments on this issue that I haven't digested yet, so I'll save a longer comment until after I do, but what I want to highlight in this comment is that the essence of what I originally proposed is that there are two distinct pieces of data: tree and props. In this issue's summary, I originally proposed that as two fields. #29 suggested instead making it two props of a single field, which is what's currently implemented in https://www.drupal.org/project/experience_builder, which I'm okay with for now. There might be benefits to changing it at some point to two fields, but I do like the conceptual simplicity of keeping it all in one field, so I think it makes sense to keep it as one field until there are clear benefits to changing that.

tree needs to store a hierarchy of arbitrary depth. JSON seems like the obvious choice for that. #42/#51 point out that it's possible to store a hierarchy purely relationally (as Drupal already does for menus and taxonomy), but ugh, why would we choose that when all the databases that we care about support JSON now? As an analogy, it's technically possible to store a string as multiple records of a position integer and an integer representing a UTF-8 code, but why do that when strings are available as a data type?

Because tree stores the hierarchy, props doesn't have to. props is just a flat array of components and their prop values/expressions. https://www.drupal.org/project/experience_builder currently implements it as a single JSON blob. If at some point we want to split tree and props into two fields, instead of two properties of a single field, then props could be changed to a multi-valued field, where each item would be the JSON blob for just a single component.
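The split between tree and props can be sketched as follows. This is an illustration in Python under assumed shapes (the `uuid`/`children` keys and all component IDs are hypothetical), not the structure the module actually stores.

```python
# Hypothetical shape: the tree holds only the hierarchy, no prop values.
tree = {
    "uuid": "root",
    "children": [
        {"uuid": "hero-1", "children": []},
        {"uuid": "columns-1", "children": [
            {"uuid": "text-1", "children": []},
            {"uuid": "text-2", "children": []},
        ]},
    ],
}

# props is flat: one entry per component instance, keyed by its ID.
props = {
    "hero-1": {"heading": "Welcome"},
    "text-1": {"body": "Left column"},
    "text-2": {"body": "Right column"},
}

def walk(node, depth=0):
    """Yield (depth, uuid, props) for every component in document order."""
    yield depth, node["uuid"], props.get(node["uuid"], {})
    for child in node["children"]:
        yield from walk(child, depth + 1)

rendered = list(walk(tree))
```

Because props is flat, reordering or re-nesting components only touches the tree, and editing a prop value only touches one entry in props.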

Or if we want, instead of a JSON field, props could be a multi-valued field_union field. Except, I think there'd be a few potential problems with this:

Some SDC props are themselves objects/arrays. These would then need to be JSON sub-fields within the field_union. So field_union wouldn't entirely get us away from JSON, it would just push the need for JSON one level down. Pushing JSON one level down might have some benefits, so I'm not saying this necessarily makes field_union a bad suggestion, but I just don't think it actually solves all that much for this use case (it's a fabulous module for plenty of other use cases).

If you want to build a page (node) with 20 different SDCs (not just 20 instances of the same SDC), similar to building a page (node) with 20 different paragraph types, wouldn't you then end up with 20 different field_union fields (one for each SDC) added to that node type? While this can technically work (tree can make sure all the components get rendered in the right place even if their prop values are spread across items in different fields), it just seems like a lot more complexity and hassle than using JSON. However, please correct me if I'm misunderstanding how field_union works.

Like I said, I'll write up more later after I digest the rest of the discussion in this issue, but I wanted to at least touch on the above highlights in the meantime.
Comment about 1 year ago →
🇬🇧United Kingdom catch
If you want to build a page (node) with 20 different SDCs (not just 20 instances of the same SDC), similar to building a page (node) with 20 different paragraph types, wouldn't you then end up with 20 different field_union fields (one for each SDC) added to that node type?

The maximum is 20 field_union fields for 20 SDCs. However if SDCs take the same props (e.g. title, image, alt, description), then potentially one field_union field could be mapped to multiple SDCs, e.g.:

Hero -> title/description/image/alt/entity ref - field A
Formatted text -> text (+format) -> field B
Card -> title/description/image/alt/entity ref - field A

While this can technically work (tree can make sure all the components get rendered in the right place even if their prop values are spread across items in different fields), it just seems like a lot more complexity and hassle than using JSON.

If we take field_union out of the discussion temporarily, as I understand it, XB is already having to support 'map field API fields on the entity to XB component props'.

For example, an article node type. Let's say it has the following field API fields:

title (required)
lead image (media ref) (optional)
standfirst (text) (required)
tags (term refs, unlimited cardinality) (optional)

These need to be field API fields for views integration. Show a list of titles. Show a grid of cards with title + possibly image. Use 'similar by terms' module to build a 'related articles' view. All the things people do with field API fields now.

But in XB, you want these to be editable via the layout builder, and they could possibly be rendered via different SDCs (hero image with standfirst underneath on one article, image and standfirst side by side on another, standfirst in a big font with a background when there's no image etc.).

If XB already has to support this, then the field_union case is not extra complexity on top of that, it's already baked in.

So the choice then is not:

prop values in fields referenced from the JSON tree vs. prop values in the JSON tree

but instead:

prop values in fields referenced from the JSON tree AND prop values in the JSON tree vs. prop values in the JSON tree
Comment about 1 year ago →
🇫🇮Finland lauriii Finland
The maximum is 20 field_union fields for 20 SDCs. However if SDCs take the same props (e.g. title, image, alt, description), then potentially one field_union field could be mapped to multiple SDCs, e.g.:

Hero -> title/description/image/alt/entity ref - field A
Formatted text -> text (+format) -> field B
Card -> title/description/image/alt/entity ref - field A

If we merge components to shared database tables, it means that having content for Card components in the table could prevent making changes to structure of the Hero component. This goes against the desired UX/DX. Ease of use is a top priority and authoring components is in the critical path for our users. Therefore we want that experience to be as seamless as possible, and should avoid optimizing the data model in a way that leads into these types of sacrifices.
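A rough way to picture that coupling, sketched in Python. The component-to-field mapping follows the Hero/Formatted text/Card example quoted above; the row counts and the `blockers` helper are assumptions made up for illustration.

```python
# Mapping from the example: Hero and Card share field A.
component_to_field = {
    "hero": "field_a",
    "formatted_text": "field_b",
    "card": "field_a",
}

# Hypothetical: how many stored values each component currently has.
stored_rows = {"hero": 0, "formatted_text": 12, "card": 40}

def blockers(component):
    """Other components whose existing content shares this component's field,
    and would therefore stand in the way of changing its structure."""
    field = component_to_field[component]
    return sorted(
        other
        for other, other_field in component_to_field.items()
        if other != component and other_field == field and stored_rows[other] > 0
    )

hero_blockers = blockers("hero")            # Card content shares field A
text_blockers = blockers("formatted_text")  # nothing else uses field B
```

In this sketch the Hero has no content of its own yet, but its structure is still pinned by the Card's content.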
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
#59:

However if SDCs take the same props (e.g. title, image, alt, description), then potentially one field_union field could be mapped to multiple SDCs, e.g.:

This would increase the structured data model/semantical names concerns I raised in #52 (and which you previously responded to in #57).

If we take field_union out of the discussion temporarily, as I understand it, XB is already having to support 'map field API fields on the entity to XB component props'.

Correct, it must support that to support structured data.

These need to be field API fields for views integration.

Not all data must be available for listing in Views. Many component props do not make sense to expose in Views. For example: a "New Year's hero" component, which would have statically assigned values: a statically defined image, and a statically defined text (silly marketing nonsense like "Celebrate the new year with the Drupal community!"). That hero would be placed in e.g. the default layout of the article node type. It would never make sense to expose either the text or the image for the "New Year's hero" in Views.
That's the distinction I was capturing in #52, see this bit.

The text + image would be stored as static prop sources, without "backing fields", i.e. no FieldConfig, not "structured data", not part of the data model.

So, for the XB field's props column, that'd mean this data:

…
"new-year-hero-uuid-here": {
  "text": {
    "sourceType": "static:field_item:string",
    "value": "Celebrate the new year with the Drupal community!",
    "expression": "ℹ︎string␟value"
  },
  "image": {
    "sourceType": "static:field_item:image",
    "expression": "ℹ︎image␟{src↝entity␜␜entity:file␝uri␞0␟url,alt↠alt,width↠width,height↠height}",
    "value": {
      "alt": "A celebrating Druplicon!",
      "title": "",
      "target_id": "3",
      "width": 175,
      "height": 200
    }
  }
},
…
(that image-field-conjured-out-of-nowhere did create one new File entity that is referenced).
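As a rough illustration of the difference between a static prop source and one backed by a real field, here is a sketch in Python. Only the `static:` prefix of `sourceType` and the `value` key come from the data above; the `dynamic` source type, its `field` key and the `resolve_prop` helper are assumptions, and evaluating the `expression` string is deliberately left out.

```python
def resolve_prop(source, entity_fields):
    """Return the raw stored value for one prop source."""
    if source["sourceType"].startswith("static:"):
        # Static: the value lives inside the props JSON itself.
        return source["value"]
    if source["sourceType"] == "dynamic":
        # Assumed shape: the value lives in a real field on the entity.
        return entity_fields[source["field"]]
    raise ValueError(f"Unknown source type: {source['sourceType']}")

hero_props = {
    "text": {
        "sourceType": "static:field_item:string",
        "value": "Celebrate the new year with the Drupal community!",
    },
    # Hypothetical second prop, populated from the article's title field.
    "heading": {"sourceType": "dynamic", "field": "title"},
}
article_fields = {"title": "My article"}

resolved = {
    name: resolve_prop(source, article_fields)
    for name, source in hero_props.items()
}
```

The static value needs no FieldConfig and never appears in the entity's data model, which is exactly why it would not show up in Views without extra work.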

I suspect that the misunderstanding here is perhaps rooted in @catch assuming ALL data feeding into SDCs must or should come from "real fields", whereas that's expressly not the intent. In the example above, we've placed a component with values that would never make sense to surface in Views-powered listings.

Note also that in the above example, the exact same JSON blob can be used:

in the default layout for the article content type, and then no data at all would be stored per-article for this component instance (IOW: this component instance is meant to be static across all articles — see @larowlan's #16) — this is explicitly stated in 7. Content type templates in the product requirements

in a new-and-details-TBD "Experience Builder Component" or "Experience Builder Pattern" config entity that allows creating a component tree (or even single component) to reduce repetitive actions (see 29. Layout patterns and 32. Pattern Library for Page Builder and 4. Component creation for Theme Builder in the product requirements), and a consequence of these requirements is that you'd be able to de facto "reduce" the number of available props to populate by having specified a default (static) value for an SDC prop (issue: ✨ Allow specifying default props values when opting an SDC in for XB Fixed ) and then having "locked" it

In that latter case, there definitely is not an entity context in sight. The concept of a static prop source (final name TBD!) is hence something that can exist in the full spectrum of use cases for components:

from kinda-low-level defining what concrete components (as opposed to "pure SDCs") are available inside XB — with no content entity in sight

to per-bundle default layout ("content type templates" in the product requirements) — with the structure of a content entity in sight

all the way to per-entity overrides if the value is decorative rather than semantical/structural — with a concrete content entity in hand

As promised at the end of #52: I'm working on docs + diagrams. But this discussion seemed important enough to set that aside for a bit.

I *hope* there is more clarity now. I suspect that it's "just" the Experience Builder product requirements not having been internalized by all people interested in/excited about XB. Which totally makes sense: there are no expansive wireframes yet, and reading the product requirements spreadsheet is not really conducive to building a complete mental model. I know @lauriii is working hard on making detailed wireframes a reality 😊
Comment about 1 year ago →
🇬🇧United Kingdom catch
I suspect that the misunderstanding here is perhaps rooted in @catch assuming ALL data feeding into SDCs must or should come from "real fields", whereas that's expressly not the intent.

I'm not assuming this at all.

What I'm saying is that the current design assumes that we need to mix and match data from real fields with 'static props', and I am not sure there's a convincing reason to do that, compared to using 'real fields' for everything.

Not all data must be available for listing in Views. Many component props do not make sense to expose in Views.

They might not all, but there are a lot of grey areas like the image example above. And the problem is that it is extremely hard to predict this in advance, on at least two levels. Sites aren't built with all requirements decided in advance, so even people who understand the consequences of decisions may not be able to choose confidently. But new Drupal users who don't understand the difference between field-backed entities and static props (let's face it, there is enough talking past each other on this issue by people who we'd hope do) will neither be able to predict nor understand the relative consequences of a bad decision.
And compounding this, once you have decided, you're pretty much locked in once you've got content, and if you made what turns out to be a bad choice, this can compound your issues over time.

In my very early Drupal days I made the choice to use an alpha version of CCK instead of flexinode, and that probably saved me dozens or hundreds of hours of work later, but if I'd not picked CCK and gone with flexinode instead, maybe I'd have run into a dead end in 2006/7 and not even be here now.

One more example:

Let's take a text heavy site - say most of its content is 'long read' 1.5-5k word articles.

It uses a 'formatted text' component for paragraphs (or sometimes a handful of paragraphs, actual paragraphs not Drupal paragraphs) interspersed with images, sidebars etc. They have a 'formatted text + pull quote' component, which has formatted text + plain text pull quote + attribution. This pull quote is then shown above/left/right of the paragraph it's taken from inline.

When the site is built, there is no intention to use the pull quotes outside the default view mode of the article at all. But then they revamp their home page, and they want to be able to feature articles using an image and pull quote, skipping the title (for clickbait-y reasons probably).

At this point they have hundreds of articles with pull quotes.

Now either:
1. The component is already backed by a field union field because we didn't give them a choice - now they can use the first pull quote, or even an arbitrary configured delta for that field, when featuring the article.

2. The component is using static props, now they can either
- use a not-yet-existing method to pull the first available pull quote prop from the JSON tree structure
- add a new field and migrate all of their data into it, including removing it from the JSON tree
- or they can duplicate the pull quote into structured data when featuring an article (a 'landing page featured article component pull quote prop').
- add a new, hidden, field to the article content type for 'featured pull quote' which duplicates the data from the prop, added when an article is going to be featured.
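For a sense of what the first sub-option under 2 (the "not-yet-existing method") might involve, here is a sketch in Python. The tree/props shapes, the `pull_quote` prop name and the `first_prop` helper are all hypothetical.

```python
def first_prop(tree, props, prop_name):
    """Depth-first search, in document order, for the first component
    instance that has a value for prop_name."""
    stack = [tree]
    while stack:
        node = stack.pop()
        value = props.get(node["uuid"], {}).get(prop_name)
        if value is not None:
            return value
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(node.get("children", [])))
    return None

article_tree = {
    "uuid": "root",
    "children": [
        {"uuid": "text-1"},
        {"uuid": "quote-1"},
        {"uuid": "quote-2"},
    ],
}
article_props = {
    "text-1": {"body": "Opening paragraph"},
    "quote-1": {"body": "Second paragraph", "pull_quote": "First quote"},
    "quote-2": {"body": "Third paragraph", "pull_quote": "Second quote"},
}

featured_quote = first_prop(article_tree, article_props, "pull_quote")
```

This works per entity once the JSON is loaded, but it does nothing for filtering or sorting at the query level, which is where a field-backed value would still have the advantage.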

Now, there is also the possibility that they never, ever want to use the pull quotes outside the article context, like your New Years message example, but then what is the worst that can happen? Some extra fields available in views that never get used?

For example: a "New Year's hero" component, which would have statically assigned values: a statically defined image, and a statically defined text (silly marketing nonsense like "Celebrate the new year with the Drupal community!").

Why would you define an entire component for this use case though - i.e. what makes this a 'New Year's Hero' vs. just a Hero, except the actual content?

in the default layout for the article content type, and then no data at all would be stored per-article for this component instance (IOW: this component instance is meant to be static across all articles

This is configuration, which is not entity field data. I can see wanting to store static stuff at the config level in some way, but it has inherently different data integrity and storage considerations to entity fields.

in a new-and-details-TBD "Experience Builder Component" or "Experience Builder Pattern" config entity that allows creating a component tree (or even single component) to reduce repetitive actions

This is also a config entity.

In that latter case, there definitely is not an entity context in sight.

There may not be a specific entity context, but if components are limited by entity type and/or bundle, then there almost certainly will be an 'entity type/bundle' context and that then allows stub values to be provided for such a template via existing mechanisms like default values for fields.

A bit short for time so please excuse both the length and the brevity...
Comment about 1 year ago →
🇳🇿New Zealand john pitcairn
@Wim Leers:

That hero would be placed in e.g. the default layout of the article node type. It would never make sense to expose either the text or the image for the "New Year's hero" in Views.

I've certainly been asked to do things like a slideshow of featured articles using their hero image, article title, field from a related entity, etc. I definitely want to be able to get at that sort of thing via views.
Comment about 1 year ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
In this focus on fields and SDC components we have to make sure we don't forget about blocks. Embedding listing components (eg a view) in a page is a common use case that doesn't fit the current data model. Yes I know there's a block field module, but that's an extra layer on top that we currently don't need with layout builder
Comment about 1 year ago →
🇺🇸United States effulgentsia
I agree with #62 that it would be nice for even the static prop values within a node's XB components to be queryable with Views.

If we go with JSON storage for those prop values, this would require writing Views plugins for getting at the data that's inside of that JSON.

If we go with field_union storage for those prop values, we'd get the Views integration for free, but the downsides would be:

For every XB-enabled node type we'd need to create a field_union field per SDC that the site builder wants to use within that node type (and we'd need to programmatically add/remove those fields whenever the site builder (or code) changes which SDCs to enable for XB usage on that node type).

Not all SDC props map to fields. That's why https://www.drupal.org/project/experience_builder currently maps them to TypedData objects instead and is working on an adapter system to handle non 1:1 cases. I don't know off-hand how we would address these things with a field_union approach, but maybe it's possible.

For SDC props that hold non-scalar data (objects/arrays), those values might need to be in JSON, so even with a field_union approach we might want to end up writing Views plugins for data within JSON, at which point we could have just done that to begin with.

On balance, my hunch is that writing Views plugins for JSON will be less hassle than addressing all of the things that we'd need for a field_union solution, but of course my hunch could be wrong.
Comment about 1 year ago →
🇬🇧United Kingdom catch
I agree with #62 that it would be nice for even the static prop values within a node's XB components to be queryable with Views.

It would not only be views integration, it would also be making them available to manage display for different view modes. Also if we added views integration for them, then the 'benefit' of them not being available to views per Wim's point in #61 gets undermined.

Let's go back to the article example from #59.

Someone creates an 'article' XB type, and doesn't add a 'lead image' field API field because not every article will have one (a reasonable assumption). They then start adding content: a lot of the articles use the same lead image component at the top of the article, some don't have any image at all, just text.

Later on, they want to customise their tags listing page with views, so that it is a responsive grid of cards.

For the cards, they make a new 'card' view mode, and they want it to show the node title and the lead image if there is one.

If the 'lead image' component is backed by a field API field, then when they go to manage display for the 'card' view mode, it will be there available to use.

If the 'lead image' component is using static props in XB with no field behind it, then it's just not going to be there. This would be the case whether using field UI manage display or a bundle-level XB/layout configuration for the card view mode.

Neither manage display, nor XB, are going to be well placed to reference the 'static prop from the XB field for a different view mode' from another view mode. For manage display it would probably mean providing a computed field or something, all to avoid using fields in the first place.

Another way to create something 'card-like', although I would never recommend it to anyone, is using a 'views fields' display. That would require the views integration @effulgentsia mentioned above, but then you lose all the re-usability of view modes when you want to use the same card style in a different view another time.
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
@catch in #62:

Sorry, my suspicion-for-misunderstanding seemed plausible, at least now it's clear that that is not the case! 👍

RE: "hard to predict in advance": agreed!
RE: changing "static fields" to "real fields": yes, this would require significant extra infrastructure.

Why would you define an entire component for this use case though

This kind of use case (see also the "pattern" bit at the end of my comment #61) is in the product requirements. So I'll let @lauriii answer this.

This is configuration, which is not entity field data. I can see wanting to store static stuff at the config level in some way, but it has inherently different data integrity and storage considerations to entity fields. We don't need to worry about views use cases for bundle-level config entities at all.

That's not the point. The point is that not all of the component tree data (both component instances + the values for their props) need to live inside each individual entity, because that results in an update nightmare: changing the default layout for the "article" content type would then require each individual article entity to also be updated. That's what this is referring to. That is a design flaw in for example Acquia Site Studio we specifically want to avoid.

For example: say we're configuring articles to always have the same component at the top and the bottom, with complete content creator freedom in between the two. The bottom one has solely static props (with their static values defined in the bundle-level config entity), and the top one a mix of static and dynamic props (e.g. the article title is dynamic, but the background color is a static prop value for all articles). When interacting with an article, for the editing and viewing experience, it'd appear as if that top and bottom component are inherently part of the article, but at the storage level, the tree + props field properties/DB columns would not store the top and bottom component. They'd be retrieved from the config entity instead: the hydrated field property will combine the "local" (content entity) info with the "bundle" (config entity) info.
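
A minimal sketch of that hydration step, written in Python purely for illustration (XB itself is PHP). The names `BUNDLE_TEMPLATE` and `hydrate_tree`, and the `{"field": ...}` convention for dynamic props, are hypothetical and not XB's actual data model.

```python
from copy import deepcopy

# Hypothetical bundle-level config for "article": a fixed top and bottom
# component. Nothing in here is stored on individual article entities.
BUNDLE_TEMPLATE = {
    "top": {
        "component": "hero",
        "props": {
            "background": "#003366",        # static prop, same for every article
            "heading": {"field": "title"},  # dynamic prop, read from the entity
        },
    },
    "bottom": {"component": "footer_cta", "props": {"label": "Subscribe"}},
}


def hydrate_tree(template, entity_fields, local_components):
    """Combine "bundle" (config) info with "local" (content entity) info."""

    def resolve(instance):
        # Replace {"field": name} placeholders with the entity's field value.
        props = {
            name: entity_fields[value["field"]]
            if isinstance(value, dict) and "field" in value
            else value
            for name, value in instance["props"].items()
        }
        return {"component": instance["component"], "props": props}

    return (
        [resolve(template["top"])]
        + deepcopy(local_components)  # the free-form part stored per entity
        + [resolve(template["bottom"])]
    )
```

With this shape, changing the background color in the bundle-level template affects every article on the next load, without touching any stored article entity.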

Now, to ensure that data that conceivably could be reused/exposed via listings is reusable, the UX where you create the default layout for articles should make it easy to add fields to the data model.

I don't know what you're referring to when you write "inherently different data integrity and storage considerations to entity fields", but I'm hoping that the above helps paint a clearer picture of where the product requirements are taking our thinking.

But the fact that you can't use content entity field values in config entities is not a reason to avoid using content entity field values in config entities.

I explained above how we do envision that being possible.

(I know that there should be docs + diagrams, I'm working on those, but am interleaving that with responding here because I can sense some nervousness.)

@effulgentsia in #65: in addition to all that, we'd need to be able to say that for some SDC prop's field_union field, we should ignore one of the "unioned fields", because its value should actually come from a field in the entity type's data model (e.g. the Article's "Image" or "Title" field).

@catch in #66:

It would not only be views integration, it would also be making them available to manage display for different view modes.

Different view modes may have vastly different layouts (component trees). Asking a user to pick from dozens to hundreds of field union fields to avoid repeated data entry seems like quite the UX challenge? (🚨 I bet you have something different in mind.) This is why I've understood XB's product requirements to aim only for data that is actually stored as "structured data in fields in the data model" when crafting per-view mode component trees. (Perhaps 41. Conditional display of components could allow for components shared across view modes.)

[…] Later on, they want to customise their tags listing page with views […] If the 'lead image' component is using static props in XB with no field behind it, then it's just not going to be there.

This is true.

Neither manage display, nor XB, are going to be well placed to reference the 'static prop from the XB field for a different view mode' from another view mode.

This is also true.

But the assumptions I've understood in hearing @lauriii talk through the product requirements are that A) not all SDCs' prop values make sense to be reusable, and B) existing Drupal data model best practices should remain unchanged (plus we shouldn't be growing data models to be unwieldy). From that POV, it makes sense to allow for static values that aren't meant to show up anywhere else.

🚨: This is yet another example of where having concrete wireframes to look at would make a WORLD of difference in us all getting on the same page. @lauriii is leading the effort on that, and I hope he will be able to help ground this conversation more in the destination XB aims to reach.

I am going to stop responding here for a while and focus on actually writing docs + creating diagrams.
Comment about 1 year ago →
🇺🇸United States apmsooner
Ya know Panels → + ctools was doing all this type of stuff being discussed over 10 years ago. The fields were "panes" either from current node context or some other entity context. They could be rendered in whatever view mode needed. So a field_union type field can also have different rendering depending on view mode selected (i.e. hide/show/format subfields differently).

Other stuff like views, blocks, custom content panes, etc... were all able to be pulled into a layout. There were contextual rules built for who sees what and various layouts that could be assigned to a page. I'm not saying it was perfect but I built a lot of complex pages back in the day using panels when I was just a site builder. I assume there must have been valid reasons beyond my understanding to shift to layout builder. I only really build decoupled sites so I don't really have that much of a dog in this fight anyway but IMO layout builder was/is a considerable downgrade from what panels was already doing several years ago. I understand the panels configuration was still geared a bit more to site builders vs content authors but I personally think a lot could have been (and maybe still could be) adopted from that architecture that for whatever reason was just shelved. Perhaps I'm alone in thinking we're reinventing the wheel here maybe just a little bit?
Comment about 1 year ago →
🇬🇧United Kingdom catch
The point is that not all of the component tree data (both component instances + the values for their props) need to live inside each individual entity, because that results in an update nightmare: changing the default layout for the "article" content type would then require each individual article entity to also be updated. That's what this is referring to.... They'd be retrieved from the config entity instead

So configuration, stored in the config tables and exported to YAML. Specifically, none of this would end up in a JSON field on the entity which the proposal here is for. i.e. https://git.drupalcode.org/project/experience_builder/-/blob/0.x/src/Plu... cannot be added to a configuration entity.

I guess there is a possibility that when experience builder is applied to config (view modes, navigation module, full site chrome) it could use JSON-IN-YAML? But that's not implemented yet, the proof of concept in XB is specifically for entity content and that's also how I understand the original proposal here from @effulgentsia to be - otherwise why talk about JSON at all if we're dealing with config?

I don't know what you're referring to when you write inherently different data integrity and storage considerations to entity fields

Configuration entities have config dependencies which are calculated when the configuration is saved.

Content entities, even though their content depends on configuration, do not have the concept of 'config dependencies'; instead we have validation when fields are deleted or updated which checks whether there is content in those fields or not. Config entities also don't get rendered or have view modes, or all the other things that content entities need.

So the idea of 'static props' in a config entity is not a big deal, the discussion here has always been (for me at least) about what should be available to site builders and editors when layout builder is used to build the layout for a single content entity (layout overrides now), which is where all the issues with scalability and nested entities from custom blocks and paragraphs come into play.

They're really not issues for layouts-as-config-entities at the moment with the current data model and won't be in XB either regardless of what the data model is. All the scalability, data modelling, and irreversible database schema decisions happen with content entity page building.
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
#68: the UX envisioned by @lauriii for XB must be far better than what Panels allowed for. The UX for XB must be far better than anything the Drupal ecosystem has to offer today.

#69:

So configuration, stored in the config tables and exported to YAML. […]

Correct.

I guess there is a possibility that when experience builder is applied to config (view modes, navigation module, full site chrome) it could use JSON-IN-YAML?

Yes, default layout for articles would be stored in config, as would default layout for any entity type+bundle.

But that's not implemented yet, the proof of concept in XB is specifically for entity content and that's also how I understand the original proposal here

Correct, the 0.x branch is not there yet.

The combination of no PoC code for that existing + the overall UX not having visual representations to look at (other than the Figma file Q3 2023, which shows a completely blank canvas as the starting point for a new node), is what makes this conversation difficult. (See the 🚨 at the end of #67.)

I hope you see I'm very much trying to give you detailed answers, but only either the PoC or concrete wireframes can IMHO get this conversation to the next level.
Comment about 1 year ago →
🇬🇧United Kingdom catch
But that's not implemented yet, the proof of concept in XB is specifically for entity content and that's also how I understand the original proposal here

Correct, the 0.x branch is not there yet.

Right, and the proof of concept already allows for storing static props in entity content in the XB field, which is why 📌 Prevent modules from being uninstalled if they provide field types used in an Experience Builder field Fixed came up (the issue that led me to re-read this one again). My only (or at least main) objection is to storing things that are entity content as static props instead of field-backed, something which has already been implemented and is also being explicitly argued for in this issue and elsewhere.

There is no way we could ever add field union fields to config entities, so the discussion is just irrelevant for those, anything on config entities can't be stored in entity fields by definition. If there is a need to store static props in configuration entities that don't exist yet, that very well might be the case, but that's not the objection I'm raising at all.
Comment about 1 year ago →
🇦🇺Australia acbramley
I am going to stop responding here for a while and focus on actually writing docs + creating diagrams.

I think this would be really good, right now it's incredibly difficult for anyone that's not deep in the weeds with this stuff to follow along. Diagrams will help immensely with mental mapping when reading these lengthy discussions.

My one question is - Are we basically dialed into this proposed architecture from the IS? Or are we still exploring ideas such as the alternative presented in #32 and other ideas elsewhere in this issue? If so I think it'd be great to get a summary in the IS about these different approaches and what ones are really being considered. Again this will help people (like me) to skip over discussions that don't need to be understood and focus more on understanding the stuff that is likely to happen.
Comment about 1 year ago →
🇳🇱Netherlands casey
What about:

1. adding a new entity storage handler that stores all configurable fields in a single JSON database-column
2. adding dirty checking to entity API so we can prevent saving unchanged entities
3. adding paragraphs (and entity_reference_revisions) to core using this new storage handler
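
A rough sketch of points 1 and 2, in Python for illustration only. `JsonColumnStorage` and `fingerprint` are hypothetical names, and the in-memory dict stands in for a database table with a single JSON column; this is not an existing Drupal API.

```python
import hashlib
import json


def fingerprint(fields):
    """Stable hash of all configurable field values as one JSON document."""
    encoded = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


class JsonColumnStorage:
    """Toy stand-in for a storage handler with one JSON column per entity."""

    def __init__(self):
        self.rows = {}    # entity id -> (fingerprint, JSON string)
        self.writes = 0   # number of saves that actually wrote something

    def save(self, entity_id, fields):
        digest = fingerprint(fields)
        stored = self.rows.get(entity_id)
        if stored is not None and stored[0] == digest:
            return False  # dirty check: nothing changed, skip the write
        self.rows[entity_id] = (digest, json.dumps(fields, sort_keys=True))
        self.writes += 1
        return True
```

Sorting keys before hashing means two saves with the same values in a different key order count as unchanged.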

Comment about 1 year ago →

🇺🇸United States apmsooner

@casey, my understanding is they are trying to eliminate perhaps the paragraphs model being in the mix of layout builder. I feel like we need to define some sort of desired json data structure for the layout object that consists of references to different things. I'm providing a proof of concept idea here sort of similar to the builder.io example file but specific to Drupal. What I "think" we need is:

Field(s) from current entity
Field(s) from some other entity via context? (Maybe a way to add additional contexts similar to how Panels worked...)
Field properties? - this is the part where field_union may come into play if you wanted to extract a specific sub-field value from the field.
Various entities in desired teaser mode
Views display with ability to pass arguments
Inline content - this is maybe similar to the old Panels "custom content panes"? But maybe something like storage entities could work here...
Sections - Sections have children that compose any of the above as well as other sections.
I think the context part of this is important for lookups, caching, etc... If a field gets deleted for example, a lookup by field properties should update this object. This could perhaps be flattened as an all encompassing array on the object to avoid deep nested queries within the json?

Here's a rough idea of what I was thinking. Maybe it's not at all what is being considered but just thought I'd throw it out there as to me it makes sense to work out the data part first and build ux around that. I don't know that I would build in the props part as in css styles into this but rather maybe wrapping elements, classes, ids would be sufficient...

{
  "data": {
    "components": [
      {
        "type": "field",
        "context": {
          "type": "node",
          "bundle": "article",
          "name": "image",
          "id": 1,
          "view_mode": "teaser"
        }
      },
      {
        "type": "entity",
        "context": {
          "type": "block",
          "bundle": "basic",
          "id": 1,
          "view_mode": "default"
        }
      },
      {
        "type": "view",
        "context": {
          "id": "who_s_online",
          "display": "who_s_online_block",
          "args": null
        }
      },
      {
        "type": "section",
        "context": {
          "name": "section_1",
          "label": "Section 1"
        },
        "children": [
          {
            "type": "field",
            "context": {
              "type": "node",
              "bundle": "article",
              "name": "image",
              "id": 1,
              "view_mode": "full"
            }
          },
          {
            "type": "entity",
            "context": {
              "type": "block",
              "bundle": "basic",
              "id": 1,
              "view_mode": "teaser"
            }
          },
          {
            "type": "section",
            "context": {
              "name": "section_1_1",
              "label": "Section 1.1"
            },
            "children": [
              {
                "type": "view",
                "context": {
                  "id": "recent_articles",
                  "display": "recent_articles_block",
                  "args": [1]
                }
              },
              {
                "type": "view",
                "context": {
                  "id": "related_authors",
                  "display": "related_authors_block",
                  "args": [1, 2, 3]
                }
              }
            ]
          }
        ]
      }
    ]
  }
}
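
To illustrate the "flattened" lookup idea mentioned before the example, here is a small Python sketch that walks a structure shaped like the JSON above and lists every node with a dotted path, so that references to a field can be found without deep nested queries. The `flatten` and `field_references` helpers and the path format are assumptions for illustration, and the `layout` literal is a trimmed-down version of the example.

```python
def flatten(components, parent=None):
    """Yield (path, component) pairs for every node, depth first."""
    for index, component in enumerate(components):
        path = str(index) if parent is None else f"{parent}.{index}"
        yield path, component
        yield from flatten(component.get("children", []), path)


def field_references(layout, field_name):
    """Paths of every 'field' component that points at the given field."""
    return [
        path
        for path, component in flatten(layout["data"]["components"])
        if component["type"] == "field"
        and component["context"].get("name") == field_name
    ]


layout = {
    "data": {
        "components": [
            {"type": "field", "context": {"name": "image", "view_mode": "teaser"}},
            {
                "type": "section",
                "context": {"name": "section_1"},
                "children": [
                    {"type": "field", "context": {"name": "image", "view_mode": "full"}},
                    {"type": "view", "context": {"id": "recent_articles"}},
                ],
            },
        ]
    }
}
```

If the `image` field were deleted, `field_references(layout, "image")` would give the paths of the components that need updating or removing.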

Comment about 1 year ago →
🇷🇴Romania amateescu
Quoting @catch in #71:

My only (or at least main) objection is to storing things that are entity content as static props instead of field-backed, something which has already been implemented and is also being explicitly argued for in this issue and elsewhere.

This is also my main concern with the JSON-based storage proposed in this issue, and I'd really like us to arrive at a solution that stores most* data in individual entity fields. The goal of removing the indirection layer that Layout Builder and Paragraphs have with referenced entities is very much worthwhile, but going in the direction of storing everything in a blob is not much of an improvement IMO, quite the contrary.

* By "most" I mean that the majority usage pattern for building 'layout pages' is to fill data for a predefined list of components (at least in my experience), while one-off components like the "Christmas hero" example mentioned above are the exception, and I hope we can find a way to store that data without having it "dictate" the entire storage model.
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
* By "most" I mean that the majority usage pattern for building 'layout pages' is to fill data for a predefined list of components (at least in my experience) […]

Ah, that really explains it! 😄

I met a few hours ago with @lauriii, and he's got a great example that shows that that is not always the case. He will record a video where he walks you through that example (including visual aids), and I think that'll bring much more clarity to this conversation 😊 Stay tuned!
Comment about 1 year ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
Please see https://wimleers.com/xb-week-5 — in which I essentially apologize for the chaos and give you some insight into how we got to this place. 🙈

Me in #52:

I'm working on a diagram next to visualize the mental map I constructed based on months of conversations with @lauriii, @effulgentsia and others. Think https://www.drupal.org/docs/8/api/render-api/the-drupal-8-render-pipeline → , but for XB 🤓

→ 📌 [PP-1] Diagram tying the product requirements + decisions together Postponed , results: https://git.drupalcode.org/project/experience_builder/-/tree/0.x/docs/di...

Me in #52:

Most importantly: I also still think the trajectory XB is currently on is the simplest path to get something to work.

+ @catch in #57

The information that was absent from this issue but has been an important influence for having static props is now captured in full detail at 🌱 [META] 7. Content type templates — aka "default layouts" — clarify the tree+props data model Active . That, combined with the visual I referred to in #76 that @lauriii will soon add to that issue, should help us align here 🤞😊
Comment about 1 year ago →
🇺🇸United States michaellander
What about providing an alternative to fields for many props in SDCs (and elsewhere) that really have no need to be full-fledged fields (or at least wrapped in a single JSON field)? Fields are great for content, but when I think of low-code style modifiers (alignment, min width, max width, heading weights, levels, etc.), having a field for each is very heavy-handed and slow to build out for front-enders/builders, and doesn't provide a great admin interface without a lot of work. Perhaps letting fields focus on content and allowing display/style attributes to be simplified would alleviate some of the bloat.
Comment about 1 year ago →
🇬🇧United Kingdom catch
Discussed this with @lauriii @alexpott and @longwave and I think we were able to clarify some of the various positions here.

@lauriii pointed out that the JSON:API representation of an entity will have the structured fields (let's call them 'fixed' in this context) out of order, but with 'static props' (i.e. content stored in the JSON structure) then any 'loose' value will appear in the order that the content author added them. i.e. if there are 15 text paragraphs with an image in the middle that's what would come out naturally.

I am still concerned about the 'data lifecycle' aspects of this (i.e. the student testimonials example), but it's clearer to me what the trade-offs are between the two models now.

I think my 'lifecycle' concerns can be mitigated a fair bit by the following.

1. We should make sure single fields, field union, and static props are all available equally, so that site builders are able and encouraged to create 'field backed' components where they make sense, and don't default to static props because they're 'easier'. People could still make mistakes in that case, but at least the tools are available to them. This means that rather than 'JSON static props' vs. 'field union backed' models it's JSON static props and field union backed.

2. I opened 📌 Calculate field and component dependencies on save and store them in an easy to retrieve format Active in the experience builder issue queue to try to simplify the dependency tracking logic that is in there currently.

This needs a better write up and probably an issue summary update but trying to at least get some of that discussion documented for now.
Comment about 1 year ago →
🇬🇧United Kingdom catch
Realised something else which invalidates #79 a bit.

It is very similar to the 'testimonial' problem described above but much more fundamental. Might have a solution bridging things too though, or at least something towards one.

Both core search module and search_api module rely on view modes so that site builders can decide how content gets rendered into the search index. For core search module it hard codes the 'search' view mode, for search_api the view mode is configurable.

The search view mode almost always wants to have all of the field content that the default view mode does (body field especially), but without elements that would 'pollute' the search results like field labels, related content blocks, CTAs, social widgets etc. so that if I search for "newsletter" or "Facebook" I don't get every article on the site back in the results, but only the ones with content actually mentioning newsletters or Facebook.

This means the search view mode is 99.9% of the time going to want to render both the 'fixed' and 'loose' content that is entered for the default view mode.

There is a very similar problem if you wanted to show for example the 'full' content of the entity in a newsletter subscription via an e-mail view mode - simplenews does this I think. RSS is another one where this data needs to be available.

[…] we'd still need to support this concept of configuring/accessing 'loose' data from the default view mode in different view modes.
I'm pretty sure this means we can't escape/avoid the idea of view modes needing to pick out content from 'loose slots' from other view modes (or at least from the default view mode) - and in this case it's not just one or two fields like the testimonials example but pretty much everything in it. And I think we should avoid traversing nested array structures to do this.

@lauriii mentioned at devdays that the other tools that do similar things to xb don't have the concept of multiple view mode configuration at all. All configuration/page building happens on a single default view mode. Anything like a card or teaser mode would be done via a react widget consuming the API (like a custom entity template more than a display mode). They don't have this problem to deal with in the same way.

The combination of the devdays discussion and this problem may have given me an idea.

The main objection to the 'field union only' data model is that it results in a field union field per component. And the field deltas for that field wouldn't semantically match the order of data on the page. e.g. without consulting the layout, how would you know a video is placed in the middle of five text field deltas? And if it's a figure/illustration that is important.

The reason there has to be a field instance per field union is because the field union data structure is relational - each value lives in its own column, and you can't mix and match columns between union types.

This brings us back to JSON again but potentially in a different way.

When configuring view modes at the entity type/bundle level, there will be a concept of the slot that can hold 'loose' content. This means we know in advance that there will be some loose content or not, and how many slots, depending on the view mode.

Instead of a single JSON field storing both the layout configuration and the loose content in a tree, and instead of multiple field union fields as well as the tree, we could instead create a 'JSON field union' field type. This one field would allow multiple 'field union types' to be used and they just get stored as field deltas. This is in #3440578-15: JSON-based data storage proposal for component-based page building → (where I called it 'Layout builder + Alex B's JSON combo field').

The field-union-json field would be a multi-value field, and it would have an instance per-view-mode-with-loose-slots. Props for each component become a delta of this field, it would also need to store which 'field union' it is for each delta which could just be a varchar or a key in the JSON structure. Possibly could store the component used here too.

This means that in both the 'testimonial' case above and with search view modes, it would be possible to reference the values of this field from different view modes (and render them using different components if necessary).

Let's say I have an entity with two loose content sections in the default, between them is a newsletter signup component that is fixed. When I render the entity for the newsletter view mode or for search indexing, I want to show all of the loose field content, but not the newsletter signup (because it would look out of place in the newsletter itself and would pollute the search index).

I can do this in the newsletter/search view mode configuration by adding a 'fixed' component that renders all of the content of the 'loose' field from the default view mode with whichever components were used. (Head exploding emoji). This could either render the field content sequentially, or potentially even use the tree structure too in some cases.
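
A small Python sketch of that idea, under stated assumptions: `LOOSE_FIELD` stands for the deltas of one multi-value field-union-json field, `DEFAULT_TREE` is a flat stand-in for the default view mode's layout, and all key names (`union_type`, `loose_delta`, `fixed`) are made up for illustration.

```python
# Hypothetical deltas of one multi-value "field union JSON" field.
LOOSE_FIELD = [
    {"union_type": "text", "props": {"body": "Intro paragraph"}},
    {"union_type": "testimonial", "props": {"name": "Sam", "quote": "Great course"}},
    {"union_type": "text", "props": {"body": "Closing paragraph"}},
]

# Default view mode: the tree only references deltas; the signup is fixed.
DEFAULT_TREE = [
    {"loose_delta": 0},
    {"loose_delta": 1},
    {"fixed": "newsletter_signup"},
    {"loose_delta": 2},
]


def render_default(tree, deltas):
    """Default view mode: loose content interleaved with fixed components."""
    rendered = []
    for node in tree:
        if "loose_delta" in node:
            rendered.append(deltas[node["loose_delta"]]["props"])
        else:
            rendered.append({"component": node["fixed"]})
    return rendered


def render_search(deltas):
    """Search/newsletter view mode: every loose delta in order, nothing fixed."""
    return [delta["props"] for delta in deltas]
```

Because the search view mode reads the field deltas directly, the newsletter signup never reaches the index, and no nested tree has to be traversed to find the content.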

In the testimonials case, in my teaser view mode I could add a component and choose 'the first testimonial delta from the loose field from the default view mode', but importantly here in this case I can ignore the original component used and use a different one (e.g. just student name and text with no image).

By having one 'loose' field union json field for each slot that allows it, in the rare case that two view modes both have loose sections, it would keep the data separate (but still accessible from a third view mode like search in extreme cases) so we always store data contextually and don't reintroduce the interleaving problem.

There are some limitations here but hopefully good limitations:

1. The 'loose' field should only be used when you want to store actual content, not purely presentational data. So presentational components like a side by side would need to stay in the tree and reference the field-union-json field delta. If we don't do this, it implies nesting within the JSON and we'd lose the benefits of the flatter structure. Storing the presentational tree and the field content separately is the most complicated thing here but hopefully makes a lot of other things simpler.

2. Views would only be able to render this field (or individual deltas), not sort or filter or argument with it, this is how it should be and I think resolves one of Wim's concerns. If you want to sort or filter you are in 'real field' territory.

3. For the less common 'testimonials' case, we'd need to support finding deltas of only a certain union type - i.e. render the first component of type x or all components of type y. But this wouldn't require traversing a nested array to do so just some iteration.

4. It's not clear to me whether we need one loose field per view mode or one loose field per loose slot. It feels like one field per view mode is enough if we have to track deltas anyway.
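
Point 3 could look roughly like the following Python sketch: finding deltas of one union type is a flat iteration, and the teaser can pick a different component than the one used in the default view mode. The data, the helper names and the `testimonial_compact` component are hypothetical.

```python
LOOSE_FIELD = [
    {"union_type": "text", "props": {"body": "Intro"}},
    {"union_type": "testimonial",
     "props": {"name": "Sam", "quote": "Great course", "image": "sam.jpg"}},
    {"union_type": "testimonial",
     "props": {"name": "Alex", "quote": "Loved it", "image": "alex.jpg"}},
]


def deltas_of_type(deltas, union_type):
    """All (delta, item) pairs of one union type, found by plain iteration."""
    return [(i, item) for i, item in enumerate(deltas)
            if item["union_type"] == union_type]


def teaser_testimonial(deltas):
    """First testimonial, re-rendered with name and quote only (no image)."""
    matches = deltas_of_type(deltas, "testimonial")
    if not matches:
        return None
    _, item = matches[0]
    return {"component": "testimonial_compact",
            "props": {key: item["props"][key] for key in ("name", "quote")}}
```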

@lauriii mentioned an XB requirement that is an edge case, but is apparently supported in other systems and allowed by the tree structure, which I think this would continue to cover: if you had one component with static props, then another component that wanted to reference data from those static props. In this case the first one would be field union content and the second one would only be in the layout tree.

Not sure how solid this solution is, but the search issue does convince me this is a real problem with the 'everything in the tree' approach.
Comment about 1 year ago →
🇺🇸United States apmsooner
This is seemingly getting overly complex to me in terms of what I understood was the desire. I don't know why search indexing would need to have anything to do with layout (i.e. presentation) for an entity. Let search do its thing as is and just provide the means for laying out parts (fields) of the entity, views or other entities via context similar to how panels worked. I don't understand the idea of static props, loose fields, etc... Why not just have a designated generic entity type to replace paragraphs that can only be used in layouts so you don't have some weird separate block management administration? Keep content as fields and if people are smart, they build out their own field_union type fields or whatever else the way they want for optimized data storage.

The json storage part would just be accountable for the layout structure via sections and the elements composed within, identifiable via ids and such, so updates outside the layout update/delete accordingly wherever they are referenced in the json object. I posted an example https://www.drupal.org/project/drupal/issues/3440578#comment-15640468 ✨ JSON-based data storage proposal for component-based page building Active of what sort of made sense to me at least... I think cache tags would need to be accounted for whatever is rendered in the layout too?

Again, panels did all this stuff and more but had the same challenges of storing all this in a serialized blob. To me it makes sense to model some of this previous contrib work if we can replace it with json storage and improve the layout flexibility with wrapping sections, sections within sections (e.g. like mini panels did). We need a standardized data structure too, not only for rendering but for apis (rest/graphQL) for decoupled sites. Again, the idea of new things like static props, loose fields, SDC concerns me about whether the json structure will be standardized enough to be useful for future endeavors.
I can see people wanting to customize layout component forms and such that some of this other stuff just wouldn't seem to allow in comparison to fields.
Comment 12 months ago →
🇬🇧United Kingdom longwave UK
I am perhaps too late for this to be changed but regarding the "expression" syntax where we reference into the JSON structure to retrieve fields, is this an existing implementation of something or was this invented here? There is a spec called JSON Pointer with a PHP implementation that allows you to store a reference to a deeply nested JSON structure - wondering if we should use something standard here if possible in order to be more interoperable in future, ie. if we need to parse these expressions in both PHP and JavaScript.
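
For reference, JSON Pointer is specified in RFC 6901. A minimal resolver, sketched in Python for illustration (not the PHP implementation referred to above, and not XB's expression syntax), shows how small the standard is. It is simplified: array indices are not validated as strictly as the RFC requires, and the `-` token is not supported.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against decoded JSON data."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape in the order the RFC prescribes: '~1' first, then '~0'.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current
```

For example, against the layout JSON posted earlier in this issue, the pointer `/data/components/0/context/name` would resolve to `"image"`.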
Comment 12 months ago →
🇦🇺Australia acbramley
Quoting @catch from Slack:

I think we need to try to summarize some of the different models in that issue in the issue summary with pros and cons as well as documenting the use cases a bit better.

Big +1 for that :)
Comment 12 months ago →
🇬🇧United Kingdom catch
I've made a start on the issue summary, including a short requirements section as well as three of the options discussed here. Lots more to do but hopefully enough we can keep going from there.
Comment 11 months ago →
🇺🇸United States effulgentsia
as an optimization, don't actually store the JSON itself in these field values, but instead have a separate lookup table that maps a ... hash ... to the JSON value ... This way, duplicate values across revisions only duplicate the hash, not the full JSON.

Comment #14 asked to split that out into a separate issue. I finally got around to opening that: ✨ Add way to "intern" large field item values to reduce database size by 10x to 100x for sites with many entity revisions and/or languages Active . No code yet written for it, I only filed the issue.
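
Since no code exists for that issue yet, the following is only an illustrative Python sketch of the interning idea as quoted above: field values store a hash, and a separate lookup table maps each hash to the JSON value once. `InternTable` and the choice of SHA-256 are assumptions, not a proposed implementation.

```python
import hashlib
import json


class InternTable:
    """Toy lookup table mapping a content hash to its JSON-encoded value."""

    def __init__(self):
        self.values = {}  # hash -> JSON string, stored once per distinct value

    def intern(self, value):
        encoded = json.dumps(value, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(encoded.encode("utf-8")).hexdigest()
        self.values.setdefault(digest, encoded)
        return digest  # this is what the field item would store

    def load(self, digest):
        return json.loads(self.values[digest])


table = InternTable()
props = {"heading": "Welcome", "body": "Long text " * 50}
# Two revisions with identical props only duplicate the hash.
revisions = [table.intern(props), table.intern(dict(props))]
```

Here `revisions` holds the same hash twice while `table.values` holds the full JSON once, which is where the storage saving across revisions and languages would come from.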
Comment 9 months ago →
🇺🇸United States effulgentsia
@catch, I, and others met a few days ago to discuss option #3 in the proposed resolution. I included that in an XB refactoring proposal in 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active .
Status changed to RTBC about 1 month ago, 7:23pm 22 May 2025
Comment about 1 month ago →
🇬🇧United Kingdom catch
📌 [PP-1] Consider not storing the ComponentTreeStructure data type as a JSON blob Postponed just landed which implements something very close to #3440578-80: JSON-based data storage proposal for component-based page building → / option #3 in the issue summary.

Since that just landed today, it's obviously very new in practice, but I think that means we can move this issue to RTBC now.
Comment about 1 month ago →
🇬🇧United Kingdom catch
Actually this is probably better postponed on 📌 Refactor the XB field type to be multi-valued, to de-jsonify the tree, and to reference the field_union type of the prop values Active which is tracking the related issues in experience builder.
Comment about 1 month ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
Reflecting that #3477428 is itself PP-1 🤓

Captured in the meta → as of #3520449-31: [META] Production-ready data storage → . 👍
