ViewsData should not cache by language

Issue created by @catch
Status changed to Needs review almost 2 years ago8:15am 9 August 2023
Open in Jenkins → Open on Drupal.org →
Environment: PHP 8.2 & MySQL 8
last update almost 2 years ago
29,941 pass, 1 fail
Comment almost 2 years ago →
🇳🇱Netherlands Lendude Amsterdam
catch → credited Lendude → .
Comment almost 2 years ago →
🇬🇧United Kingdom catch
Here's a patch.
Comment almost 2 years ago →
System Message
The last submitted patch, 3: 3380145.patch, failed testing. View results →
Status changed to Needs work almost 2 years ago9:45am 9 August 2023
Comment almost 2 years ago →
🇬🇧United Kingdom catch
OK that's good, just test assumptions need to change.
Comment almost 2 years ago →
🇬🇧United Kingdom catch
Status changed to Needs review almost 2 years ago2:41pm 10 August 2023
Open in Jenkins → Open on Drupal.org →
Environment: PHP 8.2 & MySQL 8
last update almost 2 years ago
29,958 pass
Comment almost 2 years ago →
🇳🇱Netherlands Lendude Amsterdam
This strips all the :en in the test and makes it go green locally.
Status changed to RTBC almost 2 years ago7:39pm 10 August 2023
Comment almost 2 years ago →
🇺🇸United States neclimdul Houston, TX
Its green, makes sense, and the patch looks clean so RTBC. This is all cache and service stuff so the level of BC provided makes sense to me and safe.

I think because of the class name the idea initially had some warning bells ringing for me so I did some digging to see if there was something we where missing and now I actually think this is correct and long overdue. I just don't like the name of this class now. 🤣

In case it helps other reviewers, the backstory I found digging through git logs looks something like this.

The original intent is lost to time and maybe Merlin's memory because this dates back to Views 6.x-2.x where it was added in one giant TGGM commit consolidating broken-ish CVS history. Somewhere in CVS you could have maybe pieced it together but its hard to say.

Even without that initial commit(or direct access to Merlin's brain), there is a lot that can be gleamed from the 6 and 7 implementations. Prior to 8 this was part of the function views_cache_set function which optionally added the language code. My informed guess from working in the views/panels ecosystem is this is a generic cache wrapper designed to make it easier to write both raw data and rendered data in a views-centric way and originally you probably only would want to opt in to using the langcode for render caches to improve cache hits. But also, it probably became obvious pretty quickly not including it for "raw" data often broke because it would include translated string. Then early d8 work at including views probably logically decided to just always add the language as the only safe option.

At the same time this was being ported into d8, TranslatableMarkup was solving the translated strings DI problems especially for things like plugin definitions that needed to be cached and serialized. Also views was moving all its render caches into render cache(or maybe that already happened?) meaning this was no longer a generic cache wrapper. Today, now that both of those solutions are firmly in place, and as demonstrated in the IS markup is safely serialized, this class's addition of the language code should serve no purpose.
Open in Jenkins → Open on Drupal.org →
Environment: PHP 8.2 & MySQL 8
last update almost 2 years ago
29,958 pass
Comment almost 2 years ago →
🇨🇭Switzerland berdir Switzerland
Are you sure this also works with dynamic elements, views data includes field configuration labels for example? I would be very surprised if our tests cover that.

We initially also didn't have language in the entity field definitions cache but had to add it back again and still have some leftover bugs there.
Status changed to Needs work almost 2 years ago9:09am 14 August 2023
Comment almost 2 years ago →
🇨🇭Switzerland berdir Switzerland
Setting to needs work to test that, with the patch:

Install umami, make sure a field like Body or something else has a label translation to another language for example on the node edit form. add it if not. Then go to the views UI for both languages, make sure it shows up in the correct language also when you add body as a field to a view in both languages.
Comment almost 2 years ago →
🇳🇱Netherlands Lendude Amsterdam
@Berdir is right, that breaks with the patch applied, shame.
Comment almost 2 years ago →
🇬🇧United Kingdom catch
That is very annoying, although less annoying than committing this and finding out weeks later that we broke it.

Two thoughts:

1. We could use test coverage here.
2. Is it worth opening an issue to add ConfigurationTranslatableMarkup or something like that which would store config object + key and get the translated value on demand?

I tried to find the issue that Berdir mentioned and found a couple.
Comment almost 2 years ago →
🇩🇪Germany donquixote
2. Is it worth opening an issue to add ConfigurationTranslatableMarkup or something like that which would store config object + key and get the translated value on demand?

We can introduce a range of objects with the purpose of providing language-specific strings, but which can be cached in a cross-language cache. But this solves only part of the problem.

For BC reasons, we still have to support hooks that provide language-specific values, and that don't use these objects.

This gets especially frustrating with alter hooks:
One record of views data is potentially modified by multiple alter hooks, and each of them could introduce values that need caching by language.

I was going to suggest a key 'needs_cache_by_language' in the views data, with the default being TRUE.
But this does not really work with the alter hooks, because one alter hook does not know if the other alter hooks will add or already have added language-specific data.

An alternative could be to introduce new hooks that are language-neutral.
Over time, modules should convert to the new hooks.

I don't like it much, because the current list of hook names is alread confusing enough.

On the other hand: If we are willing to break BC and do this in a new major version, then maybe it is time to drop or rewrite or replace this views data system altogether..
Comment almost 2 years ago →
🇩🇪Germany donquixote
I was going to suggest a key 'needs_cache_by_language' in the views data, with the default being TRUE.

If we are ok with a bit of BC breaking, we could make this key required, or default to FALSE.
Then hooks that insert language-specific data need to declare this explicitly, otherwise it will be cached across languages.
Comment almost 2 years ago →
🇬🇧United Kingdom catch
I had an idea of a value object with methods that would return the strings, replacing arrays, we could use proper methods but also ArrayAccess for bc:

From views_views_data()

Before:

$data['views']['entity_' . $entity_type_id] = [ 'title' => t('Rendered entity - @label', ['@label' => $label]), 'help' => t('Displays a rendered @label entity in an area.', ['@label' => $label]), 'area' => [ 'entity_type' => $entity_type_id, 'id' => 'entity', ], ]; }
After:

$data['views']['entity_' . $entity_type_id] = new ViewsDataEntityLabel($entity_type_id);
class ViewsDataEntityLabel extends ArrayAccess implements ViewsDataInterface {}
That way the actual strings would be generated when the explicit methods (or array access methods) are called on runtime, no caching required.

If we did that, once we'd converted core, we could then issue a deprecation when something doesn't implement ViewsDataInterface.

The array access would maintain (mostly) bc for alter hooks in the meantime.
Comment almost 2 years ago →
🇩🇪Germany donquixote
The views data contains nested objects.
So we would have a top-level object for the table / entity type, and then dedicated objects per field, relationship etc.
In fact currently relationships are sub-records under the respective field.

Some alter hooks target the specific sub-records.
E.g. image_field_views_data_views_data_alter() adds a relationship to an existing field.

$data['file_managed'][$pseudo_field_name]['relationship'] = [...]
So yeah we could use the objects with ArrayAccess BC layer.
But we still have to determine if we run this for every language or just once.
If we don't run it for every language, then all implementations that provide language-specific strings need updating.

Another idea could be plugins for views data - if we already use objects.
But this has the same problem - if we convert all the core views data to plugins, then will the alter implementations in contrib still work?
Comment almost 2 years ago →
🇬🇧United Kingdom catch
Wasn't 100% sure whether nested ArrayAccess would actually work with alters, but seems like it might: https://www.php.net/manual/en/class.arrayaccess.php

If we don't run it for every language, then all implementations that provide language-specific strings need updating.

Yes this is what I was thinking, but if can issue a deprecation when the definitions don't implement ArrayAccess we could eventually enforce that in a major release.
Comment almost 2 years ago →
🇩🇪Germany donquixote
There are some operations that are not supported for ArrayAccess.
One is the union operator +.
Then all the functions with explicit array type checks.
https://3v4l.org/v1GqI
(ArrayObject is a default implementation for ArrayAccess)

Another thing is that arrays, if passed by value, can be modified without a side effect, whereas with objects you change the properties of the original original object.
This can lead to some hard to detect bugs if they are replaced by ArrayAccess.

Also, if the data objects mostly mimick arrays, and don't encapsulate any behavior besides the fetching of entity labels or field labels, then I would prefer to have regular arrays, with objects only for the specific fields that have language-specific values. These single-purpose classes will also be easier to reuse.
On the other hand, if we implement a lot of the values as "behavior" of an object, then it becomes harder for alter hooks to replace specific values.
Comment almost 2 years ago →
🇬🇧United Kingdom catch
There are some operations that are not supported for ArrayAccess.
One is the union operator +.
Then all the functions with explicit array type checks.
https://3v4l.org/v1GqI
(ArrayObject is a default implementation for ArrayAccess)

This is true but I think it's unlikely that modules will be doing advanced array implementations on views_data output, and what gets fed to alter hooks is considered @internal, so it it works for 99% of alter hooks out there, it would be OK if a few fail - but to me it's quite likely that zero would fail.

then I would prefer to have regular arrays, with objects only for the specific fields that have language-specific values.
I think that could work, and it would be less effort, so something more like:

'title' => new ViewsDataEntityTypeTitle($entity_type_id)

And then we'd need to trigger deprecations when these are strings instead of objects implementing a specific class eventually.
Comment over 1 year ago →
🇬🇧United Kingdom catch

ViewsData should not cache by language

Problem/Motivation

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

Comments & Activities