ViewsData should not cache by language

Created on 9 August 2023, over 1 year ago
Updated 12 December 2023, about 1 year ago

Problem/Motivation

Discovered in πŸ“Œ Add a cache prewarm API and use it to distribute cache rebuids after cache clears / during stampedes Needs work .

Views data has to be cached by language, because some code calls t() when building strings. It's also one of the last examples of a giant arrayPI info hook.

Steps to reproduce

Proposed resolution

Add a new value object, with an interface, that implements ArrayAccess for backwards compatibility, pseudocode follows:

From views_views_data()

Before:

 $data['views']['entity_' . $entity_type_id] = [
        'title' => t('Rendered entity - @label', ['@label' => $label]),
        'help' => t('Displays a rendered @label entity in an area.', ['@label'  => $label]),
        'area' => [
          'entity_type' => $entity_type_id,
          'id' => 'entity',
        ],
      ];
    }

After:

 $data['views']['entity_' . $entity_type_id] = new ViewsDataEntityLabel($entity_type_id);
class ViewsDataEntityLabel extends ArrayAccess implements ViewsDataInterface {
// ....... stuff
}

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

πŸ› Bug report
Status

Needs work

Version

11.0 πŸ”₯

Component
ViewsΒ  β†’

Last updated 1 day ago

Created by

πŸ‡¬πŸ‡§United Kingdom catch

Live updates comments and jobs are added and updated live.
  • Needs manual testing

    The change/bugfix cannot be fully demonstrated by automated testing, and thus requires manual testing in a variety of environments.

Sign in to follow issues

Comments & Activities

  • Issue created by @catch
  • Status changed to Needs review over 1 year ago
  • last update over 1 year ago
    29,941 pass, 1 fail
  • πŸ‡³πŸ‡±Netherlands Lendude Amsterdam
  • πŸ‡¬πŸ‡§United Kingdom catch

    Here's a patch.

  • Status changed to Needs work over 1 year ago
  • πŸ‡¬πŸ‡§United Kingdom catch

    OK that's good, just test assumptions need to change.

  • πŸ‡¬πŸ‡§United Kingdom catch
  • Status changed to Needs review over 1 year ago
  • last update over 1 year ago
    29,958 pass
  • πŸ‡³πŸ‡±Netherlands Lendude Amsterdam

    This strips all the :en in the test and makes it go green locally.

  • Status changed to RTBC over 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States neclimdul Houston, TX

    Its green, makes sense, and the patch looks clean so RTBC. This is all cache and service stuff so the level of BC provided makes sense to me and safe.

    I think because of the class name the idea initially had some warning bells ringing for me so I did some digging to see if there was something we where missing and now I actually think this is correct and long overdue. I just don't like the name of this class now. 🀣

    In case it helps other reviewers, the backstory I found digging through git logs looks something like this.

    The original intent is lost to time and maybe Merlin's memory because this dates back to Views 6.x-2.x where it was added in one giant TGGM commit consolidating broken-ish CVS history. Somewhere in CVS you could have maybe pieced it together but its hard to say.

    Even without that initial commit(or direct access to Merlin's brain), there is a lot that can be gleamed from the 6 and 7 implementations. Prior to 8 this was part of the function views_cache_set function which optionally added the language code. My informed guess from working in the views/panels ecosystem is this is a generic cache wrapper designed to make it easier to write both raw data and rendered data in a views-centric way and originally you probably only would want to opt in to using the langcode for render caches to improve cache hits. But also, it probably became obvious pretty quickly not including it for "raw" data often broke because it would include translated string. Then early d8 work at including views probably logically decided to just always add the language as the only safe option.

    At the same time this was being ported into d8, TranslatableMarkup was solving the translated strings DI problems especially for things like plugin definitions that needed to be cached and serialized. Also views was moving all its render caches into render cache(or maybe that already happened?) meaning this was no longer a generic cache wrapper. Today, now that both of those solutions are firmly in place, and as demonstrated in the IS markup is safely serialized, this class's addition of the language code should serve no purpose.

  • last update over 1 year ago
    29,958 pass
  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    Are you sure this also works with dynamic elements, views data includes field configuration labels for example? I would be very surprised if our tests cover that.

    We initially also didn't have language in the entity field definitions cache but had to add it back again and still have some leftover bugs there.

  • Status changed to Needs work over 1 year ago
  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    Setting to needs work to test that, with the patch:

    Install umami, make sure a field like Body or something else has a label translation to another language for example on the node edit form. add it if not. Then go to the views UI for both languages, make sure it shows up in the correct language also when you add body as a field to a view in both languages.

  • πŸ‡³πŸ‡±Netherlands Lendude Amsterdam

    @Berdir is right, that breaks with the patch applied, shame.

  • πŸ‡¬πŸ‡§United Kingdom catch

    That is very annoying, although less annoying than committing this and finding out weeks later that we broke it.

    Two thoughts:

    1. We could use test coverage here.
    2. Is it worth opening an issue to add ConfigurationTranslatableMarkup or something like that which would store config object + key and get the translated value on demand?

    I tried to find the issue that Berdir mentioned and found a couple.

  • πŸ‡©πŸ‡ͺGermany donquixote

    2. Is it worth opening an issue to add ConfigurationTranslatableMarkup or something like that which would store config object + key and get the translated value on demand?

    We can introduce a range of objects with the purpose of providing language-specific strings, but which can be cached in a cross-language cache. But this solves only part of the problem.

    For BC reasons, we still have to support hooks that provide language-specific values, and that don't use these objects.

    This gets especially frustrating with alter hooks:
    One record of views data is potentially modified by multiple alter hooks, and each of them could introduce values that need caching by language.

    I was going to suggest a key 'needs_cache_by_language' in the views data, with the default being TRUE.
    But this does not really work with the alter hooks, because one alter hook does not know if the other alter hooks will add or already have added language-specific data.

    An alternative could be to introduce new hooks that are language-neutral.
    Over time, modules should convert to the new hooks.

    I don't like it much, because the current list of hook names is alread confusing enough.

    On the other hand: If we are willing to break BC and do this in a new major version, then maybe it is time to drop or rewrite or replace this views data system altogether..

  • πŸ‡©πŸ‡ͺGermany donquixote

    I was going to suggest a key 'needs_cache_by_language' in the views data, with the default being TRUE.

    If we are ok with a bit of BC breaking, we could make this key required, or default to FALSE.
    Then hooks that insert language-specific data need to declare this explicitly, otherwise it will be cached across languages.

  • πŸ‡¬πŸ‡§United Kingdom catch

    I had an idea of a value object with methods that would return the strings, replacing arrays, we could use proper methods but also ArrayAccess for bc:

    From views_views_data()

    Before:

     $data['views']['entity_' . $entity_type_id] = [
            'title' => t('Rendered entity - @label', ['@label' => $label]),
            'help' => t('Displays a rendered @label entity in an area.', ['@label'  => $label]),
            'area' => [
              'entity_type' => $entity_type_id,
              'id' => 'entity',
            ],
          ];
        }
    

    After:

     $data['views']['entity_' . $entity_type_id] = new ViewsDataEntityLabel($entity_type_id);
    
    class ViewsDataEntityLabel extends ArrayAccess implements ViewsDataInterface {}
    

    That way the actual strings would be generated when the explicit methods (or array access methods) are called on runtime, no caching required.

    If we did that, once we'd converted core, we could then issue a deprecation when something doesn't implement ViewsDataInterface.

    The array access would maintain (mostly) bc for alter hooks in the meantime.

  • πŸ‡©πŸ‡ͺGermany donquixote

    The views data contains nested objects.
    So we would have a top-level object for the table / entity type, and then dedicated objects per field, relationship etc.
    In fact currently relationships are sub-records under the respective field.

    Some alter hooks target the specific sub-records.
    E.g. image_field_views_data_views_data_alter() adds a relationship to an existing field.

      $data['file_managed'][$pseudo_field_name]['relationship'] = [...]
    

    So yeah we could use the objects with ArrayAccess BC layer.
    But we still have to determine if we run this for every language or just once.
    If we don't run it for every language, then all implementations that provide language-specific strings need updating.

    Another idea could be plugins for views data - if we already use objects.
    But this has the same problem - if we convert all the core views data to plugins, then will the alter implementations in contrib still work?

  • πŸ‡¬πŸ‡§United Kingdom catch

    Wasn't 100% sure whether nested ArrayAccess would actually work with alters, but seems like it might: https://www.php.net/manual/en/class.arrayaccess.php

    If we don't run it for every language, then all implementations that provide language-specific strings need updating.

    Yes this is what I was thinking, but if can issue a deprecation when the definitions don't implement ArrayAccess we could eventually enforce that in a major release.

  • πŸ‡©πŸ‡ͺGermany donquixote

    There are some operations that are not supported for ArrayAccess.
    One is the union operator +.
    Then all the functions with explicit array type checks.
    https://3v4l.org/v1GqI
    (ArrayObject is a default implementation for ArrayAccess)

    Another thing is that arrays, if passed by value, can be modified without a side effect, whereas with objects you change the properties of the original original object.
    This can lead to some hard to detect bugs if they are replaced by ArrayAccess.

    Also, if the data objects mostly mimick arrays, and don't encapsulate any behavior besides the fetching of entity labels or field labels, then I would prefer to have regular arrays, with objects only for the specific fields that have language-specific values. These single-purpose classes will also be easier to reuse.
    On the other hand, if we implement a lot of the values as "behavior" of an object, then it becomes harder for alter hooks to replace specific values.

  • πŸ‡¬πŸ‡§United Kingdom catch

    There are some operations that are not supported for ArrayAccess.
    One is the union operator +.
    Then all the functions with explicit array type checks.
    https://3v4l.org/v1GqI
    (ArrayObject is a default implementation for ArrayAccess)

    This is true but I think it's unlikely that modules will be doing advanced array implementations on views_data output, and what gets fed to alter hooks is considered @internal, so it it works for 99% of alter hooks out there, it would be OK if a few fail - but to me it's quite likely that zero would fail.

    then I would prefer to have regular arrays, with objects only for the specific fields that have language-specific values.
    

    I think that could work, and it would be less effort, so something more like:

    'title' => new ViewsDataEntityTypeTitle($entity_type_id)

    And then we'd need to trigger deprecations when these are strings instead of objects implementing a specific class eventually.

  • πŸ‡¬πŸ‡§United Kingdom catch
Production build 0.71.5 2024