String token values get returned as markup casted back to string

Created on 10 July 2023, 12 months ago
Updated 12 June 2024, 16 days ago

Problem/Motivation

If there is a node field of type plain text, which contains e.g. a & b and is being used as a token, e.g. [node:field_test:value], then the token system returns a & b.

Steps to reproduce

Create a node with a field test and create a node which contains a & b in the test field. Then run drush php and execute this script:

<?php

use Drupal\node\Entity\Node;

/** @var \Drupal\eca\Token\TokenServices $token */
$token = \Drupal::service('eca.token_services');
$node = Node::load(1); // <= replace the ID with your node ID.

$token->addTokenData('entity', $node);
$result = $token->replace('Hallo [entity:field_test:value] and good bye.');
print($result);

The output is a &amp; b.

Proposed resolution

This happens because \Drupal\Core\Utility\Token::doReplace is being called by \Drupal\Core\Utility\Token::replace with $markup = TRUE. That converts the found string value into new HtmlEscapedText($value) which will later be casted back to string during str_replace and that converts all special characters into HTML entities.

If doReplace were called by \Drupal\Core\Utility\Token::replacePlain instead, then no markup would get involved and everything would work as expected.

Two fun facts: nothing in Drupal ever calls replacePlain except a unit test. But if ECA's TokenDecoratorTrait were calling replacePlain instead of replace, everything would be working as expected.

As tokens in ECA are used to grab values to use them elsewhere and not to be rendered, the expected behaviour is to get the same value without that being altered. So, ECA should call replacePlain, I suspect.

📌 Task
Status

Fixed

Version

2.0

Component

Code

Created by

🇩🇪Germany jurgenhaas Gottmadingen

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @jurgenhaas
  • Open in Jenkins → Open on Drupal.org →
    Core: 10.0.7 + Environment: PHP 8.1 & MySQL 8
    last update 12 months ago
    289 pass
  • Status changed to Needs review 12 months ago
  • 🇩🇪Germany jurgenhaas Gottmadingen
  • Status changed to RTBC 12 months ago
  • I have looked to this issue, MR!375 is resolving the issue as replacePlain do not contain Markup Involvement so, it doesn't convert special char to html markups, hence works as expected, Adding before after screenshots for reference..

  • Status changed to Needs review 12 months ago
  • 🇩🇪Germany jurgenhaas Gottmadingen

    Thanks @kushagra.goyal for your review. Just to let you know, providing screenshots of before and after is not helpful and TBH not wanted by module maintainers. If you describe what you've done to review a patch, and what the outcome was, there is no need to proof that with screenshots.

    That said, what we need for this issue to move forward is to discuss the consequences of this change. Is it the right thing to do here? And is it safe to use a core function that's not used anywhere else? Is there a risk that this function would disappear in a future version of Drupal core?

  • 🇩🇪Germany mxh Offenburg

    But if ECA's TokenDecoratorTrait were calling replacePlain instead of replace, everything would be working as expected.

    "Everything" refers to a certain use case you're looking at. For other cases that are covered so far, there are different expectations, and they expect the token system to behave how it behaves right now.

    The token service is being used outside of ECA as well, and this change would break those components. It should be obvious, because Token::replace and Token::replacePlain are two public methods with different purposes as documented by their method comments. Their different purposes are also highlighted here: https://www.drupal.org/node/3232491

    As a conclusion, changing the default replacement behavior by enforcing replacePlain within replace is certainly not the right way. Therefore, this is more likely a support request rather than a bug.

    Rendering the contents of a plain text field is expected to be automatically escaped, as its contents may come from an untrusted source. When you want to have un-escaped values, you'd need to transform it accordingly. Right now I think you can do the following:

    • Use "Token: set value" using only the token that referes to the desired field value (see attached screenshot) and use that new token in your successors. That is supposed to work, because that action will set a new value wrapped as trusted markup.
    • In case your field value may contain HTML tags that you don't want to include, you could additionally strip out the tags using "Tamper: strip tags" afterwards (this action is provided by the eca_tamper module)

    Apart from that options, we could also consider supporting the newish replacePlain method. It could be an additonal option in the new "Token: replace" action, a completely new action "Token: replace plain" or a flagging action that sets a runtime flag for the token service to use replacePlain as long as the flag is set.

  • Status changed to Active 12 months ago
  • 🇩🇪Germany mxh Offenburg
  • 🇩🇪Germany jurgenhaas Gottmadingen

    That's what I had in mind too, thanks for confirming this.

    I'm just not quite convinced about the expectation bit. Imagine someone uses the "Compare two scalar values" condition where they compare [entity:field_text:value] with the string Terms & Conditions. My expectation is that this returns true, if the field value is Terms & Conditions, but it will return false because the token replacement gives us Terms &amp; Conditions which is not the same as the string.

    Also, working with tokens in the ECA context, I never expected any rendering. So, "Rendering the contents of a plain text field is expected to be automatically escaped" might be correct in the context of tokens being used when preparing content for output. But within ECA, most use cases seem to be data manipulation and comparison, not rendering.

    That said, you're absolutely right that the token behaviour outside ECA context needs to remain unaffected. The trouble is, that within the ECA context, some processing may rely on Drupal's default token behaviour while other tasks don't. To resolve this issue, users have to be given control over how things are handled.

    We may be lucky though: as we've now tagged all fields, that support tokens, with '#eca_token_reference' => TRUE,, we could automatically add a checkbox to each of those fields that indicates, if that field should be replaced in plain or in rendering mode.

  • 🇩🇪Germany mxh Offenburg

    We have many rendering aspects in ECA, for example

    • System action for Displaying a message to the user
    • System action for Sending an e-mail
    • ECA Render sub-module
    • ...

    Imagine someone uses the "Compare two scalar values" condition where they compare [entity:field_text:value] with the string Terms & Conditions. My expectation is that this returns true, if the field value is Terms & Conditions, but it will return false because the token replacement gives us Terms & Conditions which is not the same as the string.

    In scope of comparisons, it might be fine to automatically use the plain method instead. Maybe even for conditons in general. But I haven't thought about that thoroughly, whether that might have any negative consequence.

    we could automatically add a checkbox to each of those fields that indicates, if that field should be replaced in plain or in rendering mode.

    Not sure whether users will be overwhelmed by this. Probably almost no one will understand this. Every text input supporting tokens, each one having addtional option, sounds cumbersome.

  • 🇩🇪Germany jurgenhaas Gottmadingen

    Not sure whether users will be overstrained by this. Probably almost no one will understand this.

    There is already a lot about tokens, that users have trouble to understand. We need to address that with documentation. But then, I don't see why that aspect shouldn't be understandable.

    Every text input supporting tokens, each one having addtional option, sounds cumbersome.

    Still, it sounds as if this is the only way to give users the necessary control. Or is there an alternative solution that doesn't require extra actions for each other action that makes use of tokens?

  • 🇩🇪Germany danielspeicher Steisslingen

    As far as I understand, we just have the two reasonable options:

    Add a checkbox or create a new action.

    I would go for the checkbox, because it is much easier to maintain the model, the action is used in. And yes, there might be a lack of understanding what this option does. So we must assure, that we write tests for this case and document it accordingly.

  • 🇦🇹Austria SherlockHomeless

    I had this problem too, quite recently, and the way I worked around it, was to use the "Tamper: decode HTML" action from the ECA Tamper module after a form submission event.

  • 🇩🇪Germany mxh Offenburg

    There is another option: Introducing a new [plain:*] token prefix, which can then make use of the replacePlain method. That token prefix can be easily defined via regular token info and replacement hooks, and does not require any upgrade path. Plus it can be additionally documented via token browser. For example [entity:field_test:value] would be instead used as [plain:entity:field_test:value].

  • 🇩🇪Germany jurgenhaas Gottmadingen

    This idea sounds great @mxh. Could that be implemented by simply checking for the token name to start with plain: and then removing that prefix and leaving the rest to the regular processing?

  • 🇩🇪Germany mxh Offenburg

    Could that be implemented by simply checking for the token name to start with plain: and then removing that prefix and leaving the rest to the regular processing?

    No special handling required. It would be a new token type "plain", which would be handled as usual by a regular token hook.

  • 🇩🇪Germany jurgenhaas Gottmadingen

    Oh, even better. I just have no idea how to implement that. Would you have time to provide an MR for that?

  • Assigned to mxh
  • 🇩🇪Germany jurgenhaas Gottmadingen

    Tagging for back port as we got approached by several users who thought this was a bug. This is debatable, but if in any way possible, we should consider back porting this when we have a solution.

    Assigning to you @mxh, please undo if you can't get around to tackling this.

  • Open in Jenkins → Open on Drupal.org →
    Core: 10.1.x + Environment: PHP 8.1 & MySQL 8
    last update 11 months ago
    296 pass
  • @mxh opened merge request.
  • Issue was unassigned.
  • Status changed to Needs review 11 months ago
  • 🇩🇪Germany mxh Offenburg
  • Status changed to RTBC 11 months ago
  • 🇩🇪Germany jurgenhaas Gottmadingen

    This looks great and works as expected.

    If you want to backport this, you need to set minimum Drupal core version requirement to 9.5, because prior core versions don't have a Token::replacePlain method.

    Thanks a lot for the heads-up, I think this is an acceptable "limitation" by this time in the release cycle of Drupal 9.

  • Open in Jenkins → Open on Drupal.org →
    Core: 10.1.x + Environment: PHP 8.1 & MySQL 8
    last update 11 months ago
    297 pass
  • 🇩🇪Germany jurgenhaas Gottmadingen

    Added tests.

  • Open in Jenkins → Open on Drupal.org →
    Core: 10.1.x + Environment: PHP 8.1 & MySQL 8
    last update 11 months ago
    297 pass
  • Open in Jenkins → Open on Drupal.org →
    Core: 10.1.x + Environment: PHP 8.1 & MySQL 8
    last update 11 months ago
    297 pass
  • Status changed to Fixed 11 months ago
  • 🇩🇪Germany jurgenhaas Gottmadingen

    Merged and back ported. Tests are passing.

  • Automatically closed - issue fixed for 2 weeks with no activity.

  • Pipeline finished with Failed
    16 days ago
    Total: 412s
    #197246
Production build 0.69.0 2024