- Issue created by @jurgenhaas
- Status changed to Postponed: needs info
5 months ago 2:42pm 13 October 2024 - 🇳🇱Netherlands megachriz
There are more Tamper plugins on which data is processed in validation/submission, for example:
- aggregate
- convert_boolean
- find_replace_multiline
- find_replace_regex
I don't think that we should store the config exactly as inputted on the form for the keyword filter plugin. We could however add public methods that can process the information outside of the form context. Would that work for you?
Sorry for the late reply, by the way, but I had put most Tamper issues on hold for a long time in order to focus fully on Feeds.
- 🇳🇱Netherlands megachriz
Note: in 🐛 Keyword filter: variable executed as function, security issue? Active I proposed to move some parts indeed to the
tamper()
method. At least, it proposes to remove the function from the configuration and determine that when tampering instead. Perhaps the processing of words should also take place intamper()
. - 🇩🇪Germany jurgenhaas Gottmadingen
Thanks a lot @megachriz for coming back to this.
We could however add public methods that can process the information outside of the form context. Would that work for you?
Yes, absolutely. So, there should probably be a new method in the tamper interface, e.g.
processConfigValues()
which is empty for most plugins. The form validate and/or submit methods would move all their processing into that, but that same method could also be called right before the actual tampering.That would certainly work for us, yes.
I haven't had time to look at 🐛 Keyword filter: variable executed as function, security issue? Active yet, but from your comment it sounds more like what I proposed in this issue before too. So, that would be even better in my view.
- 🇳🇱Netherlands megachriz
I've merged 🐛 Keyword filter: variable executed as function, security issue? Active which should fix part of this issue.
Here I went further by removing all config processing in
validateConfigurationForm()
. Now that method only validates. There is still a bit of config processing left insubmitConfigurationForm()
and that is that the list of words as inputted on the form is converted to an array. For example:"Foo\n\Bar\nQux"
is saved as['Foo', 'Bar', 'Qux']
.Summary of changes:
- Deprecated 'words' setting;
- Removed 'word_list' setting;
- Added 'words_list' setting as a replacement for 'words' and 'word_list';
- Determine regex to use when applying the Tamper plugin and not when saving the config;
- Added backwards compatibility for the 'words' setting.
I drafted a change record that also includes the changes from 🐛 Keyword filter: variable executed as function, security issue? Active :
https://www.drupal.org/node/3485191 →@jurgenhaas
Would this work for you? Or would you still need something likeprocessConfigValues()
, basically just for converting"Foo\n\Bar\nQux"
to['Foo', 'Bar', 'Qux']
? - 🇩🇪Germany jurgenhaas Gottmadingen
@megachriz thanks a lot, I've looked into this and ECA is calling the tamper plugin's submit method before storing the configuration into the ECA config entity. So, this way we're able to receive the correct data. And as the keyword filter plugin keeps both, the words as a string and the word_list as a list, editing the configuration later should still work based on the original string that contains all the words.
- 🇳🇱Netherlands megachriz
@jurgenhaas
Note that the "words" setting (where words are saved as string) is only respected for backwards compatibility. When (re)saving the configuration, the setting will become an empty string.Example
Before (configuration in the old format, from a feed type):
words: |- foo bar word_boundaries: true exact: false case_sensitive: false invert: false word_list: - /\bfoo\b/ui - /\bbar\b/ui regex: true function: matchRegex uuid: 81082ad2-4c8a-44c8-aeff-2969eb3f9612 plugin: keyword_filter source: qux weight: 0 label: 'Keyword filter'
After (configuration in the new format, from a feed type):
words: '' words_list: - foo - bar word_boundaries: true exact: false case_sensitive: false invert: false uuid: 81082ad2-4c8a-44c8-aeff-2969eb3f9612 plugin: keyword_filter source: qux weight: 0 label: 'Keyword filter'
- 🇩🇪Germany jurgenhaas Gottmadingen
Hmm, if that's the plan, then the build form function needs some enhancement. Because that form is meant to edit die configuration and it uses the
word
settings. That would then always be an empty string. Converting the list back to words isn't simple either.So, for that reason, why not keeping the original
word
settings. Is there any downside to that? - 🇳🇱Netherlands megachriz
My reasoning for no longer use the "words" setting:
- I think that it is better to save a list of words only once, instead of having it in two formats.
- The
tamper()
method did not use the "words" setting (now it could use it, but only for backwards compatibility and only if the new setting "words_list" is empty). - The
tamper()
method wants to have an array of words.
Converting the list back to words isn't simple either.
buildConfigurationForm()
easily converts an array of words to something usable for a form:'#default_value' => implode("\n", $this->getWordList()),
So upon submitting the form, the words inputted in a textarea are converted to an array. And when loading the form, the array is then converted back to a string.
@jurgenhaas
Does this cause any issues for you? Would you still need aprocessConfigValues()
method? - 🇩🇪Germany jurgenhaas Gottmadingen
Sure, if converting back is a no-brainer, then I would also avoid redundant data storage.
And no, there is no need for a processConfigValue method any longer if everything is handled this way. I'd like to give this another test if you already have the build form adjusted.
- 🇳🇱Netherlands megachriz
Yes, the build form already converts an array of words to a string that is usable for the textarea. I put this back to "Needs review", so you can give it another test. Thanks in advance. 🙂
Perhaps good to know: in 🐛 Make Keyword Filter to skip an item instead of emptying the value Active I plan to make another change to the Keyword Filter, namely that it will throw a SkipTamperItemException instead of emptying a value. This is consistent with how the Required plugin works, which also can throw that type of exception, so I expect it would not cause new issues.
But I plan to make that change after this issue is done, because it would else cause merge conflicts. - 🇩🇪Germany jurgenhaas Gottmadingen
This is now working OK, after I had to resolve another eca_tamper issue at 🐛 Properly handle arrays in tamper plugin config Active to properly handle the array in the config values.
There is only 1 thing that we can't support with this approach. Sorry, I forgot to mention that before: if the configuration of the plugin is done by using variables (token in the case of ECA), they will only be known at runtime, not when configuring the model. That can not be done now, because the processing of the word list happens during config, not during runtime.
Example, if the word list were defined like this:
[mywordlist]
The value behind this token could be a string or an array, but only known during runtime.
Another example:
myfirstword [myplaceholder] mythirdword etc
That could be the configured string for
WORD
.We could probably resolve that by replacing tokens at that point and then update the plugin config. So, maybe not an issue for here. Just left the details so that you can see where we're coming from.
-
megachriz →
committed 2418cd06 on 8.x-1.x
Issue #3358226 by megachriz, jurgenhaas: Refactored Keyword Filter...
-
megachriz →
committed 2418cd06 on 8.x-1.x
- 🇳🇱Netherlands megachriz
Thanks for following up so quickly! I merged the changes.
Thanks for giving some background as well. At first I thought about if it would be useful to add token support to Tamper plugins, but since the Tamper module would be missing context (it doesn't get entities passed for example), maybe token replacement is better handled externally. I think updating/altering the config used at runtime is good enough, as long as you don't store that updated config. But I assume that you have handled that correctly.
I see that in 🐛 Properly handle arrays in tamper plugin config Active there is a special case for the "find_replace_regex" Tamper plugin. I wonder if it would be a good idea to use a different data type in the config schema for tamper.find_replace_regex.find, but on https://www.drupal.org/docs/drupal-apis/configuration-api/configuration-... → I didn't see a logical subtype of string to use. We could define our own subtype and call it "regex" for example. This is unrelated to this issue, but a thing that crossed my mind.
- 🇩🇪Germany jurgenhaas Gottmadingen
Amazing, thanks for addressing and fixing this. And yes, it's the best choice not to get into token handling in tamper ;-)
As for the regex edge case, I don't think it's worth introducing a new type. This is just a scenario where a regex string could easily look like it contains a token syntax, where it isn't. And that would normally remove that part from the regex string, if that token didn't exist.
Automatically closed - issue fixed for 2 weeks with no activity.