- πΊπΈUnited States tr Cascadia
OK, I opened a MR because I was intending to try to re-roll this patch against the current 2.2.x development branch, but there are all sorts of problems.
I don't think I can make any progress on this because a lot of it is no longer relevant, and the remaining pieces are going to need a lot more discussion because they will definitely change the behavior of this module.
So the two pieces that I believe are no longer needed are:
- Protection for programmatically submitted forms is already turned off - no need to do it again in this patch. See #2677126: Honeypot should not be triggered on programmatically submitted forms. β
- Webform now does its own
hook_form_alter()
so I don't think there is any need for the Webform-specific code changes.
Then there is the replacement of the regular expressions with the check to see if the path is an admin path. I don't really think that does the same thing as the regular expression. But that stuff isn't well documented so I did some research:
The current patterns for form_ids to exclude from Honeypot protection are:/[^a-zA-Z]system_/ /[^a-zA-Z]search_/ /[^a-zA-Z]views_exposed_form_/
- Initially this was just
'system_'
, which was added in #1253480: Drush calls to enable or disable modules raise time limit warnings β - Then
'search_'
was added by π Disable Honeypot on search forms Fixed - Then
'views_exposed_form_'
was added by #1498940: Views exposed auto submit filter doesn't work with Honeypot β - And then the simple string matching was converted to regular expressions in #2912575: From with 'research_' in machine name is not protected β .
- (In fact this current issue was opened as a follow up to #2912575: From with 'research_' in machine name is not protected β )
My proposal in #10 was to use:
$route = \Drupal::routeMatch()->getRouteObject(); $is_admin = \Drupal::service('router.admin_context')->isAdminRoute($route);
instead of the regular expressions, which does identify admin paths in D8+ (and I think it's good to exclude those), but I don't think this serves the same purpose as the regular expressions above.
The patch in #7 suggests:
// Exclude admin forms, with the exception of node forms. $is_system_form = path_is_admin(current_path()) && ($base_form_id !== 'node_form'); if (empty($form_state['programmed']) && !$is_system_form && !in_array($base_form_id, $excluded_forms)) {
which is also quite different than the regular expressions above. And it adds 'programmed' forms which are not needed as I said because they are handled elsewhere. And it adds 'node_form' which should be discussed.
What is the current justification for the use of base_path?
So yes, I do want to get rid of that regexp, but we should be VERY clear and explicit about what we intend to exclude and why. And that means at a minimum writing a test to make sure the correct paths are excluded and nothing else. If we still need to exclude those same forms for the same reasons they're there for, then checking if it's an admin path isn't good enough. A search form can appear on a non-admin page, so if we want to exclude search forms for instance we have to do more than check if the path is an admin patch.
And unless I'm totally mistaken, the above regular expressions are just plain WRONG. Here's a test script:
<?php $form_ids = [ '/admin/config/system/cron' => 'system_cron_settings', '/admin/modules/uninstall' => 'system_modules_uninstall', '/node/5/edit' => 'node_article_edit_form', '/node/5' => 'comment_form', '/planets' => 'solarsystem_form', '/moons' => 'solar_system_form', ]; foreach ($form_ids as $path => $form_id) { // Regular expressions from honeypot.module. if (preg_match('/[^a-zA-Z]system_/', $form_id) === 0 && preg_match('/[^a-zA-Z]search_/', $form_id) === 0 && preg_match('/[^a-zA-Z]views_exposed_form_/', $form_id) === 0) { print_r("Honeypot protection will be added to $path \n"); } }
My expectation is that only the first two should be excluded from Honeypot protection. Instead, they ALL are excluded, EXCEPT for the last one. This test tells me that the regular expression DOESN'T EXCLUDE what it means to exclude. It's just wrong.
I really don't see how this could have been so broken for so long (six years!) so I would like a sanity check from the community before I totally redo the regular expressions and replace them with something more sensible.