Exclusion of individual user agents

Created on 7 July 2023, over 1 year ago

Problem/Motivation

Microsoft Office 365 (Outlook) users report that one-time login links arrive but are no longer valid when clicked. Closer analysis of the server logs showed that Microsoft/Bing crawlers/bots are very aggressive: they ALWAYS request the link, which invalidates it before the user can click it.

Steps to reproduce

Proposed resolution

Implement a configuration field in which custom user agents can be excluded. This is needed because CrawlerDetect does not cover all of the user agents that should be excluded.

Remaining tasks

User interface changes

API changes

Data model changes

Feature request
Status

Fixed

Version

2.0

Component

Code

Created by

🇩🇪Germany zcht


Comments & Activities

  • Issue created by @zcht
    • zcht committed ccb3e1ba on 2.0.x
      Issue #3373315: event subscriber 'ShyEventSubscriber' adapted to the new...
    • zcht committed f712a3bd on 2.0.x
      Issue #3373315: ShyOneTimeState interface and check routine implemented
      
    • zcht committed c418c16f on 2.0.x
      Issue #3373315: new custom user agent field implemented via Drupal state...
    • zcht committed 242b562c on 2.0.x
      Issue #3373315: old logic for 'passwordless' module removed
      
    • zcht committed b4041163 on 2.0.x
      Issue #3373315: documentation updated for version 2.x in README.md
      
    • zcht committed 266c1eee on 2.0.x
      Issue #3373315: Minimum PHP version and Drupal version increased
      
  • Status changed to Fixed over 1 year ago
  • 🇩🇪Germany zcht

    CrawlerDetect is a large library and already works very well, but internal testing showed that it does not cover all crawlers/bots. In particular, users working with Microsoft Office 365 (and therefore Outlook) very often found that login was not possible. Closer analysis showed that the MS/Bing crawlers are particularly persistent and repeatedly request the reset links, regardless of server configuration or the like. For this reason, a text field was added to the backend, stored via the Drupal State API, in which selected user agents can be entered (always one per line). 'Shy One Time' checks requests against these entries. On a match, the request is redirected to the login form with a 302 status code and the reset link is not invalidated.
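    The check described above can be illustrated with a small, language-neutral sketch. It is written in Python purely for illustration and is not the module's PHP code. The function names, the `Decision` type, the case-insensitive substring matching, the `/user/login` target, the 200 status for the normal case, and the example user agent strings are all assumptions; the comment above only states that there is one user agent per line and that a hit leads to a 302 redirect without invalidating the link.

```python
from dataclasses import dataclass
from typing import List, Optional


def parse_user_agents(raw: str) -> List[str]:
    """Split the text field value into entries, one user agent per line.

    Blank lines and surrounding whitespace are dropped.
    """
    return [line.strip() for line in raw.splitlines() if line.strip()]


def is_excluded(user_agent: str, patterns: List[str]) -> bool:
    """Assumption: case-insensitive substring match against each entry."""
    ua = user_agent.lower()
    return any(pattern.lower() in ua for pattern in patterns)


@dataclass
class Decision:
    status: int               # HTTP status code of the response
    location: Optional[str]   # redirect target, if any
    consume_link: bool        # whether the one-time link gets invalidated


def handle_reset_request(user_agent: str, patterns: List[str]) -> Decision:
    """Decide how to answer a request to the reset route."""
    if is_excluded(user_agent, patterns):
        # Hit: redirect to the login form and leave the link usable.
        return Decision(status=302, location="/user/login", consume_link=False)
    # No hit: the normal one-time login flow continues and uses up the link.
    return Decision(status=200, location=None, consume_link=True)


# Hypothetical field content and requests, for illustration only.
configured = parse_user_agents("ExampleMailScanner\n\n  OtherBot/1.0  \n")
crawler = handle_reset_request("Mozilla/5.0 (compatible; ExampleMailScanner)", configured)
human = handle_reset_request("Mozilla/5.0 Firefox/115.0", configured)
```

    In this sketch the crawler request gets the 302 redirect and the link survives, while the ordinary browser request proceeds and consumes the link.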

    Furthermore, logging to dblog has been implemented, which records ALL user agents that arrive via the route 'user.reset'. This makes it possible to track exactly which crawler is causing problems and to add it to the custom user agent configuration. Evaluating the server logs is therefore no longer necessary, although they can still be used for verification.
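    The logging rule can be sketched in the same illustrative way (Python, not the module's PHP). The function name, the logger name, and the message text are assumptions. Skipping requests that carry no user agent reflects the later commit on this issue, "Do not log request without user agents".

```python
import logging
from typing import Optional

# Assumption: the channel name is chosen for illustration only.
logger = logging.getLogger("shy_one_time")


def log_reset_user_agent(route_name: str, user_agent: Optional[str]) -> bool:
    """Log the user agent of a request to the reset route.

    Returns True if an entry was written, False otherwise.
    """
    if route_name != "user.reset":
        # Only requests arriving via 'user.reset' are of interest.
        return False
    if user_agent is None or not user_agent.strip():
        # Requests without a user agent are not logged.
        return False
    logger.info("user.reset requested by user agent: %s", user_agent.strip())
    return True
```

    An entry logged this way can then be copied, one per line, into the custom user agent field.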

    The dedicated support for the 'passwordless' module has been removed. From now on, a generic solution based on the internal route 'user.reset' is used, so all modules that access this route are supported.

    ---
    After installing version 2.x of the module, the configuration interface can be reached at /admin/config/system/shy_one_time. Enter the user agents that are unwanted and should be blocked in the text field. Only ONE user agent may be entered per line. For more information, see the README.md.
    ---

    • zcht committed 0650ba3b on 2.0.x
      Issue #3373315: Do not log request without user agents
      
      ... Some user...
    • zcht committed 02c743bc on 2.0.x
      Issue #3373315: isMasterRequest() is deprecated
      
      ... isMasterRequest...
  • Automatically closed - issue fixed for 2 weeks with no activity.
